Speeding up Viola-Jones Algorithm using Multi-Core GPU Implementation

Masek J., Burget R., Uher V., Guney S.

36th International Conference on Telecommunications and Signal Processing (TSP), Rome, Italy, 2-4 July 2013, pp. 808-812

  • Publication Type: Conference Paper / Full Text
  • Doi Number: 10.1109/tsp.2013.6614050
  • City: Rome
  • Country: Italy
  • Page Numbers: 808-812
  • Karadeniz Technical University Affiliated: Yes


Graphics Processing Units (GPUs) offer cheap, high-performance computation by offloading the compute-intensive portions of an application to the GPU while the remainder of the code runs on the CPU. This paper introduces a multi-GPU CUDA implementation of training for Viola-Jones object detection that accelerates the two most time-consuming operations of the training process using two dual-core NVIDIA GeForce GTX 690 cards. Compared to a single-threaded implementation on an Intel Core i7 3770 running at 3.7 GHz, the first accelerated part of the training process was sped up 151 times and the second part 124 times using the two dual-core GPUs. The paper examines the overall computation time of Viola-Jones training using one CPU core, one GPU, two GPUs, three GPUs, and four GPUs. The trained detector was applied to a test set of real-world images.
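The abstract does not spell out which two training operations were moved to the GPU, but the core data structure that makes Viola-Jones feature evaluation cheap, and hence parallelizable across many independent GPU threads, is the integral image (summed-area table). The sketch below is illustrative only (it is not code from the paper): it builds an integral image and evaluates a rectangle sum with four lookups, the operation performed millions of times per feature during training.

```python
# Illustrative sketch (not from the paper): the integral image used by
# Viola-Jones, and the O(1) rectangle sum it enables. Each rectangle sum
# is independent of the others, which is why feature evaluation during
# training maps well onto thousands of GPU threads.

def integral_image(img):
    """Summed-area table: ii[y][x] = sum of img over rows 0..y, cols 0..x."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w-by-h rectangle with top-left corner (x, y),
    computed from four integral-image lookups."""
    a = ii[y - 1][x - 1] if x > 0 and y > 0 else 0
    b = ii[y - 1][x + w - 1] if y > 0 else 0
    c = ii[y + h - 1][x - 1] if x > 0 else 0
    d = ii[y + h - 1][x + w - 1]
    return d - b - c + a

img = [[1, 2],
       [3, 4]]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 2, 2))  # 10 (sum of all four pixels)
print(rect_sum(ii, 1, 1, 1, 1))  # 4  (bottom-right pixel alone)
```

A Haar-like feature is then just a signed combination of two or three such rectangle sums, so once the integral image is resident in GPU memory, each thread can score one (feature, training-sample) pair with a handful of memory reads.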