Faculty Advisor or Committee Member

Xinming Huang, Advisor

Lifeng Lai, Committee Member

Haibo He, Committee Member




"Autonomous vehicle is an engineering technology that can improve transportation safety, alleviate traffic congestion and reduce carbon emissions. Research on autonomous vehicles can be categorized by functionality, for example, object detection or recognition, path planning, navigation, lane keeping, speed control and driver status monitoring. The research topics can also be categorized by the equipment or techniques used, for example, image processing, computer vision, machine learning, and localization. This dissertation primarily reports on computer vision and machine learning algorithms and their implementations for autonomous vehicles. The vision-based system can effectively detect and accurately recognize multiple objects on the road, such as traffic signs, traffic lights, and pedestrians. In addition, an autonomous lane keeping system has been proposed using end-to-end learning. In this dissertation, a road simulator is built using data collection and augmentation, which can be used for training and evaluating autonomous driving algorithms. The Graphic Processing Unit (GPU) based traffic sign detection and recognition system can detect and recognize 48 traffic signs. The implementation has three stages: pre-processing, feature extraction, and classification. A highly optimized and parallelized version of Histogram of Oriented Gradients (HOG) and Support Vector Machine (SVM) is used. The system can process 27.9 frames per second with the active pixels of a 1,628 by 1,236 resolution, and with the minimal loss of accuracy. In an evaluation using the BelgiumTS dataset, the experimental results indicate that the detection rate is about 91.69% with false positives per window of 3.39e-5, and the recognition rate is about 93.77%. We report on two traffic light detection and recognition systems. The first system detects and recognizes red circular lights only, using image processing and SVM. 
It outperforms traditional detectors, achieving 96.97% precision and 99.43% recall. The second system is more complex. It detects and classifies different types of traffic lights, including green and red lights in both circular and arrow forms. It employs image processing techniques, such as color extraction and blob detection, to locate the candidates. Subsequently, a pre-trained PCA network is used as a multi-class classifier for obtaining frame-by-frame results. Furthermore, an online multi-object tracking technique is applied to overcome occasional misses, and a forecasting method is used to filter out false positives. Several additional optimization techniques are employed to improve the detector performance and to handle the traffic light transitions. A multi-spectral data collection system, which includes a thermal camera and a pair of stereo color cameras, is implemented for pedestrian detection. The three cameras are first aligned using the trifocal tensor, and the aligned data are processed using computer vision and machine learning techniques. Convolutional channel features (CCF) and the traditional HOG+SVM approach are evaluated over the data captured from the three cameras. Through the use of the trifocal tensor and CCF, training becomes more efficient. The proposed system achieves a log-average miss rate of only 9% on our dataset. The autonomous lane keeping system employs an end-to-end learning approach for obtaining the proper steering angle for maintaining a car in a lane. The convolutional neural network (CNN) model uses raw image frames as input, and it outputs the steering angles corresponding to the input frames. Unlike the traditional approach, which manually decomposes the problem into several parts, such as lane detection, path planning, and steering control, the model learns to extract useful features on its own and learns to steer from human behavior.
We also find that having a simulator for data augmentation and evaluation is important. We therefore build the simulator using image projection, vehicle dynamics, and vehicle trajectory tracking. The test results reveal that the model trained with augmented data from the simulator performs better and achieves about 98% autonomous driving time on our dataset. Furthermore, a vehicle data collection system is developed for building our own datasets from recorded videos. These datasets are used in the above studies and have been released to the public for autonomous vehicle research. The experimental datasets are available at http://computing.wpi.edu/Dataset.html."
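The abstract reports a log-average miss rate for pedestrian detection but does not say how it is computed. The sketch below assumes the convention common in pedestrian detection benchmarks: the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 1e-2 and 1e0. The function name, its parameters, and the rule of treating a missing sample as a miss rate of 1.0 are illustrative assumptions, not details taken from the dissertation.

```python
import math


def log_average_miss_rate(fppi, miss_rate, num_points=9, lo=1e-2, hi=1e0):
    """Geometric mean of miss rates sampled at log-spaced FPPI values.

    fppi      -- false positives per image, sorted in ascending order
    miss_rate -- miss rate at each fppi value (same length as fppi)
    All names and defaults here are hypothetical, chosen for illustration.
    """
    if not fppi or len(fppi) != len(miss_rate):
        raise ValueError("fppi and miss_rate must be non-empty and equal in length")

    # Reference FPPI values evenly spaced in log10 space between lo and hi.
    step = (math.log10(hi) - math.log10(lo)) / (num_points - 1)
    refs = [10 ** (math.log10(lo) + i * step) for i in range(num_points)]

    samples = []
    for ref in refs:
        # Use the miss rate at the largest measured FPPI not exceeding ref.
        reached = [m for f, m in zip(fppi, miss_rate) if f <= ref]
        # Assumption: if the curve never reaches this FPPI, count it as 1.0.
        sample = reached[-1] if reached else 1.0
        # Clamp away from zero so the logarithm stays defined.
        samples.append(max(sample, 1e-10))

    # Averaging in log space is equivalent to taking the geometric mean.
    return math.exp(sum(math.log(s) for s in samples) / len(samples))
```

A detector whose miss rate stays at 0.09 across the whole FPPI range would score 0.09 under this definition, which corresponds to the 9% figure quoted in the abstract only if the dissertation follows the same convention.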


Worcester Polytechnic Institute

Degree Name



Electrical & Computer Engineering

Project Type


Date Accepted



Restricted-WPI community only


computer vision, machine learning, autonomous vehicles