The project consists of two main components: Gesture and Voice. The Gesture component performs gesture recognition using two algorithms, Lucas-Kanade and the Haar classifier. The Voice component consists of two sub-components, the voice recognizer and the voice synthesizer. The overall architecture is based on the foundational concepts of OpenCV (Open Source Computer Vision Library). The architecture and component distribution look simple and systematic, but they become more complicated as the implementation proceeds. The detailed description below should make this fairly complex system easier for all of us to understand.
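
As a rough illustration of the component breakdown described above, the hierarchy can be sketched as a small tree. This is only a sketch under stated assumptions: the names `Component`, `build_architecture`, and `describe` are hypothetical and are not taken from the project's code, and the sketch models only the layout of the components, not the recognition algorithms themselves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """A named node in the component hierarchy (hypothetical helper type)."""
    name: str
    parts: List["Component"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the names of all leaf parts under this component."""
        if not self.parts:
            return [self.name]
        names: List[str] = []
        for part in self.parts:
            names.extend(part.leaves())
        return names


def build_architecture() -> Component:
    """Build the two-component layout named in the overview."""
    # Gesture component: the two algorithms named in the text.
    gesture = Component("Gesture", [
        Component("Lucas-Kanade"),
        Component("Haar Classifier"),
    ])
    # Voice component: the two sub-components named in the text.
    voice = Component("Voice", [
        Component("Voice Recognizer"),
        Component("Voice Synthesizer"),
    ])
    return Component("Project", [gesture, voice])


def describe(component: Component, depth: int = 0) -> List[str]:
    """Return an indented, line-per-node outline of the hierarchy."""
    lines = ["  " * depth + component.name]
    for part in component.parts:
        lines.extend(describe(part, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(describe(build_architecture())))
```

Running the script prints an indented outline of the hierarchy, which can serve as a quick map while reading the detailed description.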