A primary goal of computer vision is to develop algorithms that can learn representations of objects from training sets and subsequently label instances of these objects in digital images. The main focus of my research is the formulation of statistical models for objects. Although not extensively used in computer vision, these models emerge as a powerful tool for developing recognition algorithms that properly account for object and data variability. The simplicity and transparency of the statistical models enable training with small samples and give rise to efficient computational methods. Models for individual objects can be composed to create models for entire scenes. The models have been implemented in concrete applications such as reading license plates in photos of cars, reading handwritten zip codes, and detecting faces, cars, or various objects of interest in biological images and videos. I am also interested in importing ideas developed in computer vision into the domain of speech recognition. Although speech recognition is a more mature field of research than vision, some insights from vision may contribute to increased robustness and stability of speech recognition algorithms.