Long-Term Prediction of Pedestrian Motion
Vasiliy Karasev, Alper Ayvaci, Bernd Heisele, Stefano Soatto
We present a method to predict long-term motion of pedestrians, modeling their behavior as jump-Markov processes with their goal a hidden variable. Assuming approximately rational behavior, and incorporating environmental constraints and biases, including time-varying ones imposed by traffic lights, we model intent as a policy in a Markov decision process framework. We infer pedestrian state using a Rao-Blackwellized filter, and intent by planning according to a stochastic policy, reflecting individual preferences in aiming at the same goal.
V. Karasev, A. Ayvaci, B. Heisele, S. Soatto. Intent-Aware Long-Term Prediction of Pedestrian Motion. International Conference on Robotics and Automation (ICRA), 2016.
Sample long-term predictions of traffic participants’ motion generated by our model. Warmer colors indicate more probable paths. Notice that the predictions are multi-modal and obey constraints of the environment.
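The idea of treating intent as a stochastic policy toward a hidden goal can be illustrated with a small sketch. This is not the authors' implementation: the grid, the `temperature` parameter, and the function names are assumptions made for illustration, and time-varying constraints such as traffic lights are left out. Value iteration computes the negative cost-to-go to a goal, and a softmax over neighboring cells gives a policy in which equally good routes receive equal probability, so the prediction is multi-modal.

```python
import math

# Hypothetical 3x4 grid world; 1 marks a cell the pedestrian cannot enter.
GRID = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def neighbors(cell):
    """Yield (move, next_cell) pairs that stay on the grid and avoid obstacles."""
    r, c = cell
    for name, (dr, dc) in MOVES.items():
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] == 0:
            yield name, (nr, nc)


def value_iteration(goal, step_cost=1.0, sweeps=50):
    """Return the negative cost-to-go to `goal` for every free cell."""
    values = {(r, c): -1e9
              for r in range(len(GRID))
              for c in range(len(GRID[0])) if GRID[r][c] == 0}
    values[goal] = 0.0
    for _ in range(sweeps):
        for cell in values:
            if cell != goal:
                values[cell] = max(values[nxt] for _, nxt in neighbors(cell)) - step_cost
    return values


def stochastic_policy(cell, values, temperature=0.5):
    """Softmax over the values of reachable neighbors (assumed policy form)."""
    options = list(neighbors(cell))
    weights = [math.exp(values[nxt] / temperature) for _, nxt in options]
    total = sum(weights)
    return {name: w / total for (name, _), w in zip(options, weights)}


values = value_iteration(goal=(2, 3))
policy_at_start = stochastic_policy((0, 0), values)   # two equally short routes
policy_near_goal = stochastic_policy((2, 2), values)  # goal is one step to the right
```

A lower `temperature` concentrates probability on the shortest route, while a higher one spreads it across alternatives, which is one simple way to represent individual preferences in reaching the same goal.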
Partially Occluded Object Detection by Finding the Visible Features and Parts
Kai Chi Chan, Alper Ayvaci and Bernd Heisele
We address the partially occluded object detection problem with a model that attaches latent visibility flags to the cells and parts of a Deformable Part Model (DPM). A visibility flag indicates whether an image portion is part of a pedestrian or part of an occluder. To compute the visibility flags and the score of the detector simultaneously, we maximize a concave objective function composed of four terms: (1) the detection scores of visible cells and parts, (2) a cell-to-cell consistency term which encourages neighboring cells to have the same visibility flags, (3) a cell-to-part consistency term which encourages compatible labeling among overlapping cells and parts, and (4) a penalty term for cells and parts that are labeled as occluded. The objective is maximized using the Alternating Direction Method of Multipliers (ADMM). By removing the scores of occluded cells and parts from the final detection score, we significantly improve detection performance on partially occluded pedestrians. In experiments we show that our system outperforms the standard DPM and other state-of-the-art methods.
K. C. Chan, A. Ayvaci and B. Heisele. Partially Occluded Object Detection by Finding the Visible Features and Parts. International Conference on Image Processing (ICIP), 2015. Best Paper Award.
Consistency graph: Root cells are represented by the squares on the image, and parts are drawn above. Orange edges indicate cell-to-cell consistency, while yellow edges indicate cell-to-part consistency.
Visibility map estimates: the input image; the initialization passed to ADMM, obtained by thresholding the cell-level and part-level detector responses at 0; the binarized visibility estimate at the first iteration; and the solution at convergence (third iteration). Red and green indicate variables with values 0 and 1, respectively.
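The four-term objective described above can be made concrete on a toy example. The sketch below is only an illustration under stated assumptions: the scores, the graph, and the weights `w_cell`, `w_part`, and `occlusion_penalty` are hypothetical, the flags are kept binary, and the maximization is done by brute-force enumeration rather than by ADMM on the relaxed concave problem.

```python
from itertools import product

# Hypothetical detector responses for four root cells and two parts.
CELL_SCORES = [1.0, -0.1, 0.8, -1.2]
PART_SCORES = [0.9, -0.7]
CELL_EDGES = [(0, 1), (1, 2), (2, 3)]               # neighboring cells
CELL_PART_EDGES = [(0, 0), (1, 0), (2, 0), (3, 1)]  # (cell, overlapping part)


def objective(cell_vis, part_vis, w_cell=0.5, w_part=0.5, occlusion_penalty=0.2):
    """Sum of the four terms for binary visibility flags (1 = visible)."""
    # (1) detection scores of visible cells and parts
    detection = sum(v * s for v, s in zip(cell_vis, CELL_SCORES))
    detection += sum(v * s for v, s in zip(part_vis, PART_SCORES))
    # (2) neighboring cells should carry the same flag
    cell_term = -w_cell * sum(cell_vis[i] != cell_vis[j] for i, j in CELL_EDGES)
    # (3) a cell should agree with the part that overlaps it
    part_term = -w_part * sum(cell_vis[i] != part_vis[p] for i, p in CELL_PART_EDGES)
    # (4) every flag set to "occluded" costs a fixed penalty
    penalty = -occlusion_penalty * (cell_vis.count(0) + part_vis.count(0))
    return detection + cell_term + part_term + penalty


def best_labeling():
    """Enumerate all binary labelings; feasible only for toy-sized problems."""
    n_cells = len(CELL_SCORES)
    labelings = product((0, 1), repeat=n_cells + len(PART_SCORES))
    best = max(labelings, key=lambda f: objective(f[:n_cells], f[n_cells:]))
    return best[:n_cells], best[n_cells:]


# Thresholding the responses at 0 mirrors the initialization described above.
thresholded = (tuple(int(s > 0) for s in CELL_SCORES),
               tuple(int(s > 0) for s in PART_SCORES))
cells, parts = best_labeling()
```

In this example the thresholded labeling marks cell 1 as occluded because its response is slightly negative, whereas the maximizer keeps it visible: its neighbors and its overlapping part are visible, so the consistency terms outweigh the small negative score.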
Object Recognition with 3D Models
B. Heisele, G. Kim, and A. Meyer
We propose techniques for designing and training pose-invariant object recognition systems using realistic 3D computer graphics models. We examine the relation between the size of the training set and the classification accuracy for a basic recognition task and provide a method for estimating the degree of difficulty of detecting an object. We show how to sample, align, and cluster images of objects on the view sphere. We address the problem of training on large, highly redundant data and propose a novel active learning method which generates compact training sets and compact classifiers.
Top row: 3D computer graphics models used for training and photographs of the real objects used for testing. Middle row: Synthetic images with uniform background. Bottom row: Synthetic images with natural background.
Recognition performance on real objects. The system has been exclusively trained on synthetic images.
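As a rough illustration of sampling viewpoints on the view sphere, the sketch below places camera positions approximately uniformly using a Fibonacci lattice. The source does not state which sampling scheme was used, so the lattice, the function name `sample_view_sphere`, and the `radius` parameter are assumptions.

```python
import math


def sample_view_sphere(n_views, radius=1.0):
    """Return n_views camera positions spread roughly evenly over a sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    views = []
    for i in range(n_views):
        z = 1.0 - 2.0 * (i + 0.5) / n_views   # height in (-1, 1)
        ring = math.sqrt(1.0 - z * z)         # radius of the circle at that height
        theta = golden_angle * i              # azimuth advances by the golden angle
        views.append((radius * ring * math.cos(theta),
                      radius * ring * math.sin(theta),
                      radius * z))
    return views


views = sample_view_sphere(100, radius=2.0)
```

Each position could then be handed to a renderer that points the virtual camera at the object center; restricting `z` to positive values would keep only the upper hemisphere.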
J. Skelley, R. Fischer, A. Sarma, and B. Heisele
Example images from the new database.
We describe a new expression database which contains video sequences of both played and natural expressions and an expression classification system based on warped optical flow fields and texture features. We analyze the system's generalization performance when confronted with subjects that were not present in the training set and its recognition performance when tested on natural expressions. We evaluate several techniques for combining the classifier outputs computed on single images to perform classification of a temporal sequence of expression
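Two common ways of combining per-frame classifier outputs into a sequence-level decision are majority voting and score averaging. The text does not say which techniques were evaluated, so the sketch below is an assumption, and the labels and scores are made up for illustration.

```python
from collections import Counter


def majority_vote(frame_scores):
    """Pick the label that wins the most individual frames."""
    votes = Counter(max(scores, key=scores.get) for scores in frame_scores)
    return votes.most_common(1)[0][0]


def average_scores(frame_scores):
    """Pick the label with the highest mean score over the sequence."""
    labels = frame_scores[0].keys()
    means = {label: sum(s[label] for s in frame_scores) / len(frame_scores)
             for label in labels}
    return max(means, key=means.get)


# Hypothetical per-frame outputs for a three-frame sequence.
frames = [
    {"smile": 0.55, "neutral": 0.45},
    {"smile": 0.60, "neutral": 0.40},
    {"smile": 0.05, "neutral": 0.95},
]
by_vote = majority_vote(frames)
by_mean = average_scores(frames)
```

The two rules disagree on this sequence: voting follows the two weakly confident frames, while averaging lets the single highly confident frame dominate.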
R. Fischer, A. Sarma, and B. Heisele. Recognizing Expressions in a New Database Containing Played and
B. Heisele, J. Huang, V. Blanz
Left: Original image used for computing the 3D model. Right: Synthetic image.
We present a novel approach to pose- and illumination-invariant face recognition that combines two recent advances in the
Huang, J., V. Blanz and B. Heisele. Face Recognition Using Component-Based SVM Classification and Morphable Models. In: Proceedings of Pattern Recognition with Support Vector Machines, First International Workshop, SVM 2002, Niagara Falls, Canada, Lecture Notes in Computer Science, Springer 2388, 334-341, 2002.
Components used for recognition for a frontal and half-profile view of a face.
B. Heisele, T. Poggio, M. Pontil
We present a trainable system for detecting frontal and near-frontal views of faces in still gray images using Support Vector Machines (SVMs). We first consider the problem of detecting the whole face pattern by a single SVM classifier. In this context we compare different types of image features, present and evaluate a new method for reducing the number of features, and discuss practical issues concerning the parameterization of SVMs and the selection of training data. The second part of the paper describes a component-based method for face detection consisting of a two-level hierarchy of SVM classifiers. On the first level, component classifiers independently detect components of a face, such as the eyes, the nose, and the mouth. On the second level, a single classifier checks if the geometrical configuration of the detected components in the image matches a geometrical model of a face.
T. Serre, M. Pontil, T. Vetter and T. Poggio. Categorization by Learning and Combining Object Parts. In: Advances in Neural Information Processing Systems 14, Vancouver, Canada, Vol. 2, 1239-1245, 2002.
Component-based face detection with four component classifiers and a single geometrical configuration classifier.
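The data flow of the two-level hierarchy can be sketched as follows. This is a simplified illustration, not the published system: the first-level score maps stand in for the outputs of trained component SVMs, the second level is reduced to a linear decision function, and all names, weights, and the bias are hypothetical.

```python
def best_match(score_map):
    """Return (score, row, col) of the strongest response in a 2D score map."""
    return max((score, r, c)
               for r, row in enumerate(score_map)
               for c, score in enumerate(row))


def configuration_score(features, weights, bias):
    """Linear stand-in for the second-level geometrical configuration classifier."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def detect_face(component_maps, weights, bias):
    """Level 1: locate each component. Level 2: classify scores and positions."""
    features = []
    for score_map in component_maps:
        features.extend(best_match(score_map))
    return configuration_score(features, weights, bias) > 0


# Hypothetical responses of two component classifiers inside a 2x2 search window.
eye_map = [[0.1, 0.9], [0.2, 0.0]]
mouth_map = [[-0.5, -0.2], [0.3, 0.8]]
# Assumed weights that use only the two component scores and ignore positions.
weights = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]
is_face = detect_face([eye_map, mouth_map], weights, bias=-1.0)
```

In a trained system the second-level weights would also act on the component positions, which is how the geometrical configuration enters the decision.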
B. Heisele, R. Su
As part of the CBCL face detection project we used face morphing algorithms to build a face database for training and testing our detection system. One way to perform face morphing is to manually select pairs of corresponding features which define the mapping between the two images. The Beier-Neely (BN) algorithm requires the user to select pairs of corresponding straight line segments. In addition to the BN algorithm, we used a morphing technique based on optical flow. This technique does not require manual user interaction. Instead, the mapping between the images is automatically determined by estimating the optical flow.
Morphing: left and middle: original images, right: morphed image
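For a single pair of corresponding line segments, the Beier-Neely mapping expresses a destination pixel by its position along and its distance from the destination line, then reconstructs the same relative position with respect to the source line. The sketch below covers only this single-pair case; the weighted blending over multiple line pairs and the final cross-dissolve are omitted, and the function names are chosen here for illustration.

```python
import math


def perpendicular(v):
    """Rotate a 2D vector by 90 degrees counter-clockwise."""
    return (-v[1], v[0])


def map_point(x, p, q, p_src, q_src):
    """Map destination pixel x to the source image for one line pair.

    (p, q) is the line in the destination image, (p_src, q_src) the
    corresponding line in the source image.
    """
    d = (q[0] - p[0], q[1] - p[1])
    d_src = (q_src[0] - p_src[0], q_src[1] - p_src[1])
    rel = (x[0] - p[0], x[1] - p[1])
    length = math.hypot(*d)
    length_src = math.hypot(*d_src)
    n = perpendicular(d)
    n_src = perpendicular(d_src)
    u = (rel[0] * d[0] + rel[1] * d[1]) / length ** 2  # fraction along the line
    v = (rel[0] * n[0] + rel[1] * n[1]) / length       # signed distance from the line
    return (p_src[0] + u * d_src[0] + v * n_src[0] / length_src,
            p_src[1] + u * d_src[1] + v * n_src[1] / length_src)


# Same line in both images: the pixel maps to itself.
unchanged = map_point((3, 4), (0, 0), (10, 0), (0, 0), (10, 0))
# Source line rotated by 90 degrees: the pixel is rotated with it.
rotated = map_point((3, 4), (0, 0), (10, 0), (0, 0), (0, 10))
```

Because `u` is normalized by the line length and `v` is not, stretching a line scales the image along it while distances perpendicular to it are preserved.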
Object Recognition in Range Data
I developed a model-based algorithm for real-time recognition of objects in dense range data. In an off-line modeling process, templates are generated from a 3D model with a virtual range sensor. Two types of templates are generated: edge templates representing the silhouette of the object and range templates describing the 3D structure of the object's surface. The recognition process consists of two steps. First, object hypotheses are generated by fast, hierarchical edge-based matching. Then range-based matching verifies the object hypotheses.
Object recognition: left: original range image, middle: recognition result for large cup, right: recognition result for box.
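The two-step structure of the recognition process (edge-based hypothesis generation followed by range-based verification) can be sketched on toy data. This is an illustration under simplifying assumptions: the search is exhaustive rather than hierarchical, only translation is considered, and the score definitions, thresholds, and function names are hypothetical.

```python
def edge_score(edge_image, edge_template, top, left):
    """Fraction of template edge pixels that coincide with image edge pixels."""
    hits = total = 0
    for r, row in enumerate(edge_template):
        for c, is_edge in enumerate(row):
            if is_edge:
                total += 1
                hits += edge_image[top + r][left + c]
    return hits / total


def generate_hypotheses(edge_image, edge_template, min_score=0.8):
    """Step 1: keep every placement whose edge score reaches min_score."""
    rows = len(edge_image) - len(edge_template) + 1
    cols = len(edge_image[0]) - len(edge_template[0]) + 1
    return [(top, left) for top in range(rows) for left in range(cols)
            if edge_score(edge_image, edge_template, top, left) >= min_score]


def verify(range_image, range_template, top, left, max_mean_error=0.1):
    """Step 2: accept a hypothesis if the mean range difference is small."""
    errors = [abs(range_image[top + r][left + c] - depth)
              for r, row in enumerate(range_template)
              for c, depth in enumerate(row)]
    return sum(errors) / len(errors) <= max_mean_error


def recognize(edge_image, range_image, edge_template, range_template):
    return [pos for pos in generate_hypotheses(edge_image, edge_template)
            if verify(range_image, range_template, *pos)]


# Two blobs with identical silhouettes but different distances to the sensor.
edge_image = [[0, 0, 0, 0, 0],
              [1, 1, 0, 1, 1],
              [1, 1, 0, 1, 1]]
range_image = [[9.0, 9.0, 9.0, 9.0, 9.0],
               [2.0, 2.0, 9.0, 5.0, 5.0],
               [2.0, 2.0, 9.0, 5.0, 5.0]]
edge_template = [[1, 1], [1, 1]]
range_template = [[2.0, 2.0], [2.0, 2.0]]
hypotheses = generate_hypotheses(edge_image, edge_template)
matches = recognize(edge_image, range_image, edge_template, range_template)
```

Both blobs pass the edge stage because their silhouettes match the template, but only the one whose range values agree with the range template survives verification.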
Detection of Pedestrians
B. Heisele, C. Woehler