ORA Thesis: "Human layout estimation using structured output learning" - uuid:bb290cfd-5216-42d7-b3d2-c2b4b01614bc


Thesis

Links & Downloads

Local copy not available for download in ORA



http://ora.ox.ac.uk/objects/ora:6591

Reference: Arpit Mittal (2012). Human layout estimation using structured output learning. DPhil. University of Oxford.

Citable link to this page: http://ora.ox.ac.uk/objects/uuid:bb290cfd-5216-42d7-b3d2-c2b4b01614bc
 
Title: Human layout estimation using structured output learning

Abstract:

In this thesis, we investigate the problem of human layout estimation in unconstrained still images. This involves predicting the spatial configuration of body parts.

We start our investigation with pictorial structure models and propose an efficient method of model fitting using skin regions. To detect the skin, we learn a colour model locally from the image by detecting the facial region. The resulting skin detections are also used for hand localisation.
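
The abstract does not say which colour model is learned from the facial region. As a minimal sketch only, the idea can be illustrated by assuming a single Gaussian fitted to face pixels and a Mahalanobis-distance threshold; the function names, the Gaussian model, and the `max_dist` threshold are all assumptions made for illustration, not the method used in the thesis.

```python
import numpy as np

def fit_skin_model(face_pixels):
    """Fit a Gaussian colour model to pixels sampled from a detected face region.

    face_pixels: (N, 3) array of colour values in any 3-channel colour space.
    The single-Gaussian model is an assumption made for this sketch.
    """
    pixels = np.asarray(face_pixels, dtype=float)
    mean = pixels.mean(axis=0)
    # A small ridge keeps the covariance invertible for near-uniform patches.
    cov = np.cov(pixels, rowvar=False) + 1e-6 * np.eye(pixels.shape[1])
    return mean, np.linalg.inv(cov)

def skin_mask(image, mean, inv_cov, max_dist=3.0):
    """Mark pixels whose Mahalanobis distance to the face colour model is small.

    image: (H, W, 3) array. max_dist is a hypothetical threshold.
    """
    diff = np.asarray(image, dtype=float) - mean
    # Squared Mahalanobis distance, computed per pixel.
    d2 = np.einsum("...i,ij,...j->...", diff, inv_cov, diff)
    return d2 <= max_dist ** 2
```

Because the model is fitted per image, it adapts to the lighting and skin tone of the person in that image, which is the point of learning the colour model locally.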

Our next contribution is a comprehensive dataset of 2D hand images. We collected this dataset from publicly available image sources and annotated the images with hand bounding boxes. The bounding boxes are not axis-aligned; instead, they are oriented with respect to the wrist. The dataset covers a wide range of hand shapes and layout configurations.
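
To make the notion of an oriented bounding box concrete, the sketch below computes the corners of a rotated rectangle. The parameterisation (centre, width, height, rotation angle) is an assumption for illustration; the abstract does not specify how the annotations are stored.

```python
import math

def oriented_box_corners(cx, cy, width, height, angle_rad):
    """Return the four corners of a rectangle centred at (cx, cy) and rotated by angle_rad.

    The (centre, size, angle) representation is hypothetical, chosen for this sketch.
    """
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    half_w, half_h = width / 2.0, height / 2.0
    offsets = [(-half_w, -half_h), (half_w, -half_h), (half_w, half_h), (-half_w, half_h)]
    # Rotate each corner offset about the centre, then translate.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]
```

With `angle_rad = 0` this reduces to an ordinary axis-aligned box, so the oriented form strictly generalises the usual annotation.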

Using our dataset, we train a hand detector that is robust to background clutter and lighting variations. Our hand detector is implemented as a two-stage system. The first stage proposes hand hypotheses using complementary image features, which are then evaluated by the second-stage classifier. This improves both precision and recall and results in a state-of-the-art hand detection method. In addition, we develop a new method of non-maximum suppression based on super-pixels.

We also contribute an efficient training algorithm for structured output ranking. In our algorithm, we reduce the time complexity of an expensive training component from quadratic to linear. This algorithm is broadly applicable, and we use it to solve human layout estimation and taxonomic multiclass classification problems.
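
The abstract does not describe how the quadratic-to-linear reduction is obtained. Purely as a generic illustration of how a quantity defined over all pairs can sometimes be computed in linear time, and not as the thesis's algorithm, the sketch below compares a double loop over (positive, negative) score pairs with an algebraically equivalent closed form. The function names are hypothetical.

```python
def pairwise_margin_sum_quadratic(pos_scores, neg_scores):
    """Sum of (p - n) over every positive/negative pair: O(|P| * |N|)."""
    total = 0.0
    for p in pos_scores:
        for n in neg_scores:
            total += p - n
    return total

def pairwise_margin_sum_linear(pos_scores, neg_scores):
    """The same sum, rearranged so each score is visited once: O(|P| + |N|)."""
    return len(neg_scores) * sum(pos_scores) - len(pos_scores) * sum(neg_scores)
```

Both functions return the same value because each positive score appears in exactly |N| pairs and each negative score in exactly |P| pairs.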

For human layout, we use different body part detectors to propose part candidates. These candidates are then combined and scored using our ranking algorithm. By applying this bottom-up approach, we achieve accurate human layout estimation despite variations in viewpoint and layout configuration. In the multiclass classification problem, we define the misclassification error using a class taxonomy. The problem then reduces to a structured output ranking problem and we use our ranking method to optimise it. This allows inclusion of semantic knowledge about the classes and results in a more meaningful classification system.
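
As a rough sketch of a misclassification error defined on a class taxonomy, the code below assumes the error is the number of edges between the true and predicted classes in a tree. Both the tree-distance definition and the toy taxonomy in the usage note are assumptions for illustration; the abstract does not state the exact loss used.

```python
def path_to_root(node, parent):
    """List the nodes from `node` up to the root, following the `parent` map."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def taxonomic_loss(true_cls, pred_cls, parent):
    """Number of edges between two classes in the taxonomy tree (assumed loss)."""
    true_path = path_to_root(true_cls, parent)
    pred_depth = {n: d for d, n in enumerate(path_to_root(pred_cls, parent))}
    # The first node on the true path that also lies on the predicted path
    # is the lowest common ancestor of the two classes.
    for d_true, node in enumerate(true_path):
        if node in pred_depth:
            return d_true + pred_depth[node]
    raise ValueError("classes do not share a root")
```

For a hypothetical taxonomy `parent = {"cat": "mammal", "dog": "mammal", "mammal": "animal", "sparrow": "bird", "bird": "animal"}`, confusing a cat with a dog costs 2 while confusing a cat with a sparrow costs 4, so semantically closer mistakes are penalised less.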

Lastly, we substantiate our ranking algorithm with theoretical proofs and derive generalisation bounds for it. These bounds prove that the training error reduces to the lowest possible error asymptotically.


Digital Origin: Born digital
Type of Award: DPhil
Level of Award: Doctoral
Awarding Institution: University of Oxford
Notes: This thesis is not currently available via ORA.
About The Authors
Website: http://www.robots.ox.ac.uk/~arpit/
Institution: University of Oxford
Faculty: Mathematical, Physical & Life Sciences Division - Engineering Science
Research Group: Visual Geometry Group
Oxford College: Brasenose College
 
Contributors
Prof Andrew Zisserman
Role: Supervisor
 
Prof Philip Torr
Role: Supervisor
 
Bibliographic Details
Issue Date: 2012
Copyright Date: 2012
Identifiers
Urn: uuid:bb290cfd-5216-42d7-b3d2-c2b4b01614bc
Item Description
Relationships
Member of collection: ora:thesis
Rights
Copyright Holder: Arpit Mittal