Learning Hierarchical Representations for Video Analysis Using Deep Learning
Author: Yang Yang
Publisher:
Published: 2013
Total Pages: 90
ISBN-13:
DOWNLOAD EBOOKBesides learning the low-level local features, higher level representations are further designed to be learned in the context of applications. The data-driven concept representations and sparse representation of the events are learned for complex event recognition; the representations for object body parts and structures are learned for object detection in videos; and the relational motion features and similarity metrics between video pairs are learned simultaneously for action verification. Second, in order to learn discriminative and compact features, we propose a new feature learning method using a deep neural network based on auto encoders. It differs from the existing unsupervised feature learning methods in two ways: first it optimizes both discriminative and generative properties of the features simultaneously, which gives our features a better discriminative ability. Second, our learned features are more compact, while the unsupervised feature learning methods usually learn a redundant set of over-complete features. Extensive experiments with quantitative and qualitative results on the tasks of human detection and action verification demonstrate the superiority of our proposed models.