Model-Based Visual Tracking: The OpenTL Framework

MODEL-BASED VISUAL TRACKING
The OpenTL Framework

GIORGIO PANIN

A JOHN WILEY & SONS, INC., PUBLICATION

Copyright © 2011 by John Wiley & Sons, Inc. All rights reserved.

Published by John Wiley & Sons, Inc., Hoboken, New Jersey. Published simultaneously in Canada.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permission.

Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.

Library of Congress Cataloging-in-Publication Data:

Panin, Giorgio, 1974–
Model-based visual tracking : the OpenTL framework / Giorgio Panin.
p. cm.
ISBN 978-0-470-87613-8 (cloth)
1. Computer vision–Mathematical models. 2. Automatic tracking–Mathematics. 3. Three-dimensional imaging–Mathematics. I. Title. II. Title: Open Tracking Library framework.
TA1634.P36 2011
006.3′7–dc22 2010033315

Printed in Singapore

oBook ISBN: 978-0-470-94392-2
ePDF ISBN: 978-0-470-94391-5
ePub ISBN: 978-1-118-00213-1

10 9 8 7 6 5 4 3 2 1

CONTENTS

PREFACE

1 INTRODUCTION
  1.1 Overview of the Problem
    1.1.1 Models
    1.1.2 Visual Processing
    1.1.3 Tracking
  1.2 General Tracking System Prototype
  1.3 The Tracking Pipeline

2 MODEL REPRESENTATION
  2.1 Camera Model
    2.1.1 Internal Camera Model
    2.1.2 Nonlinear Distortion
    2.1.3 External Camera Parameters
    2.1.4 Uncalibrated Models
    2.1.5 Camera Calibration
  2.2 Object Model
    2.2.1 Shape Model and Pose Parameters
    2.2.2 Appearance Model
    2.2.3 Learning an Active Shape or Appearance Model
  2.3 Mapping Between Object and Sensor Spaces
    2.3.1 Forward Projection
    2.3.2 Back-Projection
  2.4 Object Dynamics
    2.4.1 Brownian Motion
    2.4.2 Constant Velocity
    2.4.3 Oscillatory Model
    2.4.4 State Updating Rules
    2.4.5 Learning AR Models

3 THE VISUAL MODALITY ABSTRACTION
  3.1 Preprocessing
  3.2 Sampling and Updating Reference Features
  3.3 Model Matching with the Image Data
    3.3.1 Pixel-Level Measurements
    3.3.2 Feature-Level Measurements
    3.3.3 Object-Level Measurements
    3.3.4 Handling Mutual Occlusions
    3.3.5 Multiresolution Processing for Improving Robustness
  3.4 Data Fusion Across Multiple Modalities and Cameras
    3.4.1 Multimodal Fusion
    3.4.2 Multicamera Fusion
    3.4.3 Static and Dynamic Measurement Fusion
    3.4.4 Building a Visual Processing Tree

4 EXAMPLES OF VISUAL MODALITIES
  4.1 Color Statistics
    4.1.1 Color Spaces
    4.1.2 Representing Color Distributions
    4.1.3 Model-Based Color Matching
    4.1.4 Kernel-Based Segmentation and Tracking
  4.2 Background Subtraction
  4.3 Blobs
    4.3.1 Shape Descriptors
    4.3.2 Blob Matching Using Variational Approaches
  4.4 Model Contours
    4.4.1 Intensity Edges
    4.4.2 Contour Lines
    4.4.3 Local Color Statistics
  4.5 Keypoints
    4.5.1 Wide-Baseline Matching
    4.5.2 Harris Corners
    4.5.3 Scale-Invariant Keypoints
    4.5.4 Matching Strategies for Invariant Keypoints
  4.6 Motion
    4.6.1 Motion History Images
    4.6.2 Optical Flow
  4.7 Templates
    4.7.1 Pose Estimation with AAM
    4.7.2 Pose Estimation with Mutual Information

5 RECURSIVE STATE-SPACE ESTIMATION
  5.1 Target-State Distribution
  5.2 MLE and MAP Estimation
    5.2.1 Least-Squares Estimation
    5.2.2 Robust Least-Squares Estimation
  5.3 Gaussian Filters
    5.3.1 Kalman and Information Filters
    5.3.2 Extended Kalman and Information Filters
    5.3.3 Unscented Kalman and Information Filters
  5.4 Monte Carlo Filters
    5.4.1 SIR Particle Filter
    5.4.2 Partitioned Sampling
    5.4.3 Annealed Particle Filter
    5.4.4 MCMC Particle Filter
  5.5 Grid Filters

6 EXAMPLES OF TARGET DETECTORS
  6.1 Blob Clustering
    6.1.1 Localization with Three-Dimensional Triangulation
  6.2 AdaBoost Classifiers
    6.2.1 AdaBoost Algorithm for Object Detection
    6.2.2 Example: Face Detection
  6.3 Geometric Hashing
  6.4 Monte Carlo Sampling
  6.5 Invariant Keypoints

7 BUILDING APPLICATIONS WITH OpenTL
  7.1 Functional Architecture of OpenTL
    7.1.1 Multithreading Capabilities
  7.2 Building a Tutorial Application with OpenTL
    7.2.1 Setting the Camera Input and Video Output
    7.2.2 Pose Representation and Model Projection
    7.2.3 Shape and Appearance Model
    7.2.4 Setting the Color-Based Likelihood
    7.2.5 Setting the Particle Filter and Tracking the Object
    7.2.6 Tracking Multiple Targets
    7.2.7 Multimodal Measurement Fusion
  7.3 Other Application Examples

APPENDIX A: POSE ESTIMATION
  A.1 Point Correspondences
    A.1.1 Geometric Error
    A.1.2 Algebraic Error
    A.1.3 2D-2D and 3D-3D Transforms
    A.1.4 DLT Approach for 3D-2D Projections
  A.2 Line Correspondences
    A.2.1 2D-2D Line Correspondences
  A.3 Point and Line Correspondences
  A.4 Computation of the Projective DLT Matrices

APPENDIX B: POSE REPRESENTATION
  B.1 Poses Without Rotation
    B.1.1 Pure Translation
    B.1.2 Translation and Uniform Scale
    B.1.3 Translation and Nonuniform Scale
  B.2 Parameterizing Rotations
  B.3 Poses with Rotation and Uniform Scale
    B.3.1 Similarity
    B.3.2 Rotation and Uniform Scale
    B.3.3 Euclidean (Rigid Body) Transform
    B.3.4 Pure Rotation
  B.4 Affinity
  B.5 Poses with Rotation and Nonuniform Scale
  B.6 General Homography: The DLT Algorithm

NOMENCLATURE
BIBLIOGRAPHY
INDEX

PREFACE

Object tracking is a broad and important field in computer science, addressing a wide variety of applications in the educational, entertainment, industrial, and manufacturing areas. Since the early days of computer vision, the state of the art of visual object tracking has evolved greatly, along with the available imaging devices and computing hardware technology.

This book has two main goals: to provide a unified and structured review of this field, and to propose a corresponding software framework, the OpenTL library, developed at TUM-Informatik VI (Chair for Robotics and Embedded Systems). The main result of this work is to show how most real-world application scenarios can be cast naturally into a common description vocabulary, and therefore implemented and tested in a fully modular and scalable way, through the definition of a layered, object-oriented software architecture. The resulting architecture covers all processing levels in a seamless way, from raw data acquisition up to model-based object detection and sequential localization, and defines, at the application level, what we call the tracking pipeline. Within this framework, extensive use of graphics hardware (GPU computing) as well as distributed processing allows real-time performance for complex models and sensory systems.

The book is organized as follows. In Chapter 1 we present our approach to the object-tracking problem in the most abstract terms. In particular, we define the three main issues involved: models, vision, and tracking, a structure that we follow in subsequent chapters. A generic tracking system flow diagram, the main tracking pipeline, is presented in Section 1.3.
The model layer is described in Chapter 2, where specifications concerning the object (shape, appearance, degrees of freedom, and dynamics), as well as the sensory system, are given. In this context, particular care has been directed to the representation of the many possible degrees of freedom (pose parameters), to which Appendixes A and B are also dedicated. Our unified abstraction for visual feature processing, and the related data association and fusion schemes, are then discussed in Chapter 3. Subsequently, several concrete examples of visual modalities are provided in Chapter 4.

Several Bayesian tracking schemes that make effective use of the measurement processing are described in Chapter 5, again under a common abstraction: initialization, prediction, and correction. In Chapter 6 we address the challenging task of initial target detection and present some examples of more or less specialized algorithms for this purpose.

Application examples and results are given in Chapter 7. In particular, in Section 7.1 we provide an overview of the OpenTL layered class architecture along with a documented tutorial application, and in Section 7.3 we present a full prototype system description and implementation, followed by other examples of application instances and experimental results.

Acknowledgments

I am particularly grateful to my supervisor, Professor Alois Knoll, for having suggested, supported, and encouraged this challenging research, which is both theoretical and practical in nature. In particular, I wish to thank him for having initiated the Visual Tracking Group at the Chair for Robotics and Embedded Systems of the Technische Universität München Fakultät für Informatik, which began in May 2007 with the implementation of the OpenTL library, in which I participated as both a coordinator and an active programmer.

I also wish to thank Professor Knoll and Professor Gerhard Rigoll (Chair for Man-Machine Communication) for having initiated the Image-Based Tracking and Understanding (ITrackU) project of the Cognition for Technical Systems (CoTeSys [10]) research cluster of excellence, funded under the Excellence Initiative 2006 by the German Research Council (DFG). For his useful comments concerning the overall book organization and the introductory chapter, I also wish to thank our Chair, Professor Darius Burschka.

My acknowledgment to the Visual Tracking Group involves not only the code development and documentation of OpenTL, but also the many applications and related projects that were contributed, as well as helpful suggestions for solving the most confusing implementation details, thus providing very important contributions to this book, especially to Chapter 7. In particular, in this context I wish to mention Thorsten Röder, Claus Lenz, Sebastian Klose, Erwin Roth, Suraj Nair, Emmanuel Dean, Lili Chen, Thomas Müller, Martin Wojtczyk, and Thomas Friedlhuber.

Finally, the book contents are based partially on the undergraduate lectures on model-based visual tracking that I have given at the Chair since 2006. I therefore wish to express my deep sense of appreciation for the input and feedback of my students, some of whom later joined the Visual Tracking Group.
Giorgio Panin

CHAPTER 1

INTRODUCTION

Visual object tracking is concerned with the problem of sequentially localizing one or more objects in real time, by exploiting information from imaging devices through fast, model-based computer vision and image-understanding techniques (Fig. 1.1). Applications already span many fields of interest, including robotics, man-machine interfaces, video surveillance, computer-assisted surgery, and navigation systems. Recent surveys of the current state of the art have appeared in the literature (e.g., [169,101]), together with a variety of valuable and efficient methodologies.

Many of the low-level image processing and understanding algorithms involved in a visual tracking system can now be found in open-source vision libraries such as the Intel OpenCV [15], which provides a worldwide standard; at the same time, powerful programmable graphics hardware makes it possible both to visualize and to perform computations with very complex object models in negligible time on common PCs, using the facilities provided by the OpenGL [17] language and its extensions [19].

Despite these facts, to my knowledge no wide-scale software libraries for model-based visual tracking are available, and most existing software deals with more or less limited application domains, not easily allowing extensions or the inclusion of different methodologies in a modular and scalable way. A unifying, general-purpose, open framework is therefore becoming a compelling issue for both users and researchers in the field. This challenging target constitutes the main motivation of the present work, where a twofold goal is pursued:

1. Formulating a common and nonredundant description vocabulary for multimodal, multicamera, and multitarget visual tracking schemes
2. Implementing an object-oriented library that realizes the corresponding infrastructure, where both existing and novel systems can be built in terms of a simple application programming interface in a fully modular, scalable, and parallelizable way

1.1 OVERVIEW OF THE PROBLEM

The lack of a complete and general-purpose architecture for model-based tracking can be attributed in part to the apparent complexity of the problem: an extreme variety of scenarios with interacting objects, as well as many heterogeneous visual modalities that can be defined, processed, and combined in virtually infinite ways [169], may discourage any attempt to define a unifying framework. Nevertheless, a more careful analysis shows that many common properties can be identified across this variety and properly included in a common description vocabulary for most state-of-the-art systems. Of course, while designing a general-purpose toolkit, careful attention should be paid from the beginning to allowing developers to formulate algorithms without introducing redundant computations or indirect implementation schemes.
Toward this goal, we begin by highlighting the main issues addressed by OpenTL:

• Representing models of the object, sensors, and environment
• Performing visual processing, to obtain measurements associated with objects in order to carry out detection or state-updating procedures
• Tracking the objects through time using a prediction-measurement-update loop

Figure 1.1 Model-based object tracking. Left: object model; middle: visual features; right: estimated pose.

These items are outlined in Fig. 1.2 and discussed further in the following sections.

Figure 1.2 Overview of the three main aspects of an object tracking task: models, vision, and tracking.

1.1.1 Models

Object models consist of more or less specific prior knowledge about each object to be tracked, which depends on both the object and the application (Fig. 1.3). For example, a person model for visual surveillance can be represented by a very simple planar shape undergoing planar transformations, whereas for three-dimensional face tracking a deformable mesh can be used. The appearance model can also vary, from single reference pictures up to a full texture and reflectance map. Degrees of freedom (or pose parameters) define in which ways the base shape can be modified, and therefore how points in object coordinates map to world coordinates. Finally, dynamics is concerned with a model of the temporal evolution of an object's pose, shape, and appearance parameters.

Models of the sensory system are also required, and they may be more or less specific as well. In the video surveillance example, we have a monocular, uncalibrated camera where only the horizontal and vertical image resolution is given, so that pose parameters specify target motion in pixel coordinates. On the other hand, in a stereo or multicamera setup, full calibration parameters have to be provided, in terms of both the external camera positions and the internal acquisition model (Chapter 2), while the shape is given in three-dimensional metric units.

Information about the environment may also play a major role in visual tracking applications. Most notably, when the cameras are static and the light is more or less constant (or slowly changing), as for video surveillance in indoor environments, a background model can be estimated and updated over time, providing a powerful method for detecting generic targets in the visual field. Known obstacles such as tables or other items may also be included by restricting the pose space for the object, by means of penalty functions that avoid generating hypotheses in the "forbidden" regions. Moreover, they can be used to predict external occlusions and to avoid associating data in the occluded areas for a given view.¹

Figure 1.3 Specification of object models for a variety of applications.

¹ Conceptually, external occlusions are not to be confused with mutual occlusions (between tracked objects) or self-occlusions of a nonconvex object, such as those shown in Section 3.2. However, the same computational tools can be used to deal with external occlusions.
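As a concrete illustration of these ingredients, the following minimal sketch (plain C++ with invented names; it does not reproduce the OpenTL API) shows a planar object model: a shape in object coordinates, pose parameters (translation and uniform scale) that map it into pixel coordinates, and a constant-velocity dynamics used for prediction.

```cpp
// Minimal sketch (hypothetical types, not OpenTL): a planar shape, pose
// parameters mapping object coordinates to pixel coordinates, and a
// constant-velocity dynamics model for state prediction.
#include <cstdio>
#include <vector>

struct Point2 { double x, y; };

struct Pose {            // degrees of freedom of the target
    double tx, ty;       // translation, in pixel coordinates
    double s;            // uniform scale
    double vx, vy;       // translational velocity (used by the dynamics)
};

// Forward mapping: object coordinates -> sensor (pixel) coordinates.
Point2 objectToImage(const Pose& p, const Point2& q) {
    return { p.s * q.x + p.tx, p.s * q.y + p.ty };
}

// Constant-velocity dynamics: predict the pose after dt seconds.
Pose predict(const Pose& p, double dt) {
    Pose out = p;
    out.tx += p.vx * dt;
    out.ty += p.vy * dt;
    return out;
}

int main() {
    // A "very simple planar shape": the corners of a unit square.
    std::vector<Point2> shape = { {0, 0}, {1, 0}, {1, 1}, {0, 1} };
    Pose pose { 100.0, 50.0, 40.0, 5.0, -2.0 };

    Pose predicted = predict(pose, 1.0 / 30.0);   // one frame at 30 Hz
    for (const Point2& q : shape) {
        Point2 u = objectToImage(predicted, q);
        std::printf("(%.2f, %.2f)\n", u.x, u.y);
    }
}
```

A richer model would replace the similarity transform with a full camera projection and add shape and appearance parameters, but the division of labor (shape, pose mapping, dynamics) stays the same.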
1.1.2 Visual Processing

Visual processing deals with the extraction and association of useful information about objects from the sensory data, in order to update knowledge about the overall system state. In particular, for any application we need to specify which types of cues will be detected and used for each target (i.e., color, edges, motion, background, texture, depth, etc.) and at which level of abstraction (e.g., pixel-wise maps, shape- and/or appearance-related features). Throughout the book we refer to these cues as visual modalities.

Any of these modalities requires a preprocessing step, which does not depend in any way on the specific target or pose hypothesis but only on the image data, and a feature sampling step, where salient features related to the modality are sampled from the visible model surface under a given pose hypothesis: for example, salient keypoints, external contours, or color histograms. As we will see in Chapter 3, these features can also be updated with image data during tracking, to improve the adaptation capabilities and robustness of a system.

In the visual processing context, one crucial problem is data association, or matching: assessing in a deterministic or probabilistic way, possibly keeping multiple hypotheses, which of the observed data have been generated by a target or by background clutter, on the basis of the respective models, and possibly using the temporal state prediction from the tracker (static/dynamic association). In the most general case, data association must also deal with issues such as missing detections and false alarms, as well as multiple targets with mutual occlusions, which can make the problem computationally very complex. This complexity is usually reduced by setting validation gates around the positions predicted for each target, in order to avoid very unlikely associations that would produce excessively high measurement residuals, or innovations. We explore these aspects in detail in Chapters 3 and 4.

After data have been associated with targets, measurements from different modalities or sensors must be integrated in some way according to the measurement type, and possibly using the object dynamics as well (static/dynamic data fusion). Data fusion is often the key to increasing the robustness of a visual tracking system, which, by integrating independent information sources, can better cope with unpredicted situations such as light variations and model imperfections.

Once all the target-related measurements have been integrated, one final task concerns how to evaluate the likelihood of the measurements under the predicted state. This may involve single-hypothesis distributions such as a Gaussian, or multihypothesis models such as mixtures of Gaussians, and takes into account the measurement residuals as well as their uncertainties (or covariances).

As we will see in Chapter 4, the choice of an object model will in turn restrict, more or less, the choice of the visual modalities that can be employed: for example, a nontextured appearance such as the first two shown in Fig. 1.3 prevents the use of local keypoints or texture templates, whereas it makes it possible to use global statistics of color and edges.
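The sketch below illustrates two of the operations just described under simplifying assumptions (2D point features, an isotropic measurement covariance, and invented names; none of this is OpenTL's actual interface): nearest-neighbor data association restricted by a validation gate, and a single-hypothesis Gaussian likelihood of the resulting residual.

```cpp
// Illustrative sketch (hypothetical names, not OpenTL): gated nearest-
// neighbor data association, plus a Gaussian likelihood of the residual.
#include <cmath>
#include <optional>
#include <vector>

struct Meas { double x, y; };   // a 2D point measurement or prediction

// Associate a predicted feature with the nearest observation, but only if
// it falls inside the validation gate (here: a squared-distance threshold).
std::optional<Meas> associate(const Meas& predicted,
                              const std::vector<Meas>& observations,
                              double gate2) {
    std::optional<Meas> best;
    double bestD2 = gate2;                 // anything farther is rejected
    for (const Meas& z : observations) {
        double dx = z.x - predicted.x, dy = z.y - predicted.y;
        double d2 = dx * dx + dy * dy;
        if (d2 < bestD2) { bestD2 = d2; best = z; }
    }
    return best;                           // empty => missing detection
}

// Single-hypothesis Gaussian likelihood of the residual (innovation),
// assuming measurement covariance sigma^2 * I.
double likelihood(const Meas& z, const Meas& predicted, double sigma) {
    const double kPi = 3.14159265358979323846;
    double dx = z.x - predicted.x, dy = z.y - predicted.y;
    double d2 = dx * dx + dy * dy;
    return std::exp(-0.5 * d2 / (sigma * sigma)) / (2.0 * kPi * sigma * sigma);
}
```

In a full system the gate would typically be derived from the innovation covariance (a Mahalanobis-distance test) rather than a fixed Euclidean radius, so that the gate adapts to the current state uncertainty.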
1.1.3 Tracking

When a temporal sequence of data is given, we distinguish between two basic forms of object localization: detection and tracking.

In the detection phase, the system is initialized by providing prior knowledge about the state the first time, or whenever a new target enters the scene, for which temporal predictions are not yet available. This amounts to a global search, possibly based on the same off-line shape and appearance models, to detect the new target and localize it roughly in pose space. A fully autonomous system should also be able to detect when any target has been lost because of occlusions, or when it leaves the scene, and terminate the track accordingly.

Monitoring the quality of the estimation results is crucial in order to detect lost targets. This can be done in several ways, according to the prior models available; we mention here two typical examples (a minimal sketch of both tests is given at the end of this section):

• State statistics. A track loss can be declared whenever the estimated state statistics have a very high uncertainty compared with the expected dynamics; for example, in a Kalman filter the posterior covariance [33] can be used, while for particle filters other indices, such as particle survival diagnostics [29], are commonly employed.
• Measurement residuals. After a state update, measurement residuals can be used to assess tracking quality by declaring a lost target whenever the residuals (or their covariances) are too high.

In the tracking phase, measurement likelihoods are used to update the overall knowledge of the multitarget state, represented for each object by a more or less generic posterior statistic in a Bayesian prediction-correction context. Updating the state statistics involves feeding the measurement into a sequential estimator, which can be implemented in different ways according to the nature of the system, and where temporal dynamics are taken into account.
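The following fragment sketches the two track-loss tests above in plain C++ (thresholds, names, and the particular diagnostics are illustrative simplifications, not OpenTL code): a covariance-trace test for Gaussian filters, and an effective-sample-size test, one common degeneracy diagnostic for particle filters.

```cpp
// Hedged sketch of two track-loss tests (hypothetical names, not OpenTL).
#include <numeric>
#include <vector>

// (1) State statistics: declare a lost track when the trace of the
// posterior covariance (total state uncertainty) exceeds what the
// expected dynamics allow.
bool lostByCovariance(const std::vector<double>& covDiagonal,
                      double maxExpectedVariance) {
    double trace = std::accumulate(covDiagonal.begin(), covDiagonal.end(), 0.0);
    return trace > maxExpectedVariance;
}

// (2) Particle survival: with normalized weights w_i, the effective sample
// size N_eff = 1 / sum(w_i^2) drops toward 1 when almost all particles
// receive negligible weight; a small N_eff signals a degenerate track.
bool lostBySurvival(const std::vector<double>& weights,
                    double minSurvivalRate) {
    double sumSq = 0.0;
    for (double w : weights) sumSq += w * w;
    double nEff = (sumSq > 0.0) ? 1.0 / sumSq : 0.0;
    return nEff < minSurvivalRate * static_cast<double>(weights.size());
}
```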
1.2 GENERAL TRACKING SYSTEM PROTOTYPE

The issues mentioned above can be addressed by considering the standard target-oriented tracking approach (Fig. 1.4), which constitutes the starting point for developing our framework. The main system modules are:

• Models: off-line available priors about the objects and the sensors, and possibly environment information such as the background
• Track maintenance: input devices, measurement processing with local data association and fusion, Bayesian tracking, postprocessing, and visualization of the output
• Track initiation/termination: detection and recognition methods for track initialization and termination

Figure 1.4 High-level view of a target-oriented tracking system.

In this scheme we denote by Obj a multivariate state distribution representing our knowledge of the entire scenario of tracked objects, as we explain in Section 5.1. This representation has to be updated over time using the sensory data I_t from the cameras. In particular, the track initiation module processes sensory data with the purpose of localizing new targets as well as removing lost targets from the old set Obj_{t-1}, thus producing an updated set Obj_{t-1}^+, while the distribution of maintained targets is not modified. This module is used the first time (t = 0), when no predictions are available, but in general it may be called at any time during tracking.

The upper part of the system consists of the track maintenance modules, where existing targets are subject to prediction, measurement, and correction steps, which modify their state distribution using the sensory data and the available models. In the prediction step, the Bayesian tracker moves the old distributions Obj_{t-1} ahead to time t, according to the given dynamical models, producing the prior distribution Obj_t^-. Afterward, the measurement processing block uses the predicted states Obj_t^- to provide target-associated measurements Meas_t for the Bayesian update. With these data, the Bayesian update modifies the predicted prior into the posterior distribution Obj_t, which is the output of our system.

In the next section we consider in more detail the track maintenance substeps, which constitute what we call the tracking pipeline.

1.3 THE TRACKING PIPELINE

The main tracking pipeline is depicted in Fig. 1.5 in an "unfolded" view, where the following sequence takes place:

1. Data acquisition. Raw sensory data (images) are obtained from the input devices, with associated time stamps.²
2. State prediction. The Bayesian tracker generates one or more predictive hypotheses about the object states at the time stamp of the current data, based on the preceding state distribution and the system dynamics.
3. Preprocessing. Image data are processed in a model-free fashion, independent of any target hypothesis, providing unassociated data related to a given visual modality.
4. Sampling model features. A predicted target hypothesis, usually the average s_t^-, is used to sample good features for tracking from the unoccluded model surfaces. These features are back-projected into model space, for subsequent re-projection and matching at different hypotheses.
5. Data association. Reference features are matched against the preprocessed data to produce a set of target-associated measurements. These quantities are defined and computed differently (Section 3.3) according to the visual modality and the desired level of abstraction, possibly with multiple association hypotheses.
6. Data fusion. Target-associated data, obtained from all cameras and modalities, are combined to provide a global measurement vector, or a global likelihood, for the Bayesian update.
7. State update. The Bayesian tracker updates the posterior state statistics for each target by using the associated measurements or their likelihood. From this distribution, a meaningful output-state estimate is computed (e.g., the MAP or the weighted average) and used for visualization or subsequent postprocessing. When a ground truth is also available, the two can be compared to evaluate system performance.
8. Update of online features. The output state is used to sample, from the underlying image data, online reference features for the next frame.

Figure 1.5 Unfolded view of the tracking pipeline.

² In an asynchronous context, each sensor provides independent data and time stamps.
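To make the sequence concrete, the skeleton below arranges the eight steps as a single per-frame loop. All types and function names are hypothetical placeholders standing in for the corresponding OpenTL components (the real interfaces differ); the stub bodies only mark where each step's actual computation would occur.

```cpp
// Skeleton of the eight-step tracking pipeline (hypothetical placeholders,
// not the OpenTL API). Each stub marks where one pipeline step would run.
#include <vector>

struct Image {};                        // raw sensory data (one time stamp)
struct Features {};                     // modality-specific feature set
struct Measurement {};                  // target-associated measurement
struct State { double x = 0, y = 0; };  // per-target state estimate

Image acquire() { return {}; }                                    // 1. data acquisition
State predictState(const State& s, double /*dt*/) { return s; }   // 2. state prediction
Features preprocess(const Image&) { return {}; }                  // 3. model-free preprocessing
Features sampleModelFeatures(const State&) { return {}; }         // 4. sample model features
Measurement match(const Features&, const Features&) { return {}; }          // 5. data association
Measurement fuse(const std::vector<Measurement>& m) { return m.front(); }   // 6. data fusion
State update(const State& prior, const Measurement&) { return prior; }      // 7. state update
void updateOnlineFeatures(const State&, const Image&) {}          // 8. online features

// One pass of the pipeline over all maintained targets.
void trackOneFrame(std::vector<State>& targets, double dt) {
    Image frame = acquire();
    Features imageFeatures = preprocess(frame);
    for (State& target : targets) {
        State prior = predictState(target, dt);
        Features modelFeatures = sampleModelFeatures(prior);
        Measurement meas = match(modelFeatures, imageFeatures);
        Measurement fused = fuse({meas});   // a single modality/camera here
        target = update(prior, fused);
        updateOnlineFeatures(target, frame);
    }
}

int main() {
    std::vector<State> targets(1);
    trackOneFrame(targets, 1.0 / 30.0);
}
```

Steps 3 and 4 are independent of each other (one depends only on the image, the other only on the predicted state), which is what allows the preprocessing and feature-sampling stages to run in parallel in a multithreaded or GPU-assisted implementation.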
An example of a monomodal pipeline for three-dimensional object tracking is shown in Fig. 1.6, where the visual modality is given by local keypoints (Section 4.5). Here, preprocessing consists of detecting local features in the input image, while the average prediction, s_t^-, is used to [...] and sample features from the off-line model, by back-projection into object coordinates. Individual hypotheses are used during the matching process, where re-projected model features are associated with the nearest-neighbor image data. After Bayesian correction, residuals are minimized and the output state is e...

Figure 1.6 Example of a monomodal pipeline for three-dimensional object tracking.
