3D Object Pose Estimation in Industrial Context

Author: Giorgia Pitteri

Publisher:

Published: 2020

Total Pages: 0

ISBN-13:

3D object detection and pose estimation are of primary importance for tasks such as robotic manipulation and augmented reality, and they have been the focus of intense research in recent years. Methods relying on depth data acquired by depth cameras are robust. Unfortunately, active depth sensors are power hungry, and sometimes it is not possible to use them. It is therefore often desirable to rely on color images. When training machine learning algorithms that aim to estimate objects' 6D poses from images, many challenges arise, especially in industrial contexts that require handling objects with symmetries and generalizing to unseen objects, i.e. objects never seen by the networks during training. In this thesis, we first analyse the link between the symmetries of a 3D object and its appearance in images. Our analysis explains why symmetrical objects can be a challenge when training machine learning algorithms to predict their 6D pose from images. We then propose an efficient and simple solution that relies on the normalization of the pose rotation. This approach is general and can be used with any 6D pose estimation algorithm. Then, we address the second main challenge: the generalization to unseen objects. Many recent methods for 6D pose estimation are robust and accurate, but their success can be attributed to supervised Machine Learning approaches. For each new object, these methods have to be retrained on many different images of this object, which are not always available. Even if domain transfer methods allow for training such methods with synthetic images instead of real ones, at least to some extent, such training sessions take time, and it is highly desirable to avoid them in practice. We propose two methods to handle this problem. The first method relies only on the objects' geometries and focuses on objects with prominent corners, which covers a large number of industrial objects.
We first learn to detect object corners of various shapes in images and also to predict their 3D poses, by using training images of a small set of objects. To detect a new object in a given image, we first identify its corners from its CAD model; we also detect the corners visible in the image and predict their 3D poses. We then introduce a RANSAC-like algorithm that robustly and efficiently detects and estimates the object's 3D pose by matching its corners on the CAD model with their detected counterparts in the image. The second method overcomes the limitations of the first one, as it requires neither objects with specific corners nor the offline selection of corners on the CAD model. It combines Deep Learning and 3D geometry and relies on an embedding of the local 3D geometry to match the CAD models to the input images. For points at the surface of objects, this embedding can be computed directly from the CAD model; for image locations, we learn to predict it from the image itself. This establishes correspondences between 3D points on the CAD model and 2D locations of the input images. However, many of these correspondences are ambiguous, as many points may have similar local geometries. We also show that we can use Mask-RCNN in a class-agnostic way to detect the new objects without retraining and thus drastically limit the number of possible correspondences. We can then robustly estimate a 3D pose from these discriminative correspondences using a RANSAC-like algorithm.
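
The abstract above mentions normalizing the pose rotation so that symmetrical objects stop producing conflicting training targets. The sketch below illustrates the general idea only, and it is not the thesis's actual formulation: it assumes an object with a known, finite set of symmetry rotations and picks, among all equivalent rotations, the one closest to the identity (largest trace). The helper names (`rot_z`, `matmul`, `normalize_rotation`) and the 4-fold symmetry about the z axis are hypothetical choices made for illustration.

```python
import math


def rot_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace(m):
    return m[0][0] + m[1][1] + m[2][2]


def normalize_rotation(rotation, symmetries):
    """Return the rotation equivalent to `rotation` that is closest to identity.

    Every R * S, with S in `symmetries`, yields the same object appearance.
    The trace equals 1 + 2*cos(theta), so the largest trace means the
    smallest rotation angle. Ties are broken arbitrarily in this sketch.
    """
    candidates = [matmul(rotation, s) for s in symmetries]
    return max(candidates, key=trace)


# Assumed example: an object with 4-fold rotational symmetry about z.
symmetries = [rot_z(2.0 * math.pi * k / 4) for k in range(4)]

pose_a = rot_z(math.radians(10.0))
pose_b = rot_z(math.radians(100.0))  # looks identical to pose_a for this object

normalized_a = normalize_rotation(pose_a, symmetries)
normalized_b = normalize_rotation(pose_b, symmetries)
```

Both poses map to the same representative (a 10 degree rotation about z), so a network trained on normalized rotations would see one target for one appearance. Rotations that lie exactly between two symmetric equivalents remain ambiguous under this simple rule.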


Representations and Techniques for 3D Object Recognition and Scene Interpretation

Author: Derek Hoiem

Publisher: Morgan & Claypool Publishers

Published: 2011

Total Pages: 172

ISBN-13: 1608457281

One of the grand challenges of artificial intelligence is to enable computers to interpret 3D scenes and objects from imagery. This book organizes and introduces major concepts in 3D scene and object representation and inference from still images, with a focus on recent efforts to fuse models of geometry and perspective with statistical machine learning. The book is organized into three sections: (1) Interpretation of Physical Space; (2) Recognition of 3D Objects; and (3) Integrated 3D Scene Interpretation. The first discusses representations of spatial layout and techniques to interpret physical scenes from images. The second section introduces representations for 3D object categories that account for the intrinsically 3D nature of objects and provide robustness to change in viewpoints. The third section discusses strategies to unite inference of scene geometry and object pose and identity into a coherent scene interpretation. Each section broadly surveys important ideas from cognitive science and artificial intelligence research, organizes and discusses key concepts and techniques from recent work in computer vision, and describes a few sample approaches in detail. Newcomers to computer vision will benefit from introductions to basic concepts, such as single-view geometry and image classification, while experts and novices alike may find inspiration from the book's organization and discussion of the most recent ideas in 3D scene understanding and 3D object recognition. Specific topics include: mathematics of perspective geometry; visual elements of the physical scene, structural 3D scene representations; techniques and features for image and region categorization; historical perspective, computational models, and datasets and machine learning techniques for 3D object recognition; inferences of geometrical attributes of objects, such as size and pose; and probabilistic and feature-passing approaches for contextual reasoning about 3D objects and scenes. 
Table of Contents: Background on 3D Scene Models / Single-view Geometry / Modeling the Physical Scene / Categorizing Images and Regions / Examples of 3D Scene Interpretation / Background on 3D Recognition / Modeling 3D Objects / Recognizing and Understanding 3D Objects / Examples of 2D 1/2 Layout Models / Reasoning about Objects and Scenes / Cascades of Classifiers / Conclusion and Future Directions


Estimation of 3D Object Pose for Packing Problem with a Deep Learning Approach

Author: Andrés David Rodríguez Torres

Publisher:

Published: 2020

Total Pages:

ISBN-13:

This paper presents a deep learning approach to the pose estimation of boxes in a packing problem context. We divided the problem into two steps: detection and pose estimation. Each step is performed with a different convolutional neural network configured to complete its task without the excessive complexity that would be required to perform them simultaneously. The first neural network detects whether a grayscale image of the working environment, as captured by a Microsoft Kinect V2, contains a box. The second network predicts the two-dimensional position of each vertex of the box in the image plane from an RGB image. With this information, a depth channel of the image, and the pinhole camera model, we can estimate the position of the center of mass and the orientation of the box. We train and test both networks with synthetic data from a virtual scene of the workstation. For the detection problem, we achieved an accuracy of 99.5%. For the pose estimation problem, we registered a mean error of 17.78 millimeters for the center-of-mass distance and a mean error of 21.28 degrees for the orientation. Testing with real-world data remains pending, as does the use of other network architectures.
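
The step from 2D vertex positions plus depth to a 3D position can be sketched with the standard pinhole back-projection. This is a minimal illustration, not the paper's implementation: the intrinsics (`fx`, `fy`, `cx`, `cy`), the sample pixel coordinates, and the choice to approximate the center of mass by the mean of the back-projected vertices are all assumptions made here.

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with depth along the optical axis to camera coordinates.

    Standard pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def centroid(points):
    """Mean of a list of 3D points, used here as a stand-in for the center of mass."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


# Hypothetical intrinsics and vertex detections: (u, v, depth in meters).
FX, FY, CX, CY = 500.0, 500.0, 320.0, 240.0
vertices_2d = [(420.0, 240.0, 1.0), (220.0, 240.0, 1.0),
               (320.0, 340.0, 1.0), (320.0, 140.0, 1.0)]

vertices_3d = [backproject(u, v, z, FX, FY, CX, CY) for u, v, z in vertices_2d]
center = centroid(vertices_3d)
```

With these made-up values the first vertex back-projects to roughly (0.2, 0.0, 1.0) meters in the camera frame, and the four symmetric vertices average to a point on the optical axis at 1 meter depth.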


From Shape-based Object Recognition and Discovery to 3D Scene Interpretation

Author: Nadia Payet

Publisher:

Published: 2011

Total Pages: 130

ISBN-13:

This dissertation addresses a number of inter-related and fundamental problems in computer vision. Specifically, we address object discovery, recognition, segmentation, and 3D pose estimation in images, as well as 3D scene reconstruction and scene interpretation. The key ideas behind our approaches include using shape as a basic object feature, and using structured prediction modeling paradigms for representing objects and scenes. In this work, we make a number of new contributions both in computer vision and machine learning. We address the vision problems of shape matching, shape-based mining of objects in arbitrary image collections, context-aware object recognition, monocular estimation of 3D object poses, and monocular 3D scene reconstruction using shape from texture. Our work on shape-based object discovery is the first to show that meaningful objects can be extracted from a collection of arbitrary images, without any human supervision, by shape matching. We also show that a spatial repetition of objects in images (e.g., windows on a building facade, or cars lined up along a street) can be used for 3D scene reconstruction from a single image. The aforementioned topics have never been addressed in the literature. The dissertation also presents new algorithms and object representations for the aforementioned vision problems. We fuse two traditionally different modeling paradigms, Conditional Random Fields (CRF) and Random Forests (RF), into a unified framework, referred to as (RF)^2. We also derive theoretical error bounds for estimating distribution ratios with a two-class RF, which are then used to derive the theoretical performance bounds of a two-class (RF)^2. Thorough experimental evaluation of individual aspects of all our approaches is presented. In general, the experiments demonstrate that we outperform the state of the art on the benchmark datasets, without increasing complexity and supervision in training.


3D Computer Vision

Author: Christian Wöhler

Publisher: Springer Science & Business Media

Published: 2009-07-28

Total Pages: 391

ISBN-13: 3642017320

This work provides an introduction to the foundations of three-dimensional computer vision and describes recent contributions to the field, which are of methodical and application-specific nature. Each chapter of this work provides an extensive overview of the corresponding state of the art, into which a detailed description of new methods or evaluation results in application-specific systems is embedded. Geometric approaches to three-dimensional scene reconstruction (cf. Chapter 1) are primarily based on the concept of bundle adjustment, which was developed more than 100 years ago in the domain of photogrammetry. The three-dimensional scene structure and the intrinsic and extrinsic camera parameters are determined such that the Euclidean backprojection error in the image plane is minimised, usually relying on a nonlinear optimisation procedure. In the field of computer vision, an alternative framework based on projective geometry has emerged during the last two decades, which allows the use of linear algebra techniques for three-dimensional scene reconstruction and camera calibration purposes. With special emphasis on the problems of stereo image analysis and camera calibration, these fairly different approaches are related to each other in the presented work, and their advantages and drawbacks are stated. In this context, various state-of-the-art camera calibration and self-calibration methods as well as recent contributions towards automated camera calibration systems are described. An overview of classical and new feature-based, correlation-based, dense, and spatio-temporal methods for establishing point correspondences between pairs of stereo images is given.
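
The backprojection error that bundle adjustment minimises is commonly written in the following form. The notation is an assumption introduced here and is not taken from the book: $\mathbf{X}_j$ is the $j$-th 3D scene point, $\mathbf{x}_{ij}$ its measured position in image $i$, $K_i$, $R_i$, $\mathbf{t}_i$ the intrinsic and extrinsic parameters of camera $i$, $\pi$ the projection into the image plane, and $v_{ij}$ equals 1 if point $j$ is visible in image $i$ and 0 otherwise.

```latex
E\bigl(\{K_i, R_i, \mathbf{t}_i\}, \{\mathbf{X}_j\}\bigr)
  = \sum_{i=1}^{N} \sum_{j=1}^{M} v_{ij}
    \left\| \mathbf{x}_{ij} - \pi\bigl(K_i \left( R_i \mathbf{X}_j + \mathbf{t}_i \right)\bigr) \right\|^2
```

Because $\pi$ involves a division by the depth coordinate, $E$ is nonlinear in the unknowns, which is why a nonlinear optimisation procedure is usually required.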