3D Object Detection and Tracking for Autonomous Vehicles

Author: Su Pang

Publisher:

Published: 2022

Total Pages: 0

ISBN-13:

Autonomous driving systems require accurate 3D object detection and tracking to achieve reliable path planning and navigation. For object detection, there have been significant advances in neural networks for single-modality approaches. However, it has been surprisingly difficult to train networks to use multiple modalities in a way that demonstrates a gain over single-modality networks. In this dissertation, we first propose three networks for Camera-LiDAR and Camera-Radar fusion. For Camera-LiDAR fusion, CLOCs (Camera-LiDAR Object Candidates fusion) and Fast-CLOCs are presented. CLOCs provides a multi-modal fusion framework that significantly improves the performance of single-modality detectors. It operates on the combined output candidates of any 2D and any 3D detector before Non-Maximum Suppression (NMS), and is trained to leverage their geometric and semantic consistencies to produce more accurate 3D detection results. Fast-CLOCs runs in near real-time with lower computational requirements than CLOCs: it eliminates the separate, heavy 2D detector and instead uses a 3D detector-cued 2D image detector (3D-Q-2D) to reduce memory and computation. For Camera-Radar fusion, we propose TransCAR, a Transformer-based Camera-And-Radar fusion solution for 3D object detection. The cross-attention layers in its transformer decoder adaptively learn a soft association between radar features and vision queries, instead of a hard association based only on sensor calibration.

We then solve the 3D multiple object tracking (MOT) problem for autonomous driving using a random finite set-based (RFS) Multiple Measurement Models filter (RFS-M3). In particular, we propose multiple measurement models for a Poisson multi-Bernoulli mixture (PMBM) filter in support of different application scenarios. The RFS-M3 filter naturally and accurately models the uncertainties that arise in these scenarios. We combine learning-based detections with the RFS-M3 tracker by incorporating the detection confidence score into the PMBM prediction and update steps.

We evaluate the CLOCs, Fast-CLOCs and TransCAR fusion-based 3D detectors and the RFS-M3 3D tracker on challenging datasets released by academia and industry leaders, including KITTI, nuScenes, Argoverse and Waymo. The experimental results demonstrate the effectiveness of the proposed approaches.
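
As a rough illustration of the candidate-level fusion idea behind CLOCs, the sketch below pairs pre-NMS 2D and 3D detection candidates by their geometric consistency in the image plane and lets a small learned network refine the 3D confidence scores from the paired semantic cues. The pairing features, layer sizes, and the assumption that the 3D boxes have already been projected into the image with the camera calibration are illustrative simplifications, not the dissertation's exact architecture.

```python
# Hedged sketch: candidate-level camera-LiDAR fusion in the spirit of CLOCs.
# Pre-NMS 2D and 3D candidates are paired by image-plane IoU, and a small
# learned network refines the 3D confidence from the paired scores. Feature
# choices and network sizes are illustrative assumptions only.
import torch
import torch.nn as nn


def box_iou_2d(boxes_a, boxes_b):
    """Pairwise IoU between axis-aligned image boxes given as (x1, y1, x2, y2)."""
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    lt = torch.max(boxes_a[:, None, :2], boxes_b[None, :, :2])
    rb = torch.min(boxes_a[:, None, 2:], boxes_b[None, :, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area_a[:, None] + area_b[None, :] - inter + 1e-6)


class CandidateFusion(nn.Module):
    """Scores every (3D candidate, 2D candidate) pair and refines 3D confidences."""

    def __init__(self):
        super().__init__()
        # Per-pair features: [image IoU, 2D score, 3D score] -> fused confidence.
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),
        )

    def forward(self, proj_3d_boxes, scores_3d, boxes_2d, scores_2d):
        # proj_3d_boxes: 3D candidates already projected to image boxes (N3, 4).
        iou = box_iou_2d(proj_3d_boxes, boxes_2d)               # (N3, N2)
        feat = torch.stack([
            iou,
            scores_2d[None, :].expand_as(iou),
            scores_3d[:, None].expand_as(iou),
        ])                                                       # (3, N3, N2)
        pair_logits = self.net(feat[None]).squeeze(0).squeeze(0)
        # Each 3D candidate keeps its best-matching 2D evidence.
        return torch.sigmoid(pair_logits.max(dim=1).values)     # refined (N3,) scores
```

Taking the maximum over 2D candidates means each 3D candidate simply keeps its best supporting image evidence; a practical system would also keep the pairwise tensor sparse, since most candidate pairs never overlap, which is the kind of efficiency concern that motivates Fast-CLOCs.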


Advances in Physical Agents II

Author: Luis M. Bergasa

Publisher: Springer Nature

Published: 2020-11-02

Total Pages: 362

ISBN-13: 3030625796

The book reports on cutting-edge Artificial Intelligence (AI) theories and methods aimed at the control and coordination of agents acting and moving in a dynamic environment. It covers a wide range of topics relating to: autonomous navigation, localization and mapping; mobile and social robots; multiagent systems; human-robot interaction; perception systems; and deep-learning techniques applied to robotics. Based on the 21st edition of the International Workshop of Physical Agents (WAF 2020), held virtually from Alcalá de Henares, Madrid, Spain, on November 19-20, 2020, this book offers a snapshot of the state of the art in the field of physical agents, with a special emphasis on novel AI techniques in perception, navigation and human-robot interaction for autonomous systems.


3D Online Multi-object Tracking for Autonomous Driving

Author: Venkateshwaran Balasubramanian

Publisher:

Published: 2019

Total Pages: 72

ISBN-13:

This research work explores a novel 3D multi-object tracking architecture, 'FANTrack: 3D Multi-Object Tracking with Feature Association Network', for autonomous driving, based on tracking-by-detection and online tracking strategies that use deep learning architectures for data association. The multi-target tracking problem aims to assign noisy detections to an a-priori unknown and time-varying number of tracked objects across a sequence of frames. A majority of existing solutions focus on either tediously designing cost functions or formulating the task of data association as a complex optimization problem that can be solved effectively. Instead, we exploit the power of deep learning and formulate the data association problem as inference in a CNN. To this end, we propose to learn a similarity function that combines cues from both image and spatial features of objects. The proposed approach consists of a similarity network that predicts the similarity scores of object pairs and builds a local similarity map. A second network then uses the similarity scores and spatial information to perform the data association as inference in a CNN. The model learns to perform global assignments in 3D purely from data, handles noisy detections and a varying number of targets, and is easy to train. Experiments on the challenging KITTI dataset show results competitive with the state of the art. The model is implemented in ROS and deployed on our autonomous vehicle to demonstrate its robustness and online tracking capabilities. The proposed tracker runs alongside the object detector while utilizing compute resources efficiently.
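
As a rough sketch of learned, similarity-driven data association in the spirit of FANTrack, the snippet below scores every track/detection pair from concatenated appearance and 3D box features and then resolves the assignment. FANTrack itself builds a local similarity map and performs the assignment with a CNN; the MLP scorer, the feature dimensions, and the final Hungarian step here are simplifying assumptions for illustration only.

```python
# Hedged sketch: learned pairwise similarity plus one-to-one assignment for
# tracking-by-detection. Not FANTrack's exact architecture -- only the core
# idea of scoring track/detection pairs from appearance and spatial cues.
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment


class PairSimilarity(nn.Module):
    """Predicts a similarity score for each (track, detection) feature pair."""

    def __init__(self, app_dim=128, box_dim=7):
        super().__init__()
        # Each object feature = appearance embedding + 3D box parameters.
        self.mlp = nn.Sequential(
            nn.Linear(2 * (app_dim + box_dim), 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def forward(self, track_feat, det_feat):
        # track_feat: (T, D), det_feat: (N, D) with D = app_dim + box_dim.
        t = track_feat[:, None, :].expand(-1, det_feat.shape[0], -1)
        d = det_feat[None, :, :].expand(track_feat.shape[0], -1, -1)
        return self.mlp(torch.cat([t, d], dim=-1)).squeeze(-1)  # (T, N) scores


def associate(sim_scores, min_similarity=0.0):
    """One-to-one assignment that maximizes total similarity, then gates weak pairs."""
    cost = -sim_scores.detach().cpu().numpy()
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols)
            if sim_scores[r, c].item() > min_similarity]
```

Unmatched detections would spawn new tracks and unmatched tracks would age out, following the usual online tracking-by-detection loop.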


Point Cloud Processing for Environmental Analysis in Autonomous Driving using Deep Learning

Author: Martin Simon

Publisher: BoD – Books on Demand

Published: 2023-01-01

Total Pages: 194

ISBN-13: 3863602722

Self-driving cars need a highly precise perception system of their environment that works in every conceivable scenario. Therefore, different kinds of sensors, such as lidar scanners, are in use. This thesis contributes highly efficient algorithms for 3D object recognition to the scientific community. It provides a deep neural network with specialized layers and a novel loss to safely localize and estimate the orientation of objects from point clouds originating from lidar sensors. First, a single-shot 3D object detector is developed that outputs dense predictions in only one forward pass. Next, this detector is refined by fusing complementary semantic features from cameras and by joint probabilistic tracking to stabilize predictions and filter outliers. The last part presents an evaluation of data from automotive-grade lidar scanners. A Generative Adversarial Network is also developed as an alternative for target-specific artificial data generation.
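
One common way to realize a single-shot detector with dense per-cell predictions and an orientation loss that avoids angle wrap-around is to regress the heading as a (sin, cos) pair on a bird's-eye-view grid. The sketch below shows that pattern; the grid resolution, channel layout, and loss weighting are assumptions for illustration and not the thesis' exact network or loss.

```python
# Hedged sketch: a dense, single-shot BEV detection head with a wrap-around-safe
# heading loss. Channel layout and loss weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DenseBEVHead(nn.Module):
    """Predicts objectness, box offsets, and heading for every BEV grid cell."""

    def __init__(self, in_channels=64):
        super().__init__()
        # Per cell: 1 objectness + 6 box params (dx, dy, z, w, l, h) + 2 heading.
        self.head = nn.Conv2d(in_channels, 1 + 6 + 2, kernel_size=1)

    def forward(self, bev_features):                 # (B, C, H, W)
        out = self.head(bev_features)
        obj = out[:, 0:1]                            # objectness logits
        box = out[:, 1:7]                            # box regression targets
        heading = F.normalize(out[:, 7:9], dim=1)    # unit (sin, cos) vector
        return obj, box, heading


def detection_loss(obj, box, heading, gt_obj, gt_box, gt_yaw):
    """Combined loss; heading is compared in (sin, cos) space to avoid angle wrap.

    gt_obj: (B, 1, H, W) occupancy in {0, 1}; gt_box: (B, 6, H, W); gt_yaw: (B, H, W).
    """
    gt_heading = torch.stack([torch.sin(gt_yaw), torch.cos(gt_yaw)], dim=1)
    l_obj = F.binary_cross_entropy_with_logits(obj, gt_obj)
    mask = gt_obj                                     # regress only on occupied cells
    l_box = (F.smooth_l1_loss(box, gt_box, reduction="none") * mask).mean()
    l_yaw = (F.smooth_l1_loss(heading, gt_heading, reduction="none") * mask).mean()
    return l_obj + l_box + l_yaw
```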


Interlacing Self-Localization, Moving Object Tracking and Mapping for 3D Range Sensors

Author: Frank Moosmann

Publisher: KIT Scientific Publishing

Published: 2014-05-13

Total Pages: 154

ISBN-13: 3866449771

This work presents a solution for autonomous vehicles to detect arbitrary moving traffic participants and to precisely determine the motion of the vehicle. The solution is based on three-dimensional images captured with modern range sensors such as high-resolution laser scanners. As a result, objects are tracked and a detailed 3D model is built for each object and for the static environment. The performance is demonstrated in challenging urban environments that contain many different objects.


Robust Environmental Perception and Reliability Control for Intelligent Vehicles

Author: Huihui Pan

Publisher: Springer Nature

Published: 2023-11-25

Total Pages: 308

ISBN-13: 9819977908

This book presents the most recent state-of-the-art algorithms on robust environmental perception and reliability control for intelligent vehicle systems. By integrating object detection, semantic segmentation, trajectory prediction, multi-object tracking, multi-sensor fusion, and reliability control in a systematic way, the book aims to guarantee that intelligent vehicles can run safely in complex road traffic scenes. It:

- Applies multi-sensor data fusion-based neural networks to environmental perception fault-tolerance algorithms, using data redundancy to preserve perception reliability when some sensors fail (see the sketch after this list).
- Presents a camera-based monocular approach to robust perception tasks, which introduces sequential feature association and depth hint augmentation along with seven adaptive methods.
- Proposes efficient and robust semantic segmentation of traffic scenes through real-time deep dual-resolution networks and representation separation of vision transformers.
- Focuses on trajectory prediction, proposing phased and progressive trajectory prediction methods that are more consistent with human psychological characteristics and take both social interactions and personal intentions into account.
- Puts forward methods based on conditional random fields and multi-task segmentation learning to solve the robust multi-object tracking problem for environment perception in autonomous vehicle scenarios.
- Presents novel reliability control strategies for intelligent vehicles to optimize dynamic tracking performance, and investigates tracking for autonomous vehicles with completely unknown dynamics and actuator faults.
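
A minimal sketch of the redundancy idea from the first bullet: if one modality fails a simple health check, the perception cycle degrades gracefully to the remaining sensors instead of failing outright. The sensor names, the health heuristics, and the fallback policy are illustrative assumptions, not the book's fault-tolerance algorithms.

```python
# Hedged sketch: fault-tolerant perception through sensor redundancy. A modality
# that fails a health check is dropped for the current cycle and fusion proceeds
# with whatever redundant subset remains. Heuristics here are assumptions only.
from dataclasses import dataclass
from typing import Dict, List, Optional

import numpy as np


@dataclass
class SensorFrame:
    name: str
    timestamp: float
    data: Optional[np.ndarray]          # None when the sensor dropped out


def healthy(frame: SensorFrame, now: float, max_age: float = 0.2) -> bool:
    """A frame is usable if it exists, is recent, and contains no invalid values."""
    return (
        frame.data is not None
        and now - frame.timestamp <= max_age
        and bool(np.isfinite(frame.data).all())
    )


def modalities_for_fusion(frames: Dict[str, SensorFrame], now: float) -> List[str]:
    """Return the modalities used this cycle, preferring full fusion when possible."""
    usable = [name for name, frame in frames.items() if healthy(frame, now)]
    if not usable:
        raise RuntimeError("all perception sensors failed this cycle")
    # e.g. camera + lidar + radar when all are healthy, otherwise a degraded
    # but still redundant subset such as lidar + radar.
    return usable
```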


Practical Foundations for Programming Languages

Author: Robert Harper

Publisher: Cambridge University Press

Published: 2016-04-04

Total Pages: 513

ISBN-13: 1107150302

This book unifies a broad range of programming language concepts under the framework of type systems and structural operational semantics.


Sensor Fusion for 3D Object Detection for Autonomous Vehicles

Author: Yahya Massoud

Publisher:

Published: 2021

Total Pages:

ISBN-13:

Thanks to major advancements in hardware and computational power, sensor technology, and artificial intelligence, the race for fully autonomous driving systems is heating up. Faced with countless challenging conditions and driving scenarios, researchers are tackling the hardest problems in driverless cars. One of the most critical components is the perception module, which enables an autonomous vehicle to "see" and "understand" its surrounding environment. Given that modern vehicles can carry a large number of sensors and available data streams, this thesis presents a deep learning-based framework that leverages multimodal data, i.e. sensor fusion, to perform 3D object detection and localization. We provide an extensive review of the advances in deep learning-based methods in computer vision, specifically in 2D and 3D object detection tasks. We also study the progress of the literature in both single-sensor and multi-sensor data fusion techniques. Furthermore, we present an in-depth explanation of our proposed approach, which performs sensor fusion using input streams from LiDAR and camera sensors and aims to simultaneously perform 2D, 3D, and Bird's Eye View detection. Our experiments highlight the importance of learnable data fusion mechanisms and multi-task learning, the impact of different CNN design decisions, speed-accuracy tradeoffs, and ways to deal with overfitting in multi-sensor data fusion frameworks.
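
To make the fusion-and-multi-task idea concrete, the sketch below fuses spatially aligned camera and LiDAR feature maps with a learned convolutional block and attaches separate 2D, 3D, and Bird's Eye View detection heads to the shared representation. The channel sizes, the concatenation-based fusion, and the head output layouts are illustrative assumptions rather than the thesis' architecture.

```python
# Hedged sketch: learnable camera-LiDAR feature fusion with multi-task detection
# heads (2D, 3D, and bird's-eye-view). Channel sizes and head layouts are
# illustrative assumptions only.
import torch
import torch.nn as nn


class LearnableFusion(nn.Module):
    """Fuses per-location camera and LiDAR features with learned convolutions."""

    def __init__(self, cam_ch=64, lidar_ch=64, fused_ch=128, num_classes=3):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(cam_ch + lidar_ch, fused_ch, kernel_size=3, padding=1),
            nn.BatchNorm2d(fused_ch),
            nn.ReLU(inplace=True),
        )
        # Multi-task heads share the fused representation.
        self.head_2d = nn.Conv2d(fused_ch, num_classes + 4, kernel_size=1)   # cls + 2D box
        self.head_3d = nn.Conv2d(fused_ch, num_classes + 7, kernel_size=1)   # cls + 3D box
        self.head_bev = nn.Conv2d(fused_ch, num_classes + 5, kernel_size=1)  # cls + BEV box

    def forward(self, cam_feat, lidar_feat):
        # Both feature maps are assumed to be aligned to a common grid beforehand
        # (e.g. camera features projected using the sensor calibration).
        fused = self.fuse(torch.cat([cam_feat, lidar_feat], dim=1))
        return self.head_2d(fused), self.head_3d(fused), self.head_bev(fused)


# Usage sketch with dummy aligned feature maps.
if __name__ == "__main__":
    model = LearnableFusion()
    cam = torch.randn(1, 64, 200, 176)
    lidar = torch.randn(1, 64, 200, 176)
    det_2d, det_3d, det_bev = model(cam, lidar)
```

Training the three heads jointly on a shared fused representation is one straightforward way to exercise the multi-task learning and learnable-fusion effects that the experiments in this thesis examine.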