Advanced Interpretable Machine Learning Methods for Clinical NGS Big Data of Complex Hereditary Diseases

Author: Yudong Cai

Published: 2020

Total Pages: 234

ISBN-13: 9782889662746

This eBook is a collection of articles from a Frontiers Research Topic. Frontiers Research Topics are very popular trademarks of the Frontiers Journals Series: they are collections of at least ten articles, all centered on a particular subject. With their unique mix of varied contributions from Original Research to Review Articles, Frontiers Research Topics unify the most influential researchers, the latest key findings and historical advances in a hot research area! Find out more on how to host your own Frontiers Research Topic or contribute to one as an author by contacting the Frontiers Editorial Office: frontiersin.org/about/contact.


Big Data in Omics and Imaging

Author: Momiao Xiong

Publisher: CRC Press

Published: 2017-12-01

Total Pages: 668

ISBN-13: 1498725805

Big Data in Omics and Imaging: Association Analysis addresses recent developments in association analysis and machine learning for both population and family genomic data in the sequencing era. It is unique in that it presents both hypothesis testing and a data mining approach to holistically dissecting the genetic structure of complex traits and to designing efficient strategies for precision medicine. The general frameworks for association analysis and machine learning developed in the text can be applied to genomic, epigenomic and imaging data.

FEATURES

- Bridges the gap between the traditional statistical methods and computational tools for small genetic and epigenetic data analysis and the modern advanced statistical methods for big data
- Provides tools for high-dimensional data reduction
- Discusses searching algorithms for model and variable selection, including randomization algorithms, proximal methods and matrix subset selection
- Provides real-world examples and case studies
- Will have an accompanying website with R code

The book is designed for graduate students and researchers in genomics, bioinformatics, and data science. It represents the paradigm shift in genetic studies of complex diseases: from shallow to deep genomic analysis, from low-dimensional to high-dimensional data, from multivariate to functional data analysis with next-generation sequencing (NGS) data, and from homogeneous populations to heterogeneous population and pedigree data analysis. Topics covered are: advanced matrix theory, convex optimization algorithms, generalized low rank models, functional data analysis techniques, deep learning principles, and machine learning methods for modern association, interaction, pathway and network analysis of rare and common variants, biomarker identification, and disease risk and drug response prediction.


Clinical Applications for Next-Generation Sequencing

Author: Urszula Demkow

Publisher: Academic Press

Published: 2015-09-10

Total Pages: 336

ISBN-13: 0128018410

Clinical Applications for Next Generation Sequencing provides readers with an outstanding postgraduate resource to learn about the translational use of NGS in clinical environments. Rooted in both medical genetics and clinical medicine, the book fills the gap between state-of-the-art technology and evidence-based practice, providing an educational opportunity for users to advance patient care by transferring NGS to the needs of real-world patients. The book builds an interface between genetic laboratory staff and clinical health workers to not only improve communication, but also strengthen cooperation. Users will find valuable tactics they can use to build a systematic framework for understanding the role of NGS testing in both common and rare diseases and conditions, from prenatal care, like chromosomal abnormalities, up to advanced age problems like dementia.

- Fills the gap between state-of-the-art technology and evidence-based practice
- Provides an educational opportunity which advances patient care through the transfer of NGS to real-world patient assessment
- Promotes a practical tool that clinicians can apply directly to patient care
- Includes a systematic framework for understanding the role of NGS testing in many common and rare diseases
- Presents evidence regarding the important role of NGS in current diagnostic strategies


Machine Learning Advanced Dynamic Omics Data Analysis for Precision Medicine

Author: Tao Zeng

Published: 2020

Precision medicine is being developed as a preventative, diagnostic and treatment tool to combat complex human diseases in a personalized manner. Through high-throughput technologies, dynamic 'omics data, including genetics, epigenetics and even metagenomics, have produced temporal-spatial big biological datasets which can be associated with individual genotypes underlying pathogen progressive phenotypes. It is therefore necessary to investigate how to integrate these multi-scale 'omics datasets to distinguish novel individual-specific disease causes from conventional cohort-common disease causes. Currently, machine learning plays an important role in biological and biomedical research, especially in the analysis of big 'omics data. However, in contrast to traditional big social data, 'omics datasets are currently almost always "small-sample-high-dimension", which causes overwhelming application problems and also introduces new challenges: (1) Big 'omics datasets can be extremely unbalanced, due to the difficulty of obtaining enough positive samples of rare mutations or rare diseases; (2) Many machine learning models are "black boxes," which is acceptable in social applications; in biological or biomedical fields, however, knowledge of the molecular mechanisms underlying any disease or biological study is necessary to deepen our understanding; (3) The genotype-phenotype association is a "white clue" captured in conventional big data studies, but identifying "causality" rather than association would be more helpful for physicians or biologists, as it can be used to determine an experimental target as the subject of future research.
Therefore, to simultaneously improve phenotype discrimination and genotype interpretability for complex diseases, it is necessary to: (1) design and implement new machine learning technologies that integrate prior knowledge with new 'omics datasets and provide transferable learning methods by combining multiple sources of data; (2) develop new network-based theories and methods to balance the trade-off between accuracy and interpretability of machine learning in biomedical and biological domains; and (3) enhance causal inference on "small-sample-high-dimension" data to capture personalized causal relationships.


Handbook of Machine Learning Applications for Genomics

Author: Sanjiban Sekhar Roy

Publisher: Springer Nature

Published: 2022-06-23

Total Pages: 222

ISBN-13: 9811691584

Machine learning currently plays a pivotal role in the progress of genomics, and its applications are helping researchers understand the emerging trends and future scope of the field. This book provides comprehensive coverage of machine learning applications, such as DNNs, CNNs, and RNNs, for predicting the sequence specificity of DNA- and RNA-binding proteins, gene expression, and splicing control. In addition, the book addresses multiomics data analysis of cancers using tensor decomposition, machine learning techniques for protein engineering, CNN applications in genomics, challenges of long noncoding RNAs in human disease diagnosis, and how machine learning can be used as a tool to shape the future of medicine. More importantly, it gives a comparative analysis and validates the outcomes of machine learning methods on genomic data against functional laboratory tests or formal clinical assessment. The topics of this book will interest academics and practitioners working in functional genomics and machine learning. The book will also serve as a comprehensive guide for graduate students, postgraduates, and Ph.D. scholars working in these fields.


Big Data Analytics in Genomics

Author: Ka-Chun Wong

Publisher: Springer

Published: 2016-10-24

Total Pages: 426

ISBN-13: 3319412795

This contributed volume explores the emerging intersection between big data analytics and genomics. Recent sequencing technologies have enabled high-throughput sequencing data generation for genomics resulting in several international projects which have led to massive genomic data accumulation at an unprecedented pace. To reveal novel genomic insights from this data within a reasonable time frame, traditional data analysis methods may not be sufficient or scalable, forcing the need for big data analytics to be developed for genomics. The computational methods addressed in the book are intended to tackle crucial biological questions using big data, and are appropriate for either newcomers or veterans in the field. This volume offers thirteen peer-reviewed contributions, written by international leading experts from different regions, representing Argentina, Brazil, China, France, Germany, Hong Kong, India, Japan, Spain, and the USA. In particular, the book surveys three main areas: statistical analytics, computational analytics, and cancer genome analytics. Sample topics covered include: statistical methods for integrative analysis of genomic data, computation methods for protein function prediction, and perspectives on machine learning techniques in big data mining of cancer. Self-contained and suitable for graduate students, this book is also designed for bioinformaticians, computational biologists, and researchers in communities ranging from genomics, big data, molecular genetics, data mining, biostatistics, biomedical science, cancer research, medical research, and biology to machine learning and computer science. Readers will find this volume to be an essential read for appreciating the role of big data in genomics, making this an invaluable resource for stimulating further research on the topic.


Big Data in Omics and Imaging

Author: Momiao Xiong

Publisher: CRC Press

Published: 2018-06-14

Total Pages: 580

ISBN-13: 135117262X

Big Data in Omics and Imaging: Integrated Analysis and Causal Inference addresses recent developments in integrated genomic, epigenomic and imaging data analysis and causal inference in the big data era. Despite significant progress in dissecting the genetic architecture of complex diseases by genome-wide association studies (GWAS), genome-wide expression studies (GWES), and epigenome-wide association studies (EWAS), the overall contribution of the newly identified genetic variants is small and a large fraction of genetic variants is still hidden. Understanding the etiology and causal chain of mechanisms underlying complex diseases remains elusive. It is time to bring big data, machine learning and the causal revolution to bear on developing a new generation of genetic analysis, shifting the current paradigm from shallow association analysis to deep causal inference and from genetic analysis alone to integrated omics and imaging data analysis, in order to unravel the mechanisms of complex diseases.

FEATURES

- Provides a natural extension and companion volume to Big Data in Omics and Imaging: Association Analysis, but can be read independently
- Introduces causal inference theory to genomic, epigenomic and imaging data analysis
- Develops novel statistics for genome-wide causation studies and epigenome-wide causation studies
- Bridges the gap between traditional association analysis and modern causation analysis
- Uses combinatorial optimization methods and various causal models as a general framework for inferring multilevel omic and image causal networks
- Presents statistical methods and computational algorithms for searching causal paths from genetic variant to disease
- Develops causal machine learning methods integrating causal inference and machine learning
- Develops statistics for testing significant differences in directed edges, paths, and graphs, and for assessing causal relationships between two networks

The book is designed for graduate students and researchers in genomics, epigenomics, medical imaging, bioinformatics, and data science. Topics covered are: mathematical formulation of causal inference, information geometry for causal inference, topology group and Haar measure, additive noise models, distance correlation, multivariate causal inference and causal networks, dynamic causal networks, multivariate and functional structural equation models, mixed structural equation models, causal inference with confounders, integer programming, deep learning and differential equations for wearable computing, genetic analysis of function-valued traits, RNA-seq data analysis, causal networks for genetic methylation analysis, gene expression and methylation deconvolution, cell-specific causal networks, deep learning for image segmentation and image analysis, imaging and genomic data analysis, and integrated multilevel causal genomic, epigenomic and imaging data analysis.


Biologically Interpretable Machine Learning Methods to Understand Gene Regulation for Disease Phenotypes

Author: Ting Jin

Published: 2023

Gene expression and regulation is a key molecular mechanism driving the development of human diseases, particularly at the cell-type level, but it remains elusive. For example, in many brain diseases, such as Alzheimer's disease (AD), understanding how cell-type gene expression and regulation change across multiple stages of AD progression is still challenging. Moreover, interindividual variability of gene expression and regulation is a known characteristic of the human brain and brain diseases. However, it is still unclear how interindividual variability affects personalized gene regulation in brain diseases including AD, thereby contributing to their heterogeneity. Recent technological advances have enabled the detection of gene regulation activities through multi-omics (i.e., genomics, transcriptomics, epigenomics, proteomics). In particular, emerging single-cell sequencing technologies (e.g., scRNA-seq, scATAC-seq) allow us to study functional genomics and gene regulation at the cell-type level. Moreover, these multi-omics data of populations (e.g., human individuals) provide a unique opportunity to study the underlying regulatory mechanisms occurring in brain disease progression and clinical phenotypes. For instance, PsychAD is a large project generating single-cell multi-omics data, covering many neuronal and glial cell types, from over 1,000 individuals, with the aim of understanding the molecular mechanisms of neuropsychiatric symptoms of multiple brain diseases (e.g., AD, SCZ, ASD, Bipolar). However, analyzing and integrating large-scale multi-omics data at the population level, as well as understanding the mechanisms of gene regulation, remains a challenge. Machine learning is a powerful and emerging tool to decode the unique complexities and heterogeneity of human diseases. For instance, Beebe-Wang, Nicosia, et al. developed MD-AD, a multi-task neural network model to predict various disease phenotypes in AD patients using RNA-seq.
Additionally, advances in graph neural networks have enhanced the ability to represent sophisticated gene network structures, such as the gene regulatory networks that control gene expression. Efforts have also been made to capture the gene regulation heterogeneity of brain diseases. For instance, Kim SY applied graph convolutional networks to offer personalized diagnostic insights through population graphs that correspond with disease progression. However, many existing machine learning methods are limited to constructing accurate models for disease phenotype prediction and frequently lack biological interpretability or personalized insights, especially in gene regulation. To address these challenges, my Ph.D. work developed three machine learning methods designed to decode the gene regulation mechanisms of human diseases. First, in this dissertation, I present scGRNom, a computational pipeline that integrates multi-omic data to construct cell-type gene regulatory networks (GRNs) linking non-coding regulatory elements. Second, I introduce i-BrainMap, an interpretable knowledge-guided graph neural network model to prioritize personalized cell-type disease genes, regulatory linkages, and modules. Third, I introduce ECMaker, a semi-restricted Boltzmann machine (semi-RBM) method for identifying gene networks to predict diseases and clinical phenotypes. Overall, these interpretable machine learning models improve phenotype prediction, prioritize key genes and networks associated with disease phenotypes, and aim to enhance our understanding of the gene regulatory mechanisms driving disease progression and clinical phenotypes.