Data-driven Exploration of Mouse Brain Transcriptome
Author: Yujie Li
Publisher:
Published: 2018
Total Pages: 218
ISBN-13:
The mammalian brain is the most complex organ. Modern genetics has shown that the complexity of brain structures and functions is ultimately encoded in the genome. Because the transcriptome is the primary functional readout of the genome, a systematic study of it promises to illuminate how structures and functions are supported at the molecular scale. Rapid advances in genomic information and in the throughput of technologies allow large-scale surveys of the transcriptome. The technique of in situ hybridization offers direct visualization of gene expression at cellular resolution. The spatial correlation among genes is closely associated with the different phenotypes of anatomic regions. In addition, the correlations among transcripts allow us to investigate how sets of genes act together to control biological processes. However, deriving genetic-neuroanatomic correlations from high-dimensional transcriptome data in an unbiased way remains challenging. This thesis focuses on developing methods to connect genetics to neuroanatomy. To answer whether gene expression patterns can refine the architecture of the brain, I propose dictionary learning and sparse coding (DLSC) as a tool because it accounts for the sparse structure of gene expression. Voxels with similar coexpression patterns form tight clusters. Many clusters correspond well to known neuroanatomy, while others reveal a finer delineation of regions previously considered homogeneous. Regionalized expression in fiber tracts and the ventricular system is reported for the first time. DLSC also proves effective in grouping genes into gene coexpression networks (GCNs). The GCNs are crucial to understanding how genes act jointly in defining the anatomy of the brain. Gene ontology analysis and comparisons with curated lists of genes with known functions confirm the functional roles of these networks. One standing issue for the work described above is incomplete data.
To address the problem, I designed a volume completion network accompanied by a customized training scheme. The network successfully completes a large missing region on a slice as well as one or two consecutive missing slices. On the completed data, I apply a probabilistic model, the restricted Boltzmann machine, and its extension, the deep belief network, to construct a hierarchical transcriptome anatomy. A fine-to-coarse organization emerges from the network layers, providing a multi-resolution transcriptome architecture.
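
The DLSC idea described in the abstract can be illustrated with a small sketch. This is not the thesis implementation: it is a minimal NumPy example under stated assumptions. The data are synthetic (60 voxels, 20 genes, three made-up "regions"), the sparse codes are computed with ISTA, the dictionary is updated by least squares, and each voxel is assigned to the atom with the largest coefficient. All function names, the voxels-by-genes matrix layout, and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    # Proximal operator of the L1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def sparse_code(X, D, lam, n_iter=100):
    # ISTA for min_A 0.5 * ||X - A D||_F^2 + lam * ||A||_1
    # X: (n_voxels, n_genes), D: (n_atoms, n_genes), A: (n_voxels, n_atoms)
    step = np.linalg.norm(D @ D.T, 2) + 1e-12  # Lipschitz constant of the gradient
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T
        A = soft_threshold(A - grad / step, lam / step)
    return A

def init_atoms(X, n_atoms):
    # Deterministic farthest-point initialization (an assumption of this
    # sketch): start from voxel 0, then repeatedly add the voxel whose
    # profile is least similar to the atoms chosen so far.
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + 1e-12)
    chosen = [0]
    while len(chosen) < n_atoms:
        sim = np.abs(Xn @ Xn[chosen].T).max(axis=1)
        chosen.append(int(np.argmin(sim)))
    return Xn[chosen].copy()

def learn_dictionary(X, n_atoms, lam=0.1, n_outer=20):
    # Alternate between sparse coding and a least-squares dictionary update.
    D = init_atoms(X, n_atoms)
    for _ in range(n_outer):
        A = sparse_code(X, D, lam)
        D = np.linalg.lstsq(A, X, rcond=None)[0]
        D /= np.linalg.norm(D, axis=1, keepdims=True) + 1e-12  # unit-norm atoms
    return D, sparse_code(X, D, lam)

# Synthetic data: three regions, each expressing its own block of genes.
rng = np.random.default_rng(0)
n_genes = 20
signatures = np.zeros((3, n_genes))
signatures[0, :7] = 1.0
signatures[1, 7:14] = 1.0
signatures[2, 14:] = 1.0
true_region = np.repeat(np.arange(3), 20)
X = signatures[true_region] + 0.05 * rng.standard_normal((60, n_genes))

D, A = learn_dictionary(X, n_atoms=3, lam=0.1)

# Cluster voxels by their dominant dictionary atom.
labels = np.argmax(np.abs(A), axis=1)
print(labels.reshape(3, 20))
```

In this toy setting each row of `D` plays the role of a coexpression pattern over genes, and the rows of `A` say which patterns each voxel uses. On real expression volumes the number of atoms, the sparsity weight `lam`, and the clustering rule would all need to be chosen with care.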