From the basics of performing sequence alignment to a proficient understanding of how most industry-standard alignment algorithms achieve their results, Multiple Sequence Alignment Methods describes numerous algorithms and their nuances in chapters written by the experts who developed them. The various multiple sequence alignment algorithms presented in this handbook give a flavor of the broad range of choices available for multiple sequence alignment generation, and their diversity is a clear reflection of the complexity of the multiple sequence alignment problem and the amount of information that can be obtained from multiple sequence alignments. Each of these chapters not only describes the algorithm it covers but also presents instructions and tips on using its implementation, as is fitting with its inclusion in the highly successful Methods in Molecular Biology series. Authoritative and practical, Multiple Sequence Alignment Methods provides a readily available resource which will allow practitioners to experiment with different algorithms and find the particular algorithm that is of most use in their application.
The sequencing of the human genome involved thousands of scientists but used relatively few tools. Today, obtaining sequences is simpler, but aligning the sequences—making sure that sequences from one source are properly compared to those from other sources—remains a complicated but underappreciated aspect of comparative molecular biology. This volume, the first to focus on this crucial step in analyzing sequence data, is about the practice of alignment, the procedures by which alignments are established, and more importantly, how the outcomes of any alignment algorithm should be interpreted. Edited by Michael S. Rosenberg with essays by many of the field's leading experts, Sequence Alignment covers molecular causes, computational advances, approaches for assessing alignment quality, and philosophical underpinnings of the algorithms themselves.
Covers the fundamentals and techniques of multiple biological sequence alignment and analysis, and shows readers how to choose the appropriate sequence analysis tools for their tasks. This book describes the traditional and modern approaches in biological sequence alignment and homology search. This book contains 11 chapters, with Chapter 1 providing basic information on biological sequences. Next, Chapter 2 contains fundamentals in pair-wise sequence alignment, while Chapters 3 and 4 examine popular existing quantitative models and practical clustering techniques that have been used in multiple sequence alignment. Chapter 5 describes, characterizes and relates many multiple sequence alignment models. Chapter 6 describes how phylogenetic trees have traditionally been constructed and how available sequence knowledge bases can be used to improve the accuracy of reconstructing phylogenetic trees. Chapter 7 covers the latest methods developed to improve the run-time efficiency of multiple sequence alignment. Next, Chapter 8 covers several popular existing multiple sequence alignment servers and services, and Chapter 9 examines several multiple sequence alignment techniques that have been developed to handle short sequences (reads) produced by next-generation sequencing (NGS). Chapter 10 describes a bioinformatics application using multiple sequence alignment of short reads or whole genomes as input. Lastly, Chapter 11 provides a review of RNA and protein secondary structure prediction using the evolutionary information inferred from multiple sequence alignments.
• Covers the full spectrum of the field, from alignment algorithms to scoring methods, practical techniques, and alignment tools and their evaluations
• Describes theories and developments of scoring functions and scoring matrices
• Examines phylogeny estimation and large-scale homology search
Multiple Biological Sequence Alignment: Scoring Functions, Algorithms and Applications is a reference for researchers, engineers, graduate and post-graduate students in bioinformatics and systems biology, and molecular biologists.
Ken Nguyen, PhD, is an associate professor at Clayton State University, GA, USA. He received his PhD, MSc and BSc degrees in computer science, all from Georgia State University. His research interests are in databases, parallel and distributed computing, and bioinformatics. He was a Molecular Basis of Disease fellow at Georgia State and is the recipient of the highest graduate honor at Georgia State, the William M. Suttles Graduate Fellowship.
Xuan Guo, PhD, is a postdoctoral associate at Oak Ridge National Lab, USA. He received his PhD degree in computer science from Georgia State University in 2015. His research interests are in bioinformatics, machine learning, and cloud computing. He is an editorial assistant of International Journal of Bioinformatics Research and Applications.
Yi Pan, PhD, is a Regents' Professor of Computer Science and an Interim Associate Dean and Chair of Biology at Georgia State University. He received his BE and ME in computer engineering from Tsinghua University in China and his PhD in computer science from the University of Pittsburgh. Dr. Pan's research interests include parallel and distributed computing, optical networks, wireless networks and bioinformatics. He has published more than 180 journal papers with about 60 papers published in various IEEE/ACM journals. He is co-editor along with Albert Y. Zomaya of the Wiley Series in Bioinformatics.
This book offers comprehensive coverage of all the core topics of bioinformatics, and includes practical examples completed using the MATLAB Bioinformatics Toolbox™. It is primarily intended as a textbook for engineering and computer science students attending advanced undergraduate and graduate courses in bioinformatics and computational biology. The book develops bioinformatics concepts from the ground up, starting with an introductory chapter on molecular biology and genetics. This chapter will enable physical science students to fully understand and appreciate the ultimate goals of applying the principles of information technology to challenges in biological data management, sequence analysis, and systems biology. The first part of the book also includes a survey of existing biological databases, tools that have become essential in today’s biotechnology research. The second part of the book covers methodologies for retrieving biological information, including fundamental algorithms for sequence comparison, scoring, and determining evolutionary distance. The main focus of the third part is on modeling biological sequences and patterns as Markov chains. It presents key principles for analyzing and searching for sequences of significant motifs and biomarkers. The last part of the book, dedicated to systems biology, covers phylogenetic analysis and evolutionary tree computations, as well as gene expression analysis with microarrays. In brief, the book offers the ideal hands-on reference guide to the field of bioinformatics and computational biology.
Probabilistic models are becoming increasingly important in analysing the huge amount of data being produced by large-scale DNA-sequencing efforts such as the Human Genome Project. For example, hidden Markov models are used for analysing biological sequences, linguistic-grammar-based probabilistic models for identifying RNA secondary structure, and probabilistic evolutionary models for inferring phylogenies of sequences from different organisms. This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods and, more generally, of probabilistic methods of sequence analysis. Written by an interdisciplinary team of authors, it aims to be accessible to molecular biologists, computer scientists, and mathematicians with no formal knowledge of the other fields, and at the same time to present the state of the art in this new and highly important field.
This book develops a new approach called parameter advising for finding a parameter setting for a sequence aligner that yields a quality alignment of a given set of input sequences. In this framework, a parameter advisor is a procedure that automatically chooses a parameter setting for the input, and has two main ingredients: (a) the set of parameter choices considered by the advisor, and (b) an estimator of alignment accuracy used to rank alignments produced by the aligner. On coupling a parameter advisor with an aligner, once the advisor is trained in a learning phase, the user simply inputs sequences to align, and receives an output alignment from the aligner, where the advisor has automatically selected the parameter setting. The chapters first lay out the foundations of parameter advising, and then cover applications and extensions of advising. The content • examines formulations of parameter advising and their computational complexity, • develops methods for learning good accuracy estimators, • presents approximation algorithms for finding good sets of parameter choices, and • assesses software implementations of advising that perform well on real biological data. Also explored are applications of parameter advising to • adaptive local realignment, where advising is performed on local regions of the sequences to automatically adapt to varying mutation rates, and • ensemble alignment, where advising is applied to an ensemble of aligners to effectively yield a new aligner of higher quality than the individual aligners in the ensemble. The book concludes by offering future directions in advising research.
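The advising loop described above can be illustrated with a minimal sketch. This is not the book's implementation: the names `advise`, `toy_aligner`, and `toy_estimator` are hypothetical, the "aligner" merely pads sequences with gaps on one side, and the "estimator" is a hand-written column-conservation score rather than a learned accuracy estimator. The sketch only shows the two ingredients working together: a set of parameter choices, and an estimator used to rank the alignments each choice produces.

```python
from typing import Callable, Iterable, List, Tuple

Alignment = List[str]


def advise(
    sequences: List[str],
    parameter_choices: Iterable[str],
    aligner: Callable[[List[str], str], Alignment],
    estimator: Callable[[Alignment], float],
) -> Tuple[str, Alignment, float]:
    """Run the aligner once per parameter choice and keep the alignment
    that the accuracy estimator ranks highest."""
    best = None
    for params in parameter_choices:
        alignment = aligner(sequences, params)
        score = estimator(alignment)
        if best is None or score > best[2]:
            best = (params, alignment, score)
    if best is None:
        raise ValueError("parameter_choices must not be empty")
    return best


def toy_aligner(sequences: List[str], pad_side: str) -> Alignment:
    """Hypothetical stand-in for a real aligner: pads every sequence with
    gaps on the left or the right until all have the same length."""
    width = max(len(s) for s in sequences)
    if pad_side == "left":
        return [s.rjust(width, "-") for s in sequences]
    return [s.ljust(width, "-") for s in sequences]


def toy_estimator(alignment: Alignment) -> float:
    """Hypothetical stand-in for a learned estimator: the fraction of
    columns that are gap-free and contain a single residue."""
    columns = list(zip(*alignment))
    if not columns:
        return 0.0
    conserved = sum(
        1 for col in columns if "-" not in col and len(set(col)) == 1
    )
    return conserved / len(columns)


if __name__ == "__main__":
    params, alignment, score = advise(
        ["ACGT", "CGT"], ["right", "left"], toy_aligner, toy_estimator
    )
    print(params, alignment, score)
```

In a realistic setting the parameter choices would be settings such as substitution matrices and gap penalties, and the estimator would be trained in the learning phase the blurb mentions; the selection loop itself would keep the same shape.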
This book is a practical guide for biologists who use multiple sequence alignments (MSAs) for their data analysis and are looking for a comprehensive overview of the many different programs. Despite their important role in data analysis, there is uncertainty among researchers about exactly how MSA programs work, not to mention how and why the different analyses lead to different results. Which program is the right one for evaluating my data and how can I ensure that I have drawn all relevant findings from the alignments? This book offers helpful explanations and background information without requiring extensive bioinformatics knowledge and slowly introduces the reader to the topic. In the first part of the book, the possible fields of application as well as the formats that are usually produced by MSA programs are described in detail. The central algorithms as well as the internal processes of the most common MSA programs of the past and the present are also explained in greater detail and in an uncomplicated manner. The second part of the book is a detailed, data-based comparison between MSA programs, which is intended to help you decide which program to use for your next alignment.
Biomolecular sequence comparison is the origin of bioinformatics. This book gives a complete in-depth treatment of the study of sequence comparison. A comprehensive introduction is followed by a focus on alignment algorithms and techniques, and then by a discussion of the theory. The book examines alignment methods and techniques, features a new issue of sequence comparison (the spaced seed technique), addresses several new flexible strategies for coping with various scoring schemes, and covers the theory on the significance of high-scoring segment pairs between two unaligned sequences. Useful appendices on basic concepts in molecular biology, a primer in statistics, and software for sequence alignment are included in this reader-friendly text, as well as chapter-ending exercises and research questions. A state-of-the-art study of sequence alignment and homology search, this is an ideal reference for advanced students studying bioinformatics and will appeal to biologists who wish to know how to use homology search tools.
This book constitutes the refereed proceedings of the 25th International Conference on Parallel Computational Fluid Dynamics, ParCFD 2013, held in Changsha, China, in May 2013. The 35 revised full papers presented were carefully reviewed and selected from more than 240 submissions. The papers address issues such as parallel algorithms, developments in software tools and environments, unstructured adaptive mesh applications, industrial applications, atmospheric and oceanic global simulation, interdisciplinary applications and evaluation of computer architectures and software environments.