Algorithms in Bioinformatics: Theory and Implementation

by Paul A. Gagniuc
  • ISBN13: 9781119697961
  • ISBN10: 1119697964

  • Edition: 1st
  • Format: Hardcover
  • Copyright: 2021-08-10
  • Publisher: Wiley
List Price: $155.68

Summary

ALGORITHMS IN BIOINFORMATICS

Explore a comprehensive and insightful treatment of the practical application of bioinformatic algorithms in a variety of fields

Algorithms in Bioinformatics: Theory and Implementation delivers a thorough treatment of some of the main algorithms used to explain biological functions and relationships. It introduces readers to the art of algorithms in a practical manner that is tied to biological theory and interpretation. The book covers many key areas of bioinformatics, including global and local sequence alignment, forced alignment, detection of motifs, sequence logos, Markov chains, and information entropy. It also describes novel approaches, such as Self-Sequence alignment, Objective Digital Stains (ODSs), Spectral Forecast, and the Discrete Probability Detector (DPD) algorithm.

The text uses graphical illustrations to highlight the technical details of the computational algorithms it covers, furthering the reader’s understanding and retention of the material. Throughout, the book is written in an accessible and practical manner, showing how algorithms can be implemented and used in JavaScript in web browsers. The author has included more than 120 open-source implementations of the material, as well as 33 ready-to-use presentations. The book contains original material that has been class-tested by the author, and numerous cases are examined in a biological and medical context. Readers will also benefit from the inclusion of:

  • A thorough introduction to biological evolution, including the emergence of life, classifications and some known theories and molecular mechanisms
  • A detailed presentation of new methods, such as Self-Sequence alignment, Objective Digital Stains and Spectral Forecast
  • A treatment of sequence alignment, including local sequence alignment, global sequence alignment and forced sequence alignment with full implementations
  • Discussions of position-specific weight matrices, including the count, weight, relative frequency, and log-likelihood matrices
  • A detailed presentation of the methods related to Markov chains as well as a description of their implementation in bioinformatics and adjacent fields
  • An examination of information and entropy, including sequence logos and explanations related to their meaning
  • An exploration of the current state of bioinformatics, including what is known and what issues are usually avoided in the field
  • A chapter on philosophical transactions that allows the reader a broader view of the prediction process
  • Native computer implementations in the context of bioinformatics
  • Extensive worked examples with detailed case studies that point out the meaning of different results

Perfect for professionals and researchers in biology, medicine, engineering, and information technology, as well as upper-level undergraduate students in these fields, Algorithms in Bioinformatics: Theory and Implementation will also earn a place in the libraries of software engineers who wish to understand how to implement bioinformatic algorithms in their products.
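
As an illustration of the position-specific matrices mentioned in the summary (count, relative frequency, and log-likelihood), the following is a minimal Python sketch. It is not taken from the book, whose implementations are in JavaScript. The function names, the example binding sites, the pseudocount of 1, and the uniform background model are all assumptions made for this example.

```python
import math


def count_matrix(sites, alphabet="ACGT"):
    """Count each letter at each position of equal-length sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    counts = {letter: [0] * length for letter in alphabet}
    for site in sites:
        for pos, letter in enumerate(site):
            counts[letter][pos] += 1  # KeyError if letter is outside the alphabet
    return counts


def frequency_matrix(counts, pseudocount=1.0):
    """Turn counts into relative frequencies (pseudocount is an assumed choice)."""
    letters = list(counts)
    length = len(counts[letters[0]])
    freqs = {letter: [0.0] * length for letter in letters}
    for pos in range(length):
        total = sum(counts[l][pos] for l in letters) + pseudocount * len(letters)
        for l in letters:
            freqs[l][pos] = (counts[l][pos] + pseudocount) / total
    return freqs


def log_likelihood_matrix(freqs, background=None):
    """Log2 ratio of each frequency to a background (uniform if not given)."""
    letters = list(freqs)
    if background is None:
        background = {l: 1.0 / len(letters) for l in letters}
    return {l: [math.log2(p / background[l]) for p in freqs[l]] for l in letters}


def score(window, weights):
    """Sum the per-position weights for a window as long as the motif."""
    return sum(weights[letter][pos] for pos, letter in enumerate(window))


# Hypothetical aligned binding sites, invented for illustration only.
sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
counts = count_matrix(sites)
freqs = frequency_matrix(counts)
weights = log_likelihood_matrix(freqs)

print(score("ACGT", weights))  # a window that resembles the motif
print(score("TTTT", weights))  # a window that mostly does not
```

Under these assumptions, a window that resembles the example sites receives a higher score than one that does not. The pseudocount avoids taking the logarithm of zero for letters never observed at a position.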

Author Biography

Paul A. Gagniuc, PhD, is an Associate Professor of Bioinformatics and a Professor of Programming Languages at University Politehnica of Bucharest in Romania. He obtained his doctorate in Genetics at the University of Bucharest. Dr. Gagniuc is also an Academic Editor at PLoS ONE and a proactive reviewer for several well-known scientific journals. He has published numerous high-profile scientific articles and is the recipient of several awards for exceptional scientific results.

Table of Contents

1 The tree of life (I) 12

1.1 Introduction 12

1.2 Emergence of life 12

1.2.1 Timeline disagreements 14

1.3 Classifications and mechanisms 14

1.4 Chromatin structure 16

1.5 Molecular mechanisms 18

1.5.1 Precursor messenger RNA 18

1.5.2 Precursor messenger RNA to messenger RNA 19

1.5.3 Classes of introns 19

1.5.4 Messenger RNA 20

1.5.5 mRNA to proteins 20

1.5.6 Transfer RNA 21

1.5.7 Small RNA 22

1.5.8 The transcriptome 22

1.5.9 Gene networks and information processing 23

1.5.10 Eukaryotic vs prokaryotic regulation 23

1.5.11 What is life? 24

1.6 Known species 24

1.7 Approaches for compartmentalization 24

1.7.1 Two main approaches for organism formation 25

1.7.2 Size and metabolism 25

1.8 Sizes in eukaryotes 26

1.8.1 Sizes in unicellular eukaryotes 26

1.8.2 Sizes in multicellular eukaryotes 26

1.9 Sizes in prokaryotes 26

1.10 Virus sizes 27

1.10.1 Viruses vs the spark of metabolism 28

1.11 The diffusion coefficient 28

1.12 The origins of eukaryotic cells 29

1.12.1 Endosymbiosis theory 29

1.12.2 DNA and organelles 30

1.12.3 Membrane-bound organelles with DNA 30

1.12.4 Membrane-bound organelles without DNA 31

1.12.5 Control and division of organelles 31

1.12.6 The horizontal gene transfer 32

1.12.7 On the mechanisms of horizontal gene transfer 33

1.13 Origins of eukaryotic multicellularity 33

1.13.1 Colonies inside an early unicellular common ancestor 33

1.13.2 Colonies of early unicellular common ancestors 34

1.13.3 Colonies of inseparable early unicellular common ancestors 35

1.13.4 Chimerism and mosaicism 35

1.14 Conclusions 36

2 Tree of life: genomes (II) 37

2.1 Introduction 37

2.2 Rules of engagement 37

2.3 Genome sizes in the tree of life 38

2.3.1 Alternative methods 38

2.3.2 The weaving of scales 39

2.3.3 Computations on the average genome size 42

2.3.4 Observations on data 43

2.4 Organellar genomes 44

2.4.1 Chloroplasts 45

2.4.2 Apicoplasts 46

2.4.3 Chromatophores 46

2.4.4 Cyanelles 46

2.4.5 Kinetoplasts 46

2.4.6 Mitochondria 47

2.5 Plasmids 47

2.6 Virus genomes 48

2.7 Viroids and their implications 49

2.8 Genes vs. proteins in the tree of life 50

2.9 Conclusions 52

3 Sequence alignment (I) 53

3.1 Introduction 53

3.2 Style and visualization 53

3.3 Initialization of the score matrix 55

3.4 Calculation of scores 58

3.4.1 Initialization of the score matrix for global alignment 59

3.4.2 Initialization of the score matrix for local alignment 62

3.4.3 Optimization of the initialization steps 65

3.4.4 Curiosities 66

3.5 Traceback 70

3.6 Global alignment 73

3.7 Local alignment 77

3.8 Alignment layout 81

3.9 Local sequence alignment - the final version 84

3.10 Complementarity 88

3.11 Conclusions 93

4 Forced alignment (II) 94

4.1 Introduction 94

4.2 Global and Local sequence alignment 95

4.2.1 Short notes 95

4.2.2 Understanding the technology 96

4.2.3 Main objectives 97

4.3 Experiments and discussions 97

4.3.1 Alignment layout 97

4.3.2 Forced alignment regime 97

4.3.3 Alignment scores and significance 98

4.3.4 Optimal alignments 100

4.3.5 The main significance scores 100

4.3.6 Significance vs Chance 103

4.3.7 Sequence quality and the score matrix 104

4.3.8 Optimal alignments by numbers 106

4.3.9 Chaos theory on sequence alignment 107

4.3.10 Image encoding possibilities 108

4.4 Advanced features and methods 108

4.4.1 Sequence detector 109

4.4.2 Parameters 109

4.4.3 Heatmap 109

4.4.4 Text visualization 113

4.4.5 Graphics for manuscript figures and didactic presentations 114

4.4.6 Dynamics 114

4.4.7 Independence 114

4.4.8 Limits 114

4.4.9 Local Storage 116

4.5 Conclusions 118

5 Self-Sequence Alignment (I) 119

5.1 Introduction 119

5.2 True randomness 120

5.3 Information and compression algorithms 120

5.4 White noise and biological sequences 120

5.5 The mathematical model 121

5.5.1 A concrete example 122

5.5.2 Model dissection 123

5.5.3 Conditions for maxima and minima 125

5.6 Noise vs Redundancy 126

5.7 Global and local information content 126

5.8 Signal sensitivity 127

5.9 Implementation 128

5.9.1 Global Self-Sequence Alignment 128

5.9.2 Local Self-Sequence Alignment 131

5.10 A complete scanner for information content 134

5.11 Conclusions 135

6 Frequencies and percentages (II) 136

6.1 Introduction 136

6.2 Base composition 137

6.3 Percentage of nucleotide combinations 137

6.4 Implementation 138

6.5 A frequency scanner 140

6.6 Examples of known significance 142

6.7 Observation vs expectation 144

6.8 A frequency scanner with a threshold 144

6.9 Conclusions 146

7 Objective digital stains (III) 148

7.1 Introduction 148

7.2 Information and frequency 149

7.3 The objective digital stain 152

7.3.1 A 3D representation over a 2D plane 156

7.3.2 ODSs relative to the background 159

7.4 Interpretation of ODSs 163

7.5 The significance of the areas in the ODS 165

7.6 Discussions 166

7.6.1 A similarity between dissimilar sequences 168

7.7 Conclusions 168

8 Detection of motifs (I) 169

8.1 Introduction 169

8.2 DNA motifs 169

8.2.1 DNA-binding proteins vs motifs and degeneracy 170

8.2.2 Concrete examples of DNA motifs 170

8.3 Major functions of DNA motifs 171

8.3.1 RNA splicing and DNA motifs 172

8.4 Conclusions 174

9 Representation of motifs (II) 175

9.1 Introduction 175

9.2 The training data 175

9.3 A visualization function 176

9.4 The Alignment Matrix 177

9.5 Alphabet detection 179

9.6 The Position-Specific Scoring Matrix (PSSM) initialization 182

9.7 The Position Frequency Matrix (PFM) 183

9.8 The Position Probability Matrix (PPM) 184

9.8.1 A kind of PPM pseudo scanner 185

9.9 The Position Weight Matrix (PWM) 187

9.10 The background model 190

9.11 The Consensus sequence 192

9.11.1 The consensus - not necessarily functional 194

9.12 Mutational intolerance 195

9.13 From motifs to PWMs 196

9.14 Pseudo counts and negative infinity 200

9.15 Conclusions 203

10 The motif scanner (III) 204

10.1 Introduction 204

10.2 Looking for signals 205

10.3 A functional scanner 208

10.4 The meaning of scores 211

10.4.1 A score value above zero 211

10.4.2 A score value below zero 212

10.4.3 A score value of zero 213

10.5 Conclusions 213

11 Understanding the parameters (IV) 215

11.1 Introduction 215

11.2 Experimentation 215

11.2.1 A scanner implementation based on pseudo counts 216

11.2.2 A scanner implementation based on propagation of zero counts 218

11.3 Signal discrimination 220

11.4 False positive results 221

11.5 Sensitivity adjustments 222

11.6 Beyond Bioinformatics 223

11.7 A scanner that uses a known PWM 224

11.8 Signal thresholds 227

11.8.1 Implementation and filter testing 228

11.9 Conclusions 231

12 Dynamic Backgrounds (V) 232

12.1 Introduction 232

12.2 Towards a scanner with two PFMs 232

12.2.1 The implementation of dynamic PWMs 234

12.2.2 Issues and corrections for dynamic PWMs 239

12.2.3 Solutions for aberrant positive likelihood values 242

12.3 A scanner with two PFMs 246

12.4 Information and background frequencies on score values 249

12.5 Dynamic background vs null model 251

12.6 Conclusions 251

13 Markov chains: the machine (I)

14 Markov chains: log likelihood (II)

References 253