Bioinformatics

by Andrzej Polanski; Marek Kimmel
  • ISBN13: 9783540241669
  • ISBN10: 3540241663

  • Edition: 1st
  • Format: Hardcover
  • Copyright: 2007-06-03
  • Publisher: Springer-Nature New York Inc
List Price: $109.99

Summary

Bioinformatics is a discipline that originated from the need to bring order to the massive datasets produced by new technologies in molecular biology: large-scale DNA sequencing, measurements of RNA concentrations in multiple gene expression arrays, and new profiling techniques in proteomics. This volume presents mathematical models in bioinformatics, along with descriptions of the motivating biological problems and the computer science tools needed to cope with the data. The material covers widely used applications in genomics and proteomics as well as later-generation techniques, which attempt to link genetic information with the structure and function of molecules, metabolic processes and whole cells.

Author Biography

Andrzej Polanski is a Professor at the Silesian University of Technology. Prior to this, he worked as a Postdoctoral Fellow at the University of Texas, Human Genetics Center, Houston, USA (1996-1997) and as a Visiting Professor at Rice University, Houston, USA (2001-2003). His research interests are in bioinformatics, biomedical modeling and control, modern control and optimization theory.

Marek Kimmel, Ph.D., is a Professor of Statistics at Rice University in Houston, TX; a Professor in the Department of Automatic Control at the Silesian University of Technology in Gliwice, Poland; a Professor of Biostatistics and Applied Mathematics (adj.) at M.D. Anderson Cancer Center in Houston; and a Professor of Biometry (adj.) at the School of Public Health of the University of Texas in Houston. He heads the Rice Bioinformatics Group as well as the doctoral program in Statistical Genetics and Bioinformatics. Dr. Kimmel is a Fellow of the American Statistical Association. His principal interests are stochastic modeling of human disease (in particular lung cancer progression and screening), statistical and population genetics, biostatistics and bioinformatics.

Table of Contents

Introduction
The Genesis of Bioinformatics
Bioinformatics Versus Other Disciplines
Further Developments: from Linear Information to Multidimensional Structure Organization
Mathematical and Computational Methods
Why Mathematical Modeling?
Fitting Models to Data
Computer Software
Applications
Mathematical and Computational Methods
Probability and Statistics
The Rules of Probability Calculus
Independence, Conditional Probabilities and Bayes' Rules
Random Variables
Vector Random Variables
Marginal Distributions
Operations on Random Variables
Notation
Expectation and Moments of Random Variables
Probability-Generating Functions and Characteristic Functions
A Collection of Discrete and Continuous Distributions
Bernoulli Trials and the Binomial Distribution
The Geometric Distribution
The Negative Binomial Distribution
The Poisson Distribution
The Multinomial Distribution
The Hypergeometric Distribution
The Normal (Gaussian) Distribution
The Exponential Distribution
The Gamma Distribution
The Beta Distribution
Likelihood maximization
Binomial Distribution
Multinomial distribution
Poisson Distribution
Geometric Distribution
Normal Distribution
Exponential Distribution
Other Methods of Estimating Parameters: a Comparison
Example 1. Uniform Distribution
Example 2. Cauchy Distribution
Minimum Variance Parameter Estimation
The Expectation Maximization Method
The Derivations of the Algorithm
Examples of Recursive Estimation of Parameters by Using the EM Algorithm
Statistical Tests
The Idea
Parametric Tests
Nonparametric Tests
Type I and II statistical errors
Markov Chains
Transition Probability Matrix and State Transition Graph
Time Evolution of Probability Distributions of States
Classification of States
Ergodicity
Stationary Distribution
Reversible Markov Chains
Time-Continuous Markov Chains
Markov Chain Monte Carlo (MCMC) Methods
Acceptance-Rejection Rule
Applications of the Metropolis-Hastings Algorithm
Simulated Annealing and MC3
Hidden Markov Models
Probability of Occurrence of a Sequence of Symbols
Backward Algorithm
Forward Algorithm
Viterbi Algorithm
The Baum-Welch algorithm
Exercises
Computer Science Algorithms
Algorithms
Sorting and Quicksort
Simple Sort
Quicksort
String Searches. Fast Search
Easy Search
Fast Search
Index Structures for Strings. Search Tries. Suffix Trees
A Treelike Structure in Computer Memory
Search Tries
Compact Search Tries
Suffix Tries and Suffix Trees
Suffix Arrays
Algorithms for Searching Tries
Building Tries
Remarks on the Efficiency of the Algorithms
The Burrows-Wheeler Transform
Inverse transform
BW Transform as a Compression Tool
BW Transform as a Search Tool for Patterns
BW Transform as an Associative, Compressed Memory
Computational Complexity of BW Transform
Hashing
Hashing functions for addressing variables
Collisions
Statistics of Memory Access Time with Hashing
Inquiring About Repetitive Structure of Sequences, Comparing Sequences and Detecting Sequence Overlap by Hashing
Exercises
Pattern Analysis
Feature Extraction
Classification
Linear Classifiers
Linear Classifier Functions and Artificial Neurons
Artificial Neural Networks
Support Vector Machines
Clustering
K-means Clustering
Hierarchical Clustering
Dimensionality Reduction, Principal Component Analysis
Singular-Value Decomposition (SVD)
Geometric Interpretation of SVD
Partial-Least-Squares (PLS) Method
Parametric Transformations
Hough Transform
Generalized Hough Transforms
Geometric Hashing
Exercises
Optimization
Static Optimization
Convexity and Concavity
Constrained Optimization with Equality Constraints
Constrained Optimization with Inequality Constraints
Sufficiency of Optimality Conditions for Constrained Problems
Computing Solutions to Optimization Problems
Linear Programming
Quadratic Programming
Recursive Optimization Algorithms
Dynamic Programming
Dynamic Programming Algorithm for a Discrete-Time System
Tracing a Path in a Plane
Shortest Paths in Arrays and Graphs
Combinatorial Optimization
Examples of Combinatorial Optimization Problems
Time Complexity
Decision and Optimization Problems
Classes of Problems and Algorithms
Suboptimal Algorithms
Unsolved Problems
Exercises
Applications
Sequence Alignment
Number of Possible Alignments
Dot Matrices
Scoring Correspondences and Mismatches
Developing Scoring Functions
Estimating Probabilities of Nucleotide Substitution
Parametric Models of Nucleotide Substitution
Computing Transition Probabilities
Fitting Nucleotide Substitution Models to Data
Breaking the Loop of Dependencies
Scaling Substitution Probabilities
Amino Acid Substitution Matrices
Gaps
Sequence Alignment by Dynamic Programming
The Needleman-Wunsch Alignment Algorithm
The Smith-Waterman Algorithm
Aligning Sequences Against Databases
Methods of Multiple Alignment
Exercises
Molecular Phylogenetics
Trees: Vocabulary and Methods
The Vocabulary of Trees
Overview of Tree-Building Methodologies
Distance-Based Trees
Tree-Derived Distance
Ultrametric Distances and Molecular-Clock Trees
Unweighted Pair Group Method with Arithmetic Mean (UPGMA) Algorithm
Neighbor-Joining Trees
Maximum Likelihood (Felsenstein) Trees
Hypotheses and Steps
The Pulley Principle
Estimating Branch Lengths
Estimating the Tree Topology
Maximum-Parsimony Trees
Minimal Number of Evolutionary Events for a Given Tree
Searching for the Optimal Tree Topology
Miscellaneous Topics in Phylogenetic Tree Models
The Nonparametric Bootstrap Method
Variable Substitution Rates, the Felsenstein-Churchill Algorithm and Related Methods
The Evolutionary Trace Method and Functional Sites in Proteins
Coalescence Theory
Neutral Evolution: Interaction of Genetic Drift and Mutation
Modeling Genetic Drift
Modeling Mutation
Coalescence Under Different Demographic Scenarios
Statistical Inference on Demographic Hypotheses and Parameters
Markov Chain Monte Carlo (MCMC) Methods
Approximate Approaches
Exercises
Genomics
The DNA Molecule and the Central Dogma of Molecular Biology
Genome Structure
Genome Sequencing
Restriction Enzymes
Electrophoresis
Southern Blot
The Polymerase Chain Reaction
DNA Cloning
Chain Termination DNA Sequencing
Genome Shotgun Sequencing
Genome Assembly Algorithms
Growing Contigs from Fragments
Detection of Overlaps Between Reads
Repetitive Structure of DNA
The Shortest Superstring Problem
Overlap Graphs and the Hamiltonian Path Problem
Sequencing by Hybridization
De Bruijn Graphs
All l-mers in the Reads
The Euler Superpath Problem
Further Aspects of DNA Assembly Algorithms
Statistics of the Genome Coverage
Contigs, Gaps and Anchored Contigs
Statistics with Minimum Overlaps Between Fragments, Anchored Contigs
Genome Length and Structure Estimation by Sampling l-mers
Polymorphisms
Genome Annotation
Research Tools for Genome Annotation
Gene Identification
DNA Motifs
Annotation by Words and Comparisons of Genome Assemblies
Human Chromosome 14
Exercises
Proteomics
Protein Structure
Amino Acids
Peptide Bonds
Primary Structure
Secondary Structure
Tertiary Structure
Quaternary Structure
Experimental Determination of Amino Acid Sequences and Protein Structures
Electrophoresis
Protein 2D Gels
Protein Western Blots
Mass Spectrometry
Chemical Identification of Amino Acids in Peptides
Analysis of Protein 3D Structure by X Ray Diffraction and NMR
Other Assays for Protein Compositions and Interactions
Computational Methods for Modeling Molecular Structures
Molecular-Force-Field Model
Molecular Dynamics
Hydrogen Bonds
Computation and Minimization of RMSD
Solutions to the Problem of Minimization of RMSD over Rotations
Solutions to the Problem of Minimization of RMSD over Rotations and Translations
Solvent-Accessible Surface of a Protein
Computational Prediction of Protein Structure and Function
Inferring Structures of Proteins
Protein Annotation
De Novo Methods
Comparative Modeling
Protein-Ligand Binding Analysis
Classification Based on Proteomic Assays
Exercises
RNA
The RNA World Hypothesis
The Functions of RNA
Reverse Transcription, Sequencing RNA Chains
The Northern Blot
RNA Primary Structure
RNA Secondary Structure
RNA Tertiary Structure
Computational Prediction of RNA Secondary Structure
Nested Structure
Maximizing the Number of Pairings Between Bases
Minimizing the Energy of RNA Secondary Structure
Pseudoknots
Prediction of RNA Structure by Comparative Sequence Analysis
Exercises
DNA Microarrays
Design of DNA Microarrays
Kinetics of the Binding Process
Data Preprocessing and Normalization
Normalization Procedures for Single Microarrays
Normalization Based on Spiked-in Control RNA
RMA Normalization Procedure
Correction of Ratio-Intensity Plots for cDNA
Statistics of Gene Expression Profiles
Modeling Probability Distributions of Gene Expressions
Class Prediction and Class Discovery
Dimensionality Reduction
Example of Application of PCA to Microarray Data
Class Discovery
Hierarchical Clustering
Class Prediction. Differentially Expressed Genes
Multiple Testing, and Analysis of False Discovery Rate (FDR)
FDR analysis in ALL versus AML gene expression data
The Gene Ontology Database
Structure of GO
Other Vocabularies of Terms
Supporting Results of DNA Microarray Analyses with GO and other Vocabulary Terms
Exercises
Bioinformatic Databases and Bioinformatic Internet Resources
Genomic Databases
Proteomic Databases
RNA Databases
Gene Expression Databases
Ontology Databases
Databases of Genetic and Proteomic Pathways
Programs and Services
Clinical Databases
References
Index
Table of Contents provided by Ingram. All Rights Reserved.
