Computational Text Analysis For Functional Genomics and Bioinformatics

by Soumya Raychaudhuri
  • ISBN13: 9780198567417
  • ISBN10: 0198567413

  • Edition: 1st
  • Format: Paperback
  • Copyright: 2006-03-30
  • Publisher: Oxford University Press
List Price: $90.66

Summary

This book brings together the two disparate worlds of computational text analysis and biology, and presents some of the latest methods and applications to proteomics, sequence analysis, and gene expression data. Modern genomics generates large and comprehensive data sets, but their interpretation requires an understanding of a vast number of genes, their complex functions, and their interactions. Keeping up with the literature on a single gene is a challenge in itself; for thousands of genes it is simply impossible. Here, Soumya Raychaudhuri presents the techniques and algorithms needed to access and use the vast scientific literature, that is, methods that automatically "read" the literature on all the genes. With background chapters on the necessary biology, statistics, and genomics, along with practical examples of interpreting many different types of modern experiments, this book is ideal for students and researchers in computational biology, bioinformatics, genomics, statistics, and computer science.

Table of Contents

List of Figures
List of Plates
List of Tables

An introduction to text analysis in genomics
  • The genomics literature
  • Using text in genomics
  • Building databases of genetic knowledge
  • Analyzing experimental genomic data sets
  • Proposing new biological knowledge: identifying candidate genes
  • Publicly available text resources
  • Electronic text
  • Genome resources
  • Gene ontology
  • The advantage of text-based methods
  • Guide to this book

Functional genomics
  • Some molecular biology
  • Central dogma of molecular biology
  • Deoxyribonucleic acid
  • Ribonucleic acid
  • Genes
  • Proteins
  • Biological function
  • Probability theory and statistics
  • Probability
  • Conditional probability
  • Independence
  • Bayes' theorem
  • Probability distribution functions
  • Information theory
  • Population statistics
  • Measuring performance
  • Deriving and analyzing sequences
  • Sequencing
  • Homology
  • Sequence alignment
  • Pairwise sequence alignment and dynamic programming
  • Linear time pairwise alignment: BLAST
  • Multiple sequence alignment
  • Comparing sequences to profiles: weight matrices
  • Position specific iterative BLAST
  • Hidden Markov models
  • Gene expression profiling
  • Measuring gene expression with arrays
  • Measuring gene expression by sequencing and counting transcripts
  • Expression array analysis
  • Unsupervised grouping: clustering
  • K-means clustering
  • Self-organizing maps
  • Hierarchical clustering
  • Dimension reduction with principal components analysis
  • Combining expression data with external information: supervised machine learning
  • Nearest neighbor classification
  • Linear discriminant analysis

Textual profiles of genes
  • Representing documents as word vectors
  • Metrics to compare documents
  • Some words are more important for document similarity
  • Building a vocabulary: feature selection
  • Weighting words
  • Latent semantic indexing
  • Defining textual profiles for genes
  • Using text like genomics data
  • A simple strategy to assigning keywords to groups of genes
  • Querying genes for biological function

Using text in sequence analysis
  • SWISS-PROT records as a textual resource
  • Using sequence similarity to extend literature references
  • Assigning keywords to summarize sequences hits
  • Using textual profiles to organize sequence hits
  • Using text to help identify remote homology
  • Modifying iterative sequence similarity searches to include text
  • Evaluating PSI-BLAST modified to include text
  • Combining sequence and text together

Text-based analysis of a single series of gene expression measurements
  • Pitfalls of gene expression analysis: noise
  • Phosphate metabolism: an example
  • The top fifteen genes
  • Distinguishing true positives from false positives with a literature-based approach
  • Neighbor expression information
  • Application to phosphate metabolism data set
  • Recognizing high induction false positives with literature-based scores
  • Recognizing low induction false positives
  • Assessing experiment quality with literature-based scoring
  • Improvements
  • Application to other assays
  • Assigning keywords that describe the broad biology of the experiment

Analyzing groups of genes
  • Functional coherence of a group of genes
  • Overview of computational approach
  • Strategy to evaluate different algorithms
  • Word distribution divergence
  • Best article score
  • Neighbor divergence
  • Calculating a theoretical distribution of scores
  • Quantifying the difference between the empirical score distribution and the theoretical one
  • Neighbor divergence per gene
  • Corruption studies
  • Application of functional coherence scoring to screen gene expression clusters
  • Understanding the gene group's function

Analyzing large gene expression data sets
  • Groups of genes
  • Assigning keywords
  • Screening gene expression clusters
  • Optimizing cluster boundaries: hierarchical clustering
  • Application to other organisms besides yeast
  • Identifying and optimizing clusters in a Drosophila development data set

Using text classification for gene function annotation
  • Functional vocabularies and gene annotation
  • Gene Ontology
  • Enzyme Commission
  • Kyoto Encyclopedia of Genes and Genomes
  • Text classification
  • Nearest neighbor classification
  • Naive Bayes classification
  • Maximum entropy classification
  • Feature selection: choosing the best words for classification
  • Classifying documents into functional categories
  • Comparing classifiers
  • Annotating genes

Finding gene names
  • Strategies to identify gene names
  • Recognizing gene names with a dictionary
  • Using word structure and appearance to identify gene names
  • Using syntax to eliminate gene name candidates
  • Using context as a clue about gene names
  • Morphology
  • Identifying gene names and their abbreviations
  • A single unified gene name finding algorithm

Protein interaction networks
  • Genetic networks
  • Experimental assays to identify protein networks
  • Yeast two hybrid
  • Affinity precipitation
  • Predicting interactions versus verifying interactions with scientific text
  • Networks of co-occurring genes
  • Protein interactions and gene name co-occurrence in text
  • Number of textual co-occurrences predicts likelihood of an experimentally predicted interaction
  • Information extraction and genetic networks: increasing specificity and identifying interaction type
  • Statistical machine learning

Conclusion

Index
