The Burrows-Wheeler Transform

  • ISBN13: 9780387789088
  • ISBN10: 0387789081
  • Format: Hardcover
  • Copyright: 2008-05-30
  • Publisher: Springer-Verlag New York Inc

Summary

The Burrows-Wheeler Transform is a text transformation scheme that has found applications in different aspects of the data explosion problem, from data compression to index structures and search. The BWT belongs to a new class of compression algorithms, distinguished by its ability to perform compression by sorted contexts. More recently, the BWT has also found various applications in addition to text data compression, such as in lossless and lossy image compression, tree-source identification, bioinformatics, machine translation, shape matching, and test data compression.

This book will serve as a reference for seasoned professionals or researchers in the area, while providing a gentle introduction, making it accessible for senior undergraduate students or first-year graduate students embarking upon research in compression, pattern matching, full text retrieval, compressed index structures, or other areas related to the BWT.

Key Features

Comprehensive resource for information related to different aspects of the Burrows-Wheeler Transform, including:
  • Gentle introduction to the BWT
  • History of the development of the BWT
  • Detailed theoretical analysis of algorithmic issues and performance limits
  • Searching on BWT compressed data
  • Hardware architectures for the BWT

Explores non-traditional applications of the BWT in areas such as:
  • Bioinformatics
  • Joint source-channel coding
  • Modern information retrieval
  • Machine translation
  • Test data compression for systems-on-chip

Teaching materials ideal for classroom use on courses in:
  • Data Compression and Source Coding
  • Modern Information Retrieval
  • Information Science
  • Digital Libraries
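
For readers unfamiliar with the transform the summary describes, the following is a minimal sketch, not taken from the book, of a forward and inverse BWT. It assumes a `$` end-of-text sentinel that does not occur in the input, and it uses naive rotation sorting, which is far slower than the suffix-array constructions listed in the table of contents. The function names `bwt_forward` and `bwt_inverse` are illustrative only.

```python
def bwt_forward(text, sentinel="$"):
    """Return the last column of the sorted rotations of text + sentinel.

    Naive approach for illustration: builds every rotation explicitly.
    """
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def bwt_inverse(last_column, sentinel="$"):
    """Recover the original text from the last column.

    A stable sort of the positions of the last column yields the first
    column and links each sorted row to the row that follows it in the text.
    """
    n = len(last_column)
    # order[j] is the position in the last column of the character
    # that appears at row j of the first (sorted) column.
    order = sorted(range(n), key=lambda i: last_column[i])
    first_column = [last_column[i] for i in order]

    # The row ending with the sentinel is the unrotated text + sentinel.
    row = last_column.index(sentinel)
    out = []
    for _ in range(n):
        out.append(first_column[row])
        row = order[row]
    # Drop the trailing sentinel.
    return "".join(out)[:-1]


print(bwt_forward("banana"))  # → annb$aa
```

Note how the output groups equal characters together (`nn`, `aa`); this clustering of characters that share similar contexts is what the summary means by compression by sorted contexts.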

Table of Contents

Introduction
    An example of a Burrows-Wheeler Transform
    Genesis of the Burrows-Wheeler Transform
    Transformation
    Permutation
    Recency
    Pattern matching
    Organization of this book
    Further reading
How the Burrows-Wheeler Transform works
    The forward Burrows-Wheeler Transform
    The reverse Burrows-Wheeler Transform
    Special cases
    Further reading
Coders for the Burrows-Wheeler Transform
    Entropy coding
    Run-length and arithmetic coder
    Move-to-front lists
    Frequency counting methods
    Inversion Frequencies (IF)
    Distance coding
    Wavelet trees
    Other permutations
    Block size
    Further reading
Suffix trees and suffix arrays
    Suffix Trees
    Basic notations and definitions
    Construction of a suffix tree
    Ukkonen's suffix tree algorithm
    From implicit suffix tree to true suffix tree
    Farach's recursive construction
    Generalized suffix trees
    Implementation issues
    Suffix arrays
    Traditional string sorting
    Suffix arrays via suffix trees
    Manber-Myers suffix sorting algorithm
    Linear-time direct suffix sorting
    Space issues in suffix trees and suffix arrays
    Further reading
Analysis of the Burrows-Wheeler Transform
    The BWT, suffix trees and suffix arrays
    Computational complexity
    BWT first stage - the transform
    BWT second stage - coding the transformed text
    BWT context clustering property
    Context trees
    Estimation using context trees
    BWT and context trees
    Analysis of BWT output
    Theoretical distribution of BWT output
    Empirical distribution of BWT output
    Analysis of BWT compression performance
    Definitions and notation
    Performance using recency ranking
    Performance without LGT
    Performance using piecewise constant parameters
    Performance on general sources via empirical entropy
    Relationship with other compression schemes
    Context-based schemes
    Symbol ranking schemes
    Further reading
Variants of the Burrows-Wheeler Transform
    The sort transform
    Forward sort transform
    Inverse sort transform
    Performance of the sort transform
    Lexical permutation sorting
    Sorting permutations
    Lexical permutation sorting algorithm
    The extended BWT
    Sort order between strings
    Performing the extended BWT
    Inverting the transform
    Sort-based context similarity measurement
    Context similarity measurement and ranking
    The prefix list data structure
    Relationship with the Burrows-Wheeler Transform
    Performance of the prefix list
    Word-based compression
    General word-based compression
    Word-based Burrows-Wheeler Transform
    Further reading
Exact and approximate pattern matching
    Exact pattern matching algorithms
    Brute force matching
    The Knuth-Morris-Pratt Algorithm
    The Boyer-Moore algorithm
    The Karp-Rabin algorithm
    The shift-and method
    Multiple pattern matching
    Pattern matching with don't-care characters
    Pattern matching using the Burrows-Wheeler Transform
    Boyer-Moore pattern matching using the BWT
    BWT-based exact pattern matching with binary search
    BWT-based exact pattern matching with suffix arrays
    Pattern matching using the FM-index
    Algorithm improvements with overwritten arrays
    Performance of BWT-based exact pattern matching
    Compression performance
    Search performance
    Array construction speeds
    Comparison with LZ-based compressed-domain pattern matching
    Approximate pattern matching
    Edit distance: dynamic programming formulation
    Edit graphs
    Local similarity
    The longest common subsequence problem
    String matching with k differences
    The k-mismatch problem using the BWT
    k-approximate matching using the BWT
    Hardware algorithms for pattern matching
    An equivalent hardware algorithm
    A brief review of other hardware algorithms
    Conclusion
    Further reading
Other applications of the Burrows-Wheeler Transform
    Compressed suffix trees and compressed suffix arrays
    Compressed suffix trees
    Compressed suffix arrays
    Compressed full-text indexing
    Full-text indexing using CSTs and CSAs
    Searching on compressed suffix trees
    Searching on compressed suffix arrays
    Bioinformatics and computational biology
    DNA sequence compression
    Analysis of repetition structures
    Whole-genome comparisons
    Genome annotation
    Distance measure between sequences and phylogeny
    Test data compression
    Nature of test data
    BWT-based test data compression
    Image compression, computer vision and machine translation
    Image compression
    Shape matching
    Machine translation
    Joint source-channel coding
    General source coding via channel coding
    BWT-based joint source-channel coding
    Prediction and entropy estimation
    Further reading
Conclusion
Notation
Ongoing work on the Burrows-Wheeler Transform
    BWT-related web sites
    Ph.D. theses relating to the Burrows-Wheeler Transform
References
Index
Table of Contents provided by Ingram. All Rights Reserved.
