Questions About This Book?
What version or edition is this?
This is the edition with a publication date of 12/8/2011.
What is included with this book?
The information trapped in text files, PDFs, and other digital content is a valuable information asset that can be very difficult to discover and use. Apache Tika is an open source toolkit that makes it easy for search engines, content management systems, and other applications to detect and extract content from digital documents in all major file formats. Tika in Action is a hands-on guide for developers working with search engines, content management systems, and other similar applications who want to exploit the information locked in digital documents. It introduces the world of mining text and binary documents as well as other information sources. The book shows where Tika fits within this landscape and how readers can use Tika to build and extend applications. The book's many case studies give real-world experience from domains ranging from search engines to digital asset management and scientific data processing.