Perl and LWP

by Sean M. Burke
  • ISBN13: 9780596001780
  • ISBN10: 0596001789
  • Format: Paperback
  • Copyright: 2002-07-01
  • Publisher: O'Reilly & Associates Inc
  • List Price: $39.99

Summary

Perl is the leading language not only for creating web content but also for consuming it. With a suite of Perl modules known as LWP (libwww-perl), programmers can dispense with graphical web browsers such as Netscape Navigator and interact with web servers directly. LWP enables programmers to write "spiders" that automatically fetch web pages, extract information from HTML pages, and submit forms, and to write homegrown servers. This comprehensive guide to LWP and its applications comes with many practical examples. Topics include programmatically fetching web pages, submitting forms, parsing HTML with various techniques, handling cookies, and performing authentication. With the knowledge in Perl & LWP, you can automate a wide range of tasks on the Web, from checking the prices of items at online stores to bidding at auctions automatically.

Author Biography

Sean M. Burke is an active member of the Perl community and one of CPAN's most prolific module authors. Since 1998, he has been a contributor to LWP and a columnist for The Perl Journal.

Table of Contents

Foreword
Preface
Introduction to Web Automation
  • The Web as Data Source
  • History of LWP
  • Installing LWP
  • Words of Caution
  • LWP in Action
Web Basics
  • URLs
  • An HTTP Transaction
  • LWP::Simple
  • Fetching Documents Without LWP::Simple
  • Example: Alta Vista
  • HTTP POST
  • Example: Babelfish
The LWP Class Model
  • The Basic Classes
  • Programming with LWP Classes
  • Inside the do_GET and do_POST Functions
  • User Agents
  • HTTP::Response Objects
  • LWP Classes: Behind the Scenes
URLs
  • Parsing URLs
  • Relative URLs
  • Converting Absolute URLs to Relative
  • Converting Relative URLs to Absolute
Forms
  • Elements of an HTML Form
  • LWP and GET Requests
  • Automating Form Analysis
  • Idiosyncrasies of HTML Forms
  • POST Example: License Plates
  • POST Example: ABEBooks.com
  • File Uploads
  • Limits on Forms
Simple HTML Processing with Regular Expressions
  • Automating Data Extraction
  • Regular Expression Techniques
  • Troubleshooting
  • When Regular Expressions Aren't Enough
  • Example: Extracting Links from a Bookmark File
  • Example: Extracting Links from Arbitrary HTML
  • Example: Extracting Temperatures from Weather Underground
HTML Processing with Tokens
  • HTML as Tokens
  • Basic HTML::TokeParser Use
  • Individual Tokens
  • Token Sequences
  • More HTML::TokeParser Methods
  • Using Extracted Text
Tokenizing Walkthrough
  • The Problem
  • Getting the Data
  • Inspecting the HTML
  • First Code
  • Narrowing In
  • Rewrite for Features
  • Alternatives
HTML Processing with Trees
  • Introduction to Trees
  • HTML::TreeBuilder
  • Processing
  • Example: BBC News
  • Example: Fresh Air
Modifying HTML with Trees
  • Changing Attributes
  • Deleting Images
  • Detaching and Reattaching
  • Attaching in Another Tree
  • Creating New Elements
Cookies, Authentication, and Advanced Requests
  • Cookies
  • Adding Extra Request Header Lines
  • Authentication
  • An HTTP Authentication Example: The Unicode Mailing Archive
Spiders
  • Types of Web-Querying Programs
  • A User Agent for Robots
  • Example: A Link-Checking Spider
  • Ideas for Further Expansion
A. LWP Modules
B. HTTP Status Codes
C. Common MIME Types
D. Language Tags
E. Common Content Encodings
F. ASCII Table
G. User's View of Object-Oriented Modules
Index
