While the text of this book has covered the LWP modules that you need to know about to get things done, there are many additional modules in LWP. Most of them work behind the scenes or are of such limited use that we couldn't spare the space to discuss them. But if you want to further your knowledge of LWP's internals, here's a roadmap to get you started.
These are the LWP modules, listed alphabetically, from the CPAN distributions most current at the time of writing: libwww-perl v5.64, URI v1.18, HTML-Parser v3.26, HTML-Tree v3.11, and HTML-Format v1.23. Especially noteworthy modules have an "*" in front of their names.
Module | Description |
---|---|
File::Listing | Module for parsing directory listings. Used by LWP::Protocol::ftp. |
HTML::Form | Class for objects representing HTML forms. |
HTML::FormatPS | Class for objects that can render HTML::TreeBuilder tree contents as PostScript. |
HTML::Formatter | Internal base class for HTML::FormatPS and HTML::FormatText. |
*HTML::FormatText | Class for objects that can render HTML::TreeBuilder tree contents as plain text. |
*HTML::Entities | Useful module providing functions that &-encode/decode strings (such as C. & E. Brontë to and from `C. &amp; E. Bront&euml;`). |
HTML::Filter | Deprecated class for HTML parsers that reproduce their input by default. |
HTML::HeadParser | Class for parsers that parse just the <HEAD> section of an HTML document. |
HTML::LinkExtor | Class for HTML parsers that parse out links. |
HTML::PullParser | Semi-internal base class used by HTML::TokeParser. |
*HTML::TokeParser | Friendly token-at-a-time HTML pull-parser class. |
HTML::Parser | Base class for HTML parsers; used by the friendlier HTML::TokeParser and HTML::TreeBuilder. |
HTML::AsSubs | Semi-deprecated module providing functions that each construct an HTML::Element object. |
*HTML::Element | Class for objects that each represent an HTML element. |
HTML::Parse | Deprecated module that provides functions accessing HTML::TreeBuilder. |
HTML::Tree | Module that exists just so you can run perldoc HTML-Tree. |
*HTML::TreeBuilder | Class for objects representing an HTML tree into which you can parse source. |
*HTTP::Cookies | Class for objects representing databases of cookies. |
HTTP::Daemon | Base class for writing HTTP server daemons. |
HTTP::Date | Module for date conversion routines. Used by various LWP protocol modules. |
HTTP::Headers | Class for objects representing the group of headers in an HTTP::Response or HTTP::Request object. |
HTTP::Headers::Auth | Experimental/internal module for improving HTTP::Headers's authentication support. |
HTTP::Headers::ETag | Experimental/internal module adding HTTP ETag support to HTTP::Headers. |
HTTP::Headers::Util | Module providing string functions used internally by various other LWP modules. |
*HTTP::Message | Base class for methods common to HTTP::Response and HTTP::Request. |
HTTP::Negotiate | Module implementing an algorithm for content negotiation. Not widely used. |
HTTP::Request | Class for objects representing a request that is carried out with an LWP::UserAgent object. |
HTTP::Request::Common | Module providing functions used for constructing common kinds of HTTP::Request objects. |
*HTTP::Response | Class for objects representing the result of an HTTP::Request that was carried out. |
*HTTP::Status | Module providing functions and constants involving HTTP status codes. |
*LWP | Module that exists merely so you can say "use LWP" and have all the common LWP modules loaded (notably LWP::UserAgent, HTTP::Request, and HTTP::Response). Saying "use LWP 5.64" also asserts that the installed LWP distribution had better be Version 5.64 or later. The module also contains generous documentation. |
LWP::Authen::Basic | Module used internally by LWP::UserAgent for doing common ("Basic") HTTP authentication responses. |
LWP::Authen::Digest | Module used internally by LWP::UserAgent for doing less-common HTTP Digest authentication responses. |
LWP::ConnCache | Class used internally by some LWP::Protocol::protocol modules to reuse socket connections. |
*LWP::Debug | Module for routines useful in tracing how LWP performs requests. |
LWP::MediaTypes | Module used mostly internally for guessing the MIME type of a file or URL. |
LWP::MemberMixin | Base class used internally for accessing object attributes. |
LWP::Protocol | Mostly internal base class for accessing and managing LWP protocols. |
LWP::Protocol::data | Internal class that handles the new data: URL scheme (RFC 2397). |
LWP::Protocol::file | Internal class that handles the file: URL scheme. |
LWP::Protocol::ftp | Internal class that handles the ftp: URL scheme. |
LWP::Protocol::GHTTP | Internal class for handling http: URL scheme using the HTTP::GHTTP library. |
LWP::Protocol::gopher | Internal class that handles the gopher: URL scheme. |
LWP::Protocol::http | Internal class that normally handles the http: URL scheme. |
LWP::Protocol::http10 | Internal class that handles the http: URL scheme via just HTTP v1.0 (without the 1.1 extensions and features). |
LWP::Protocol::https | Internal class that normally handles the https: URL scheme, assuming you have an SSL library installed. |
LWP::Protocol::https10 | Internal class that handles the https: URL scheme, if you don't want HTTP v1.1 extensions. |
LWP::Protocol::mailto | Internal class that handles the mailto: URL scheme; yes, it sends mail! |
LWP::Protocol::nntp | Internal class that handles the nntp: and news: URL schemes. |
LWP::Protocol::nogo | Internal class used in handling requests to unsupported protocols. |
*LWP::RobotUA | Class based on LWP::UserAgent, for objects representing virtual browsers that obey robots.txt files and don't abuse remote servers. |
*LWP::Simple | Module providing the get, head, getprint, getstore, and mirror shortcut functions. |
*LWP::UserAgent | Class for objects representing "virtual browsers." |
Net::HTTP | Internal class used for HTTP socket connections. |
Net::HTTP::Methods | Internal class used for HTTP socket connections. |
Net::HTTP::NB | Internal class used for HTTP socket connections with nonblocking sockets. |
Net::HTTPS | Internal class used for HTTP Secure socket connections. |
*URI | Main class for objects representing URIs/URLs, relative or absolute. |
URI::_foreign | Internal class for objects representing URLs for schemes for which we don't have a specific class. |
URI::_generic | Internal base class for just about all URLs. |
URI::_login | Internal base class for connection URLs such as telnet:, rlogin:, and ssh:. |
URI::_query | Internal base class providing methods for URL types that can have query strings (such as foo://...?bar). |
URI::_segment | Internal class for representing some return values from $url->path_segments( ) calls. |
URI::_server | Internal base class for URL types where the first bit represents a server name (most of them except mailto:). |
URI::_userpass | Internal class providing methods for URL types with an optional user[:pass] part (such as ftp://itsme:foo@secret.int/). |
URI::data | Class for objects representing the new data: URLs (RFC 2397). |
*URI::Escape | Module for functions that URL-encode and URL-decode strings (such as pot pie to and from pot%20pie). |
URI::file | Class for objects representing file: URLs. |
URI::file::Base | Internal base class for file: URLs. |
URI::file::FAT | Internal base class for file: URLs under legacy MSDOS (with 8.3 filenames). |
URI::file::Mac | Internal base class for file: URLs under legacy (before v10) MacOS. |
URI::file::OS2 | Internal base class for file: URLs under OS/2. |
URI::file::QNX | Internal base class for file: URLs under QNX. |
URI::file::Unix | Internal base class for file: URLs under Unix. |
URI::file::Win32 | Internal base class for file: URLs under MS Windows. |
URI::ftp | Class for objects representing ftp: URLs. |
URI::gopher | Class for objects representing gopher: URLs. |
URI::Heuristic | Module for functions that expand abbreviated URLs such as ora.com. |
URI::http | Class for objects representing http: URLs. |
URI::https | Class for objects representing https: URLs. |
URI::ldap | Class for objects representing ldap: URLs. |
URI::mailto | Class for objects representing mailto: URLs. |
URI::news | Class for objects representing news: URLs. |
URI::nntp | Class for objects representing nntp: URLs. |
URI::pop | Class for objects representing pop: URLs. |
URI::rlogin | Class for objects representing rlogin: login URLs. |
URI::rsync | Class for objects representing rsync: URLs. |
URI::snews | Class for objects representing snews: (Secure News) URLs. |
URI::ssh | Class for objects representing ssh: login URLs. |
URI::telnet | Class for objects representing telnet: login URLs. |
URI::URL | Deprecated class that is like URI; use URI instead. |
URI::WithBase | Like the class URI, but objects of this class can "remember" their base URLs. |
WWW::RobotRules | Class for objects representing restrictions parsed from various robots.txt files. |
WWW::RobotRules::AnyDBM_File | Subclass of WWW::RobotRules that uses a DBM file to cache its contents. |
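
The entries for HTML::Entities and URI::Escape each mention a round trip between a plain string and its encoded form. As a minimal sketch of what those round trips look like, assuming both modules are installed from CPAN, the following uses the functions the two modules export; the variable names are purely illustrative.

```perl
use strict;
use warnings;
use URI::Escape    qw(uri_escape uri_unescape);
use HTML::Entities qw(encode_entities decode_entities);

# URL-encode and URL-decode: the space becomes %20 and back again.
my $escaped   = uri_escape("pot pie");
my $unescaped = uri_unescape($escaped);

# &-encode and &-decode: "&" and the Latin-1 e-diaeresis (\xEB)
# become named character entities and back again.
my $name    = "C. & E. Bront\xEB";
my $encoded = encode_entities($name);
my $decoded = decode_entities($encoded);

print "$escaped\n";      # pot%20pie
print "$unescaped\n";    # pot pie
print "$encoded\n";      # C. &amp; E. Bront&euml;
print(($decoded eq $name ? "round trip ok" : "round trip failed"), "\n");
```

The encoded forms are printed rather than the decoded name, so the output stays plain ASCII regardless of the terminal's character encoding.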