Appendix A. LWP Modules

While the text of this book has covered the LWP modules that you need to know about to get things done, there are many additional modules in LWP. Most of them work behind the scenes or are of such limited use that we couldn't spare the space to discuss them. But if you want to further your knowledge of LWP's internals, here's a roadmap to get you started.

These are the LWP modules, listed alphabetically, from the CPAN distributions most current at the time of writing: libwww-perl v5.64, URI v1.18, HTML-Parser v3.26, HTML-Tree v3.11, and HTML-Format v1.23. Especially noteworthy modules have an "*" in front of their names.

 Module Description
 File::Listing Module for parsing directory listings. Used by Net::FTP.
 HTML::Form Class for objects representing HTML forms.
 HTML::FormatPS Class for objects that can render HTML::TreeBuilder tree contents as PostScript.
 HTML::Formatter Internal base class for HTML::FormatPS and HTML::FormatText.
*HTML::FormatText Class for objects that can render HTML::TreeBuilder tree contents as plain text.
*HTML::Entities Useful module providing functions that &-encode/decode strings (such as C. & E. Brontë to and from C. &amp; E. Bront&euml;).
 HTML::Filter Deprecated class for HTML parsers that reproduce their input by default.
 HTML::HeadParser Parse <HEAD> section of an HTML document.
 HTML::LinkExtor Class for HTML parsers that parse out links.
 HTML::PullParser Semi-internal base class used by HTML::TokeParser.
*HTML::TokeParser Friendly token-at-a-time HTML pull-parser class.
 HTML::Parser Base class for HTML parsers; used by the friendlier HTML::TokeParser and HTML::TreeBuilder.
 HTML::AsSubs Semi-deprecated module providing functions that each construct an HTML::Element object.
*HTML::Element Class for objects that each represent an HTML element.
 HTML::Parse Deprecated module that provides functions accessing HTML::TreeBuilder.
 HTML::Tree Module that exists just so you can run perldoc HTML-Tree.
*HTML::TreeBuilder Class for objects representing an HTML tree into which you can parse source.
*HTTP::Cookies Class for objects representing databases of cookies.
 HTTP::Daemon Base class for writing HTTP server daemons.
 HTTP::Date Module for date conversion routines. Used by various LWP protocol modules.
 HTTP::Headers Class for objects representing the group of headers in an HTTP::Response or HTTP::Request object.
 HTTP::Headers::Auth Experimental/internal module for improving HTTP::Headers's authentication support.
 HTTP::Headers::ETag Experimental/internal module adding HTTP ETag support to HTTP::Headers.
 HTTP::Headers::Util Module providing string functions used internally by various other LWP modules.
*HTTP::Message Base class for methods common to HTTP::Response and HTTP::Request.
 HTTP::Negotiate Module implementing an algorithm for content negotiation. Not widely used.
 HTTP::Request Class for objects representing a request that is carried out with an LWP::UserAgent object.
 HTTP::Request::Common Module providing functions used for constructing common kinds of HTTP::Request objects.
*HTTP::Response Class for objects representing the result of an HTTP::Request that was carried out.
*HTTP::Status Module providing functions and constants involving HTTP status codes.
*LWP Module that exists merely so you can say "use LWP" and have all the common LWP modules loaded (notably LWP::UserAgent, HTTP::Request, and HTTP::Response). Saying "use LWP 5.64" also asserts that the current LWP distribution had better be Version 5.64 or later. The module also contains generous documentation.
 LWP::Authen::Basic Module used internally by LWP::UserAgent for doing common ("Basic") HTTP authentication responses.
 LWP::Authen::Digest Module used internally by LWP::UserAgent for doing less-common HTTP Digest authentication responses.
 LWP::ConnCache Class used internally by some LWP::Protocol::protocol modules to reuse socket connections.
*LWP::Debug Module for routines useful in tracing how LWP performs requests.
 LWP::MediaTypes Module used mostly internally for guessing the MIME type of a file or URL.
 LWP::MemberMixin Base class used internally for accessing object attributes.
 LWP::Protocol Mostly internal base class for accessing and managing LWP protocols.
 LWP::Protocol::data Internal class that handles the new data: URL scheme (RFC 2397).
 LWP::Protocol::file Internal class that handles the file: URL scheme.
 LWP::Protocol::ftp Internal class that handles the ftp: URL scheme.
 LWP::Protocol::GHTTP Internal class for handling http: URL scheme using the HTTP::GHTTP library.
 LWP::Protocol::gopher Internal class that handles the gopher: URL scheme.
 LWP::Protocol::http Internal class that normally handles the http: URL scheme.
 LWP::Protocol::http10 Internal class that handles the http: URL scheme via just HTTP v1.0 (without the 1.1 extensions and features).
 LWP::Protocol::https Internal class that normally handles the https: URL scheme, assuming you have an SSL library installed.
 LWP::Protocol::https10 Internal class that handles the https: URL scheme, if you don't want HTTP v1.1 extensions.
 LWP::Protocol::mailto Internal class that handles the mailto: URL scheme; yes, it sends mail!
 LWP::Protocol::nntp Internal class that handles the nntp: and news: URL schemes.
 LWP::Protocol::nogo Internal class used in handling requests to unsupported protocols.
*LWP::RobotUA Class based on LWP::UserAgent, for objects representing virtual browsers that obey robots.txt files and don't abuse remote servers.
*LWP::Simple Module providing the get, head, getprint, getstore, and mirror shortcut functions.
*LWP::UserAgent Class for objects representing "virtual browsers."
 Net::HTTP Internal class used for HTTP socket connections.
 Net::HTTP::Methods Internal class used for HTTP socket connections.
 Net::HTTP::NB Internal class used for HTTP socket connections with nonblocking sockets.
 Net::HTTPS Internal class used for HTTP Secure socket connections.
*URI Main class for objects representing URIs/URLs, relative or absolute.
 URI::_foreign Internal class for objects representing URLs for schemes for which we don't have a specific class.
 URI::_generic Internal base class for just about all URLs.
 URI::_login Internal base class for connection URLs such as telnet:, rlogin:, and ssh:.
 URI::_query Internal base class providing methods for URL types that can have query strings (such as foo://...?bar).
 URI::_segment Internal class for representing some return values from $url->path_segments( ) calls.
 URI::_server Internal base class for URL types where the first bit represents a server name (most of them except mailto:).
 URI::_userpass Internal class providing methods for URL types with an optional user[:pass] part (such as ftp://itsme:foo@secret.int/).
 URI::data Class for objects representing the new data: URLs (RFC 2397).
*URI::Escape Module for functions that URL-encode and URL-decode strings (such as pot pie to and from pot%20pie).
 URI::file Class for objects representing file: URLs.
 URI::file::Base Internal base class for file: URLs.
 URI::file::FAT Internal base class for file: URLs under legacy MSDOS (with 8.3 filenames).
 URI::file::Mac Internal base class for file: URLs under legacy (before v10) MacOS.
 URI::file::OS2 Internal base class for file: URLs under OS/2.
 URI::file::QNX Internal base class for file: URLs under QNX.
 URI::file::Unix Internal base class for file: URLs under Unix.
 URI::file::Win32 Internal base class for file: URLs under MS Windows.
 URI::ftp Class for objects representing ftp: URLs.
 URI::gopher Class for objects representing gopher: URLs.
 URI::Heuristic Module for functions that expand abbreviated URLs such as ora.com.
 URI::http Class for objects representing http: URLs.
 URI::https Class for objects representing https: URLs.
 URI::ldap Class for objects representing ldap: URLs.
 URI::mailto Class for objects representing mailto: URLs.
 URI::news Class for objects representing news: URLs.
 URI::nntp Class for objects representing nntp: URLs.
 URI::pop Class for objects representing pop: URLs.
 URI::rlogin Class for objects representing rlogin: login URLs.
 URI::rsync Class for objects representing rsync: URLs.
 URI::snews Class for objects representing snews: (Secure News) URLs.
 URI::ssh Class for objects representing ssh: login URLs.
 URI::telnet Class for objects representing telnet: login URLs.
 URI::URL Deprecated class that is like URI; use URI instead.
 URI::WithBase Like the class URI, but objects of this class can "remember" their base URLs.
 WWW::RobotRules Class for objects representing restrictions parsed from various robots.txt files.
 WWW::RobotRules::AnyDBM_File Subclass of WWW::RobotRules that uses a DBM file to cache its contents.
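
As a quick orientation, here is a minimal sketch of three of the starred utility modules from the table (URI::Escape, HTML::Entities, and HTTP::Status), chosen because they can be tried without a network connection. The sample strings and variable names are illustrative assumptions, and the example sticks to ASCII input to sidestep source-encoding questions.

```perl
use strict;
use warnings;

use URI::Escape    qw(uri_escape uri_unescape);
use HTML::Entities qw(encode_entities decode_entities);
use HTTP::Status   qw(is_success is_error status_message);

# URI::Escape: URL-encode a string and decode it again.
my $escaped   = uri_escape("pot pie");       # space becomes %20
my $unescaped = uri_unescape($escaped);
print "$escaped <-> $unescaped\n";

# HTML::Entities: &-encode a string and decode it again.
my $encoded = encode_entities("C. & E. Bronte");   # & becomes &amp;
my $decoded = decode_entities($encoded);
print "$encoded <-> $decoded\n";

# HTTP::Status: classify a status code and look up its message.
my $code    = 404;
my $message = status_message($code);
print "$code is an error: $message\n" if is_error($code);
print "200 is a success\n"            if is_success(200);
```

Each function here takes a plain scalar and returns a new one, so the calls compose easily with whatever LWP::UserAgent or HTML::TokeParser hands back.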