Creating New Elements (Perl & LWP)

So far we haven't directly created any new HTML::Element objects. All the elements that have appeared thus far were created by HTML::TreeBuilder as part of its delegated task of building whole trees. But suppose that we actually do need to add something to a tree that never existed elsewhere in that or any other tree. In the above section, we actually snuck in creating a new node in this statement:

But that's hardly an amazing feat, because that node isn't a real object. You can actually create a new object by calling HTML::Element->new('tagname'). So this would add an hr element to a given paragraph object:
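A minimal sketch of such a call, assuming the paragraph is held in a hypothetical variable $paragraph (built by hand here only so that the example runs on its own):

```perl
use strict;
use warnings;
use HTML::Element;

# $paragraph is a hypothetical stand-in for a paragraph element that
# would normally come out of a parsed tree.
my $paragraph = HTML::Element->new('p');
$paragraph->push_content('Some text.');

# Create a real element object and attach it as the last child.
my $hr = HTML::Element->new('hr');
$paragraph->push_content($hr);

print $paragraph->as_HTML, "\n";
```

Unlike a bare string passed to push_content( ), $hr here is a full object, so you can go on calling methods such as attr( ) or parent( ) on it.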

This is simple enough, but it becomes rather annoying when you want to construct several linked nodes. For example, suppose you wanted to construct objects equivalent to what you'd get if you parsed this:

<li>See <b><a href="page.html">here.</a></b>!</li>

Even this little treelet is fairly tedious to produce using normal constructor calls:
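One way the constructor-call version might look is sketched below. The variable names $bold and $link are illustrative choices, not names from the text:

```perl
use strict;
use warnings;
use HTML::Element;

# Three separate constructor calls, one per element.
my $li   = HTML::Element->new('li');
my $bold = HTML::Element->new('b');
my $link = HTML::Element->new('a', 'href' => 'page.html');

# Link them up from the inside out.
$link->push_content('here.');
$bold->push_content($link);
$li->push_content('See ', $bold, '!');

print $li->as_HTML, "\n";
```

Every element needs its own variable and its own push_content( ) call, which is what makes this style tiresome for anything larger.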

10.5.1. Literals

If you try manually constructing and linking every element in a larger structure such as a table, the code will be maddening. One solution is not to create the elements at all, but to create a single element, called a ~literal pseudoelement, that contains the raw source you want to appear when that part of the tree is dumped. These sorts of objects are very much like the ~comment pseudoelements we saw in the last section; their real content is in their text attribute:

my $li = HTML::Element->new( '~literal',
  'text', '<li>See <b><a href="page.html">here.</a></b>!</li>'
);

This constructs something that will appear as that chunk of text when as_HTML( ) is called on it, but it's nothing like a normal HTML element: you can't put other elements or text under it, and you can't see it with look_down( ) or find_by_tag_name( ) (unless you're looking for a ~literal element, which you're probably not).
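The difference can be seen in a short sketch. The enclosing ul element is an assumption added only to give the literal a parent:

```perl
use strict;
use warnings;
use HTML::Element;

# A hypothetical parent element to hold the literal.
my $ul = HTML::Element->new('ul');
my $li = HTML::Element->new( '~literal',
  'text', '<li>See <b><a href="page.html">here.</a></b>!</li>'
);
$ul->push_content($li);

# The raw source appears verbatim when the tree is dumped...
my $html = $ul->as_HTML;
print $html, "\n";

# ...but the tree contains no real li elements to find.
my @real_li = $ul->find_by_tag_name('li');
print "Real li elements found: ", scalar(@real_li), "\n";

# Only a search for the pseudoelement itself succeeds.
my $literal = $ul->look_down('_tag', '~literal');
print "Literal text: ", $literal->attr('text'), "\n";
```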

10.5.2. New Nodes from Lists

Literals are fine for cases where you just want to drop arbitrarily large amounts of undigested HTML source into a tree right before you call as_HTML( ). But when you really want to make new, full-fledged elements, the new_from_lol( ) constructor offers a friendlier syntax.

With new_from_lol( ), you specify an element as a list reference: the first item is the tag name, an optional hash reference supplies the attributes, and the remaining items are the element's children, given as bits of text, preexisting element objects, or more list references. This is best shown by example:

my $li = HTML::Element->new_from_lol(
  [ 'li',
          "See ",
          [ 'b',
                 [ 'a',
                        {'href' => 'page.html'},
                        "here."
                 ]
          ],
          "!"
  ]
);    # or indent it however you prefer -- probably more concisely

And this produces exactly the same tree as when we called HTML::Element->new three times and then linked up the resulting elements.

The benefits of the new_from_lol( ) approach are that you can easily specify children at construction time, and that it's very hard to produce mis-nested trees, because if the number of ['s above doesn't match the number of ]'s, it won't parse as valid Perl. Moreover, it can actually be a relatively concise format. The above code, with some whitespace removed, basically fits happily on one line:

my $li = HTML::Element->new_from_lol(
  ['li',  "See ",  ['b', ['a', {'href' => 'page.html'}, "here." ] ], "!" ]
);
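The list can also mix in element objects that already exist. A small sketch, assuming a hypothetical $link element built beforehand and not yet attached to any tree:

```perl
use strict;
use warnings;
use HTML::Element;

# A preexisting, parentless element (hypothetical; it could just as
# well have been detached from a parsed tree).
my $link = HTML::Element->new('a', 'href' => 'page.html');
$link->push_content('here.');

# The object is dropped into the list alongside text and list references.
my $li = HTML::Element->new_from_lol(
  ['li', "See ", ['b', $link], "!"]
);

print $li->as_HTML, "\n";
```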

So, for example, consider returning to the template-insertion problem in the previous section, and suppose that besides dumping the article's content into a template, we should also preface the content with something like this:

<p>The original version of the following story is to be found at:
<br><a href="$orig_url">$orig_url</a></p>
<hr>

This can be done by replacing:

put_into_template( $good_td->content_list );

with this:

# Assuming $orig_url has been set somewhere...

put_into_template(
  HTML::Element->new_from_lol(
    ['p', "The original version of the following story is to be found at:",
      ['br'],
      ['a', {'href' => $orig_url}, $orig_url],
    ]
  ),
  HTML::Element->new_from_lol(['hr']),
  $good_td->content_list,
);
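Because new_from_lol( ) returns one element per list reference when called in list context, the preface nodes could also be built in a single call. The sketch below assumes a made-up value for $orig_url and simply prints the nodes instead of calling put_into_template( ):

```perl
use strict;
use warnings;
use HTML::Element;

# Hypothetical value; the real one would be set elsewhere in the program.
my $orig_url = 'http://www.example.com/story.html';

# List context: two list references in, two elements out.
my @preface = HTML::Element->new_from_lol(
  ['p', "The original version of the following story is to be found at:",
    ['br'],
    ['a', {'href' => $orig_url}, $orig_url],
  ],
  ['hr'],
);

# Dump each new node to check what was built.
print $_->as_HTML, "\n" for @preface;
```

The resulting @preface list could then be passed to put_into_template( ) ahead of $good_td->content_list.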

If you find new_from_lol( ) notation to be an unnecessary elaboration, you can still manually construct each element with HTML::Element->new and link them up before passing them to put_into_template( ). Or you could just as well create a ~literal pseudoelement containing the raw source:

put_into_template(
  HTML::Element->new('~literal', 'text' => qq{
      <p>The original version of the following story is to be found at:
      <br><a href="$orig_url">$orig_url</a></p>
      <hr>
  }),
  $good_td->content_list,
);

While the new_from_lol( ) syntax is an expressive shorthand for the general form of element construction, you may well prefer the directness of creating a single ~literal or the simplicity of normal ->new calls. As the Perl saying goes, there is more than one way to do it.
