Objects of the `HTML::Parser' class will recognize markup and
separate it from plain text (also called data content) in HTML
documents. As different kinds of markup and text are recognized, the
corresponding event handlers are invoked.
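
The event model described above can be sketched as follows. This is a minimal illustration, not a complete program: it assumes the version 3 API, and the variable names and sample markup are made up for the example. It registers one handler for start tags and one for text, then feeds the parser a small document.

```perl
use strict;
use warnings;
use HTML::Parser ();

my @tags;        # names of the start tags seen (illustrative)
my $text = '';   # concatenated text content (illustrative)

my $p = HTML::Parser->new(
    api_version => 3,
    # Called for every start tag; receives the lower-cased tag name.
    start_h => [ sub { push @tags, shift }, 'tagname' ],
    # Called for every run of text; 'dtext' means entities are decoded.
    text_h  => [ sub { $text .= shift }, 'dtext' ],
);

$p->parse('<p>Hello, <b>world</b>!</p>');
$p->eof;   # signal end of document so buffered text is flushed

print "tags: @tags\n";
print "text: $text\n";
```

Events for which no handler is registered (end tags, in this sketch) are simply ignored.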
`HTML::Parser' is not a generic SGML parser. We have tried to
make it able to deal with the HTML that is actually "out there", and
it normally parses as closely as possible to the way the popular web
browsers do it instead of strictly following one of the many HTML
specifications from W3C. Where there is disagreement, there is often
an option that you can enable to get the official behaviour.
The document to be parsed may be supplied in arbitrary chunks. This
makes on-the-fly parsing as documents are received from the network
possible.
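
As a rough sketch of chunked input, the example below splits a single tag across several calls to `parse'. The chunk boundaries, the URL and the variable names are invented for illustration; in a real program the chunks would come from a socket or a file read.

```perl
use strict;
use warnings;
use HTML::Parser ();

my @starts;   # one entry per start tag seen (illustrative)

my $p = HTML::Parser->new(
    api_version => 3,
    # 'attr' passes a hash reference of the tag's attributes.
    start_h => [
        sub {
            my ($tag, $attr) = @_;
            push @starts, "$tag href=$attr->{href}";
        },
        'tagname, attr',
    ],
);

# The start tag is deliberately broken across chunk boundaries.
# The parser buffers the incomplete markup until the rest arrives.
for my $chunk ('<a hr', 'ef="http://exam', 'ple.com/">li', 'nk</a>') {
    $p->parse($chunk);
}
$p->eof;   # no more input

print "$_\n" for @starts;
```

The handler fires only once the whole tag has been seen, so the caller does not need to align chunks with markup boundaries.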
If event driven parsing does not feel right for your application, you
might want to use `HTML::PullParser'. This is an `HTML::Parser'
subclass that allows a more conventional program structure.
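
For comparison, a pull-style loop might look like the sketch below. The sample document and the variable names are assumptions made for the example; the argspec strings select which values each token carries.

```perl
use strict;
use warnings;
use HTML::PullParser ();

my $html = '<h1>Title</h1><p>Body</p>';   # sample input (illustrative)

my $p = HTML::PullParser->new(
    doc   => \$html,
    # Report start tags and text; other events are not requested.
    start => 'event, tagname',
    text  => 'event, dtext',
) or die "Cannot create parser: $!";

my @seen;
# The program asks for the next token when it is ready for one,
# instead of being called back by the parser.
while (my $token = $p->get_token) {
    my ($event, $value) = @$token;
    push @seen, "$event:$value";
}

print "$_\n" for @seen;
```

Here the control flow stays in the caller's loop, which is often easier to combine with ordinary program state than a set of callbacks.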