Parsing HTML Data
An HTML document looks something like the following, but usually with a lot more content:
<!doctype html>
<html lang="en">
    <head>
        <title>Example Document</title>
    </head>
    <body>
        <p>A man, a plan, a canal. Panama.</p>
    </body>
</html>
HTML structures a document into a tree-like format, as shown in this example by indentation. The <head> element appears inside the <html> element. The <title> element appears inside the <head> element. An HTML document can have many levels of hierarchy.
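To make the tree structure concrete, here is a minimal sketch that prints each element of the sample document indented by its depth. It relies on two assumptions that are not part of the original text. First, it uses the JDK's built-in XML parser, which works here only because this small sample happens to be well-formed; the doctype line is left out because the XML parser is stricter than an HTML parser, and real-world HTML generally needs a dedicated HTML parsing library instead. Second, the class name HtmlTreeSketch and the methods outline and walk are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not from the original article.
public class HtmlTreeSketch {

    // Parses well-formed markup and returns one line per element,
    // indented two spaces for each level of nesting.
    // Assumption: the input is well-formed enough for an XML parser.
    public static String outline(String markup) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(markup)));
        StringBuilder out = new StringBuilder();
        walk(doc.getDocumentElement(), 0, out);
        return out.toString();
    }

    // Appends this element's tag name, then recurses into child elements only
    // (text nodes such as the paragraph content are skipped).
    private static void walk(Element element, int depth, StringBuilder out) {
        out.append("  ".repeat(depth)).append(element.getTagName()).append('\n');
        for (Node child = element.getFirstChild();
                child != null;
                child = child.getNextSibling()) {
            if (child instanceof Element) {
                walk((Element) child, depth + 1, out);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // The sample document from above, minus the doctype line.
        String markup = "<html lang=\"en\">"
                + "<head><title>Example Document</title></head>"
                + "<body><p>A man, a plan, a canal. Panama.</p></body>"
                + "</html>";
        System.out.print(outline(markup));
    }
}
```

Running main prints html at the left margin, head and body one level in, and title and p one level deeper, mirroring the indentation of the sample document.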
Note
Most web browsers provide an option to view a page's source. Select that and you'll see the HTML for the page.
When you run a GET request from a Java application, you need...