HTTP and HTML
You have probably heard this story many times, but no account of the World Wide Web is complete without it. It is the story of Tim Berners-Lee, an engineer at CERN, the European Organization for Nuclear Research. The centre had many computers that were, of course, hooked up to the Internet. It also produced a tremendous amount of data and documents, which became almost impossible to manage. Tim worked out a solution by developing a language to write these documents in, a protocol on top of the Internet to manage them, and a computer program for users to access them.
HTML
HTML, short for Hypertext Markup Language, is the name of that language. Hypertext is text that contains hyperlinks: parts of a document that readers can click on to be taken to a different document. You have all seen the blue underlined parts of a text in not-so-good-looking web pages. These are hyperlinks.
A document in HTML consists of tags, with text in between them. There are opening and closing tags, for example:
<h1>Hello, world</h1>
Here, <h1> is the opening tag and </h1> the closing one. We will learn about a similar markup language: XML. HTML and XML are not the same, though. One important difference is that in XML you can define your own tags, as long as you close each one you've opened. XML is used to transfer data, and its tags are used to organize that data.
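The rule that XML lets you invent tags, provided each one is closed, can be tried out with Python's standard library. The following is only a minimal sketch: the tag names (book, title, year) and their contents are made up for illustration and do not come from any real document.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: the tag names <book>, <title> and <year>
# are invented for this example; XML lets the author choose them.
well_formed = "<book><title>Hello, world</title><year>1991</year></book>"

root = ET.fromstring(well_formed)
print(root.tag)                 # name of the outermost tag
print(root.find("title").text)  # text between <title> and </title>

# The same document with the closing </title> tag left out.
broken = "<book><title>Hello, world<year>1991</year></book>"

try:
    ET.fromstring(broken)
    parsed_ok = True
except ET.ParseError as err:
    # The parser rejects a document in which an opened tag is never closed.
    parsed_ok = False
    print("not well-formed:", err)
```

The first document parses because every opening tag has a matching closing tag; the second is rejected because </title> is missing.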
In HTML, tags do have a specific meaning. <h1> would be used in a document for the text of a level-one header. An <a> tag (the anchor tag) is the one used to include the hyperlinks we just discussed. So the purpose of writing HTML is not to transfer data, but to present it to human users.
To do so, these tags are interpreted by the computer program we mentioned earlier. Such a program is called a browser. When the reader clicks on a hyperlink, the browser detects that as well and sends a request to yet another program, the web server, to fetch another document.
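To get a feel for how a program interprets tags, here is a small sketch that uses Python's built-in html.parser module to find the hyperlinks in a page. It is not how a real browser is built; the class name LinkCollector, the page content, and the link target secondpage.html are all hypothetical and chosen only for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the target of every <a href="..."> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called once per opening tag; attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# Hypothetical page with one level-one header and one hyperlink.
page = (
    "<h1>Hello, world</h1>"
    '<p>Read the <a href="secondpage.html">second page</a>.</p>'
)

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

The parser calls handle_starttag for every opening tag, so the <h1> tag is seen too; the sketch simply ignores everything except anchor tags.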
HTTP
This is where HTTP, the Hypertext Transfer Protocol, fits in. If a user clicks on a link, it is like saying: go fetch another HTML document. The name of that document would be part of a longer string that starts with http:// and also contains the domain name of the server. It is called a uniform resource locator, but we all refer to it as a URL. The following is an example: http://www.paulpwellens.com/examples/secondpage.html.
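The parts of the example URL can be pulled apart with Python's urllib.parse module. The sketch below also assembles, as plain text, roughly what a request for that document could look like. It assumes HTTP version 1.1 and the smallest set of headers that version requires; it does not contact any server, and a real browser sends more headers than this.

```python
from urllib.parse import urlsplit

url = "http://www.paulpwellens.com/examples/secondpage.html"
parts = urlsplit(url)

print(parts.scheme)    # the protocol named at the start of the URL
print(parts.hostname)  # the domain name of the server
print(parts.path)      # the name of the document on that server

# Text of a minimal GET request for that document.
# Assumption: HTTP/1.1, which requires the Host header.
request = (
    f"GET {parts.path} HTTP/1.1\r\n"
    f"Host: {parts.hostname}\r\n"
    "\r\n"
)
print(request)
```

The domain name tells the browser which server to contact, and the path tells that server which document to return.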
What you can do with HTTP has evolved over time, and we will learn about it later in the book, but for now we need to move on with our history lesson. One more little tidbit of history for you: guess what our friend Tim called his browser, the first browser ever? WorldWideWeb. He later renamed it to avoid confusion.
The World Wide Web Consortium (W3C)
After he left CERN in 1994, Tim Berners-Lee founded the World Wide Web Consortium (W3C). The consortium works toward compatibility and agreement between vendors that deliver components for the web. Incompatible versions of HTML would cause browsers to render web pages differently, and incompatible features added to browsers have the same effect.
If you visit www.w3.org, the consortium's website, you will notice that the W3C has evolved into a standards body for many technologies, but even in those days, such an organization was sorely needed.