This is the first installment in a series of blog posts on HTML, XML and XHTML: what they are, and why more and more people are moving from HTML to XHTML when producing web content.
This first article looks at the history of these markup languages.
In the beginning…
HTML and the Web have grown up together. Originally, HTML was a simple markup language, defined using SGML, for creating documents that referenced each other. Hyperlinks made these references easy to follow: readers no longer needed to go back to a catalog to look up a referenced article.
As the Web grew in reach and popularity, browser manufacturers working in competition added more and more features to HTML (such as fonts, tables and forms), delivering exciting new capabilities that worked in only one browser. Designing web sites became more challenging, more fun and more frustrating, all at the same time.
Then came the standards…
As time moved on, it became clear that this way of working was hindering, rather than helping, the development of the Web. The W3C decided to increase its involvement, and its HTML 4 and CSS 1 “recommendations” became the first standards to lead, rather than follow, the technology.
Defining a clear and clean “standard” way of expressing documents, these two standards are the core of what HTML content on the Web should be today. Unfortunately, the browsers have generally failed to fully adopt the W3C’s recommendations.
Then came XML…
HTML is an SGML application designed specifically for web page content. While it can be used to describe “anything” to human readers, its fixed document structure keeps that content from being readily comprehensible to machines. The markup only tells browsers the relative significance of different pieces of text; it cannot express more complex structures, such as the properties of different food products, including their nutrition information.
XML was created to allow computer-comprehensible descriptions of artifacts. Because it shares HTML’s SGML heritage, XML appears superficially similar. But while the rules of HTML fix the exact vocabulary of concepts that can be conveyed, XML was designed to be extensible: the set of concepts that can be directly conveyed through computer-comprehensible markup is unlimited. As XML evolved, interoperable schemas were introduced, which allowed a single XML document to incorporate multiple distinct content types, all machine-readable, all machine-verifiable.
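To make the food-product idea concrete, here is a minimal sketch of what such a machine-readable description might look like, and how a program could read it. The element names, attributes and namespace URI are invented for illustration; they do not come from any real food-labelling vocabulary. The sketch uses Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the element names and the namespace URI below
# are made up for this example, not taken from a real standard.
DOC = """\
<product xmlns="http://example.com/ns/food" name="Oat biscuits">
  <nutrition per="100g">
    <energy unit="kcal">450</energy>
    <fat unit="g">18.5</fat>
    <sugar unit="g">22</sugar>
  </nutrition>
</product>
"""

NS = {"f": "http://example.com/ns/food"}


def read_nutrition(xml_text):
    """Return {nutrient: (amount, unit)} for the product in xml_text."""
    root = ET.fromstring(xml_text)
    nutrition = root.find("f:nutrition", NS)
    facts = {}
    for item in nutrition:
        # item.tag looks like "{namespace}energy"; keep only the local name.
        local_name = item.tag.split("}", 1)[1]
        facts[local_name] = (float(item.text), item.get("unit"))
    return facts


facts = read_nutrition(DOC)
```

The point is that the program never has to guess which number is the fat content: the markup says so. Adding a new nutrient means adding a new element, with no change to the reading code.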
And XHTML…
With XML came the notion of XML languages for describing different types of content. One of the first to be adopted was a language for hypertext documents: XHTML. With the same semantics as HTML, XHTML documents look similar to their HTML relatives, but nevertheless carry some key advantages.
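One advantage worth illustrating is that an XHTML page is also an XML document, so general-purpose XML tools can process it. The sketch below is an assumption about how that might look in practice rather than a recommended toolchain: it parses a small, made-up XHTML page with Python's standard XML parser and collects its hyperlinks. The page content and the function name are hypothetical; the namespace URI is the real XHTML one.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"  # the XHTML namespace URI

# A small, invented XHTML page used only for this example.
PAGE = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Reading list</title></head>
  <body>
    <p>See <a href="history.html">the history</a> and
       <a href="syntax.html">the syntax notes</a>.</p>
  </body>
</html>
"""


def extract_links(xhtml_text):
    """Return (href, link text) pairs for every <a> element in the page."""
    root = ET.fromstring(xhtml_text)
    # Namespaced tags are written as "{namespace}localname" in ElementTree.
    anchor_tag = "{%s}a" % XHTML_NS
    return [(a.get("href"), a.text) for a in root.iter(anchor_tag)]


links = extract_links(PAGE)
```

No HTML-specific parser is involved here: the same few lines would work on any XML document that uses the XHTML namespace.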
Making the Switch
It’s generally considered relatively easy to switch from HTML to XHTML, since the differences are largely syntactic: element and attribute names must be lowercase, attribute values must be quoted, and every element must be explicitly closed. Some issues go deeper than syntax, though, and making all those changes, particularly the ones that are completely incompatible, can be a pain.
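A rough way to see the syntactic side of the switch is to feed markup to an XML parser and check whether it is well-formed. The sketch below does that with Python's standard library; the two fragments and the function name are made up for illustration. Note that well-formedness is only a first hurdle: a well-formed document is not necessarily valid XHTML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if markup parses as XML, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Typical old-style HTML: unclosed <BR>, unquoted attribute value.
HTML_STYLE = "<P>Line one<BR><IMG SRC=logo.png></P>"

# The same content with XHTML syntax: lowercase names, quoted attribute
# values, and empty elements closed with " />".
XHTML_STYLE = '<p>Line one<br /><img src="logo.png" alt="Logo" /></p>'

html_ok = is_well_formed(HTML_STYLE)
xhtml_ok = is_well_formed(XHTML_STYLE)
```

Here html_ok comes out False and xhtml_ok comes out True. A check like this catches unclosed elements and unquoted attributes, but not deeper issues such as elements that XHTML no longer allows.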
Using Aggiorno
Aggiorno is a software product that helps you make the switch from HTML to XHTML quickly by automatically applying the right changes for you. Working from within the Visual Studio environment, a single action can operate on multiple legacy files at once, fixing errors and removing outdated constructs while converting everything to standards-compliant XHTML.