A new standard is promising to smooth the transition from HTML to XML. XHTML is a blending of HTML and XML. The first iteration, XHTML Version 1.0, lets Web sites migrate to XML while allowing their content to remain visible in old browsers.
XHTML redraws the lines blurred by browser makers. HTML is once again meant to be a structural language - tags denote headings and where paragraphs begin and end. Style sheets handle presentation: the placement of items on a page, font and color.
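As a rough illustration of that split, the sketch below treats a small XHTML page as ordinary XML and pulls out its headings and paragraphs. The sample page, the `site.css` file name and the helper names `outline` and `is_well_formed` are assumptions made up for this example. The parser is Python's standard `xml.etree.ElementTree`, not anything supplied by the W3C.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page: the markup carries structure only, while
# presentation is left to an external style sheet (site.css is made up).
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <title>Example</title>
    <link rel="stylesheet" type="text/css" href="site.css" />
  </head>
  <body>
    <h1>Main heading</h1>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>"""


def outline(xhtml_text):
    """Return (headings, paragraphs) found in a well-formed XHTML string."""
    root = ET.fromstring(xhtml_text)
    ns = {"x": XHTML_NS}
    headings = [h.text for h in root.findall(".//x:h1", ns)]
    paragraphs = [p.text for p in root.findall(".//x:p", ns)]
    return headings, paragraphs


def is_well_formed(text):
    """True if the text parses as XML, which XHTML is required to do."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


headings, paragraphs = outline(PAGE)
print(headings)
print(paragraphs)
```

Because the page is well-formed XML, any XML tool can read its structure. Markup that older HTML browsers tolerate, such as an unclosed `<br>`, is rejected by `is_well_formed`.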
Senior writer Mathew Schwartz talked with Steven Pemberton, chairman of the World Wide Web Consortium's (W3C) HTML working group, the developer of XHTML, about the latest developments. Pemberton is also a researcher at the National Research Institute for Mathematics and Computer Science in the Netherlands.
Q: What was the impetus behind creating XHTML?
A: We have to see why it was created. Back when Microsoft entered the browser wars, they started creating their own tags in HTML, just as Netscape had been. Now you had diverging HTML languages. To end those battles, the W3C decided to create a language where it was perfectly OK to invent tags and thus give the greatest flexibility to people who wanted to create Web content. That was XML. With style sheets (cascading style sheets and Extensible Stylesheet Language), you don't need HTML anymore.
Q: Why have HTML or XHTML at all? Why not just go to XML?
A: We had a workshop in San Francisco in May 1998 to see if the industry still had a need for HTML. First, even though you have the freedom to use any tag in your pages, people didn't want to have to invent new languages; they were happy to stick to standards and not invent their own or maybe just add a little. Second, HTML does have some advantages in that you and search engines know what an <h1> tag is - that it is the most important heading on a page - and search engines can do a better job of searching an HTML page because they know what the tags mean. So, people wanted HTML that would partially combine with XML. There's a lot of advantage in going to XML, because it makes translations from your database to viewing something on the Web easier, for instance, and it can aid device independence.
Q: Are XHTML 1.0 pages larger than HTML pages?
A: Mostly smaller. We've recoded major sites using style sheets and markup methods, and the sites were smaller. That's because if you do XHTML, the style sheets often only have to be downloaded once for your whole site, and you don't need GIFs to represent text.
Q: Do browser makers intentionally subvert standards?
A: I think it's a learning curve for everyone. I've noticed problems as well - XML gets defined in a certain way, but then how does a reader interpret the definition? There can be confusion; it often takes a few iterations to get it right, especially with Java, cascading style sheets and plug-ins. My hope is browser makers don't say, "Sorry, we've already built products around these mistakes, we don't want to change it now."
Q: Will XHTML work on older browsers?
A: Some. In the short term, people have to do browser sniffing (version detection) if they want to use the fancy stuff in [cascading style sheets]. A lot of sites have to do browser sniffing anyway, because Netscape and Internet Explorer have different implementations of JavaScript. There are work-arounds now, but hopefully they won't be necessary in the future.
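For readers unfamiliar with browser sniffing, here is a minimal sketch of the idea, written as a server-side check on the User-Agent string rather than the in-page JavaScript many sites used. The function names, the `full.css` and `basic.css` file names and the version cutoff are all assumptions for illustration. Treating any non-MSIE "Mozilla/" string as Netscape is a simplification that only roughly fits browsers of this period.

```python
import re


def sniff(user_agent):
    """Guess (family, major_version) from a User-Agent string."""
    # Internet Explorer identifies itself with an "MSIE <version>" token.
    msie = re.search(r"MSIE (\d+)", user_agent)
    if msie:
        return ("Internet Explorer", int(msie.group(1)))
    # Assumption: a leading "Mozilla/<version>" without MSIE is Netscape.
    mozilla = re.match(r"Mozilla/(\d+)", user_agent)
    if mozilla:
        return ("Netscape", int(mozilla.group(1)))
    return ("unknown", 0)


def stylesheet_for(user_agent):
    """Pick a style sheet name; the cutoff at version 5 is hypothetical."""
    family, major = sniff(user_agent)
    return "full.css" if major >= 5 else "basic.css"


print(sniff("Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)"))
print(stylesheet_for("Mozilla/4.7 [en] (Win98; I)"))
```

The sketch also shows why sniffing is fragile: it relies on string conventions that the browser makers control, not on any standard.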