The World Wide Web is undergoing a radical change that will introduce wonderful services for users and amazing new opportunities for Web site developers and businesses.
HTML - the HyperText Markup Language - made the Web the world's library. Now its sibling, XML - the Extensible Markup Language - has begun to make the Web the world's commercial and financial hub. XML has just been approved as a W3C Recommendation, and already there are millions of XML files out there, with more coming online every day.
You can see why by comparing XML and HTML. Both are based on SGML -- the International Standard for structured information -- but look at the difference:
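In HTML:
<p>P200 Laptop
<br>Friendly Computer Shop
<br>$1438</p>
In XML:
<product>
<model>P200 Laptop</model>
<dealer>Friendly Computer Shop</dealer>
<price>$1438</price>
</product>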
Both of these may look the same in your browser, but the XML data is smart data. HTML tells you how the data should look, but XML tells you what it means.
With XML, your browser knows there is a product, and it knows the model, dealer, and price. From a group of these it can show you the cheapest product or closest dealer without going back to the server. Unlike HTML, with XML you create your own tags, so they describe exactly what you need to know. Because of that, your client-side applications can access data sources anywhere on the Web, in any format. New "middle-tier" servers sit between the data sources and the client, translating everything into your own task-specific XML.
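To make the "cheapest product" idea concrete, here is a minimal sketch in Python of what a client could do once it holds a group of product records. The element names follow the description above (product, model, dealer, price); the enclosing `catalog` element, the second dealer, and its price are hypothetical values invented for illustration.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical data: the <catalog> wrapper and the second product are
# assumptions made for this example only.
CATALOG = """\
<catalog>
  <product>
    <model>P200 Laptop</model>
    <dealer>Friendly Computer Shop</dealer>
    <price>$1438</price>
  </product>
  <product>
    <model>P200 Laptop</model>
    <dealer>Example Discount Store</dealer>
    <price>$1399</price>
  </product>
</catalog>
"""


def parse_products(xml_text):
    """Turn each <product> element into a dict with a numeric price."""
    root = ET.fromstring(xml_text)
    products = []
    for node in root.iter("product"):
        products.append({
            "model": node.findtext("model"),
            "dealer": node.findtext("dealer"),
            # Strip the leading currency sign so prices compare as numbers.
            "price": Decimal(node.findtext("price").lstrip("$")),
        })
    return products


def cheapest(products):
    """Return the product with the lowest price."""
    return min(products, key=lambda p: p["price"])


best = cheapest(parse_products(CATALOG))
print(best["dealer"], best["price"])
```

Because the tags say which text is a price and which is a dealer, the comparison runs entirely on the client; no further request to the server is needed.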
But XML data isn't just smart data, it's also a smart document. That means when you display the information, the model name can be a different font from the dealer name, and the lowest price can be highlighted in green. Unlike HTML, where text is just text to be rendered in a uniform way, with XML text is smart, so it can control the rendition.
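One way to picture this is a small Python sketch that chooses a rendition from what each piece of text means. The specific fonts and the use of inline HTML styling are assumptions chosen for illustration, not a description of how any particular browser or stylesheet language works.

```python
from html import escape

# Hypothetical style choices, keyed by the meaning of the text.
STYLES = {
    "model": "font-family: monospace",
    "dealer": "font-family: serif",
}


def span(text, style):
    """Wrap text in a styled span, escaping any markup characters."""
    return '<span style="{}">{}</span>'.format(style, escape(text))


def render(products):
    """Render each product as one line, highlighting the lowest price."""
    lowest = min(p["price"] for p in products)
    lines = []
    for p in products:
        price_style = "color: green" if p["price"] == lowest else "color: black"
        lines.append(" ".join([
            span(p["model"], STYLES["model"]),
            span(p["dealer"], STYLES["dealer"]),
            span("$" + str(p["price"]), price_style),
        ]))
    return lines


# The second product is invented for the example.
sample = [
    {"model": "P200 Laptop", "dealer": "Friendly Computer Shop", "price": 1438},
    {"model": "P200 Laptop", "dealer": "Example Discount Store", "price": 1399},
]
for line in render(sample):
    print(line)
```

The same records that fed a price comparison also drive the display, which is the sense in which the information is data and document at once.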
And you don't have to decide whether your information is data or documents; in XML, it is always both at once. You can do data processing or document processing or both at the same time.
With that kind of flexibility, it's no wonder that we're starting to see a Brave New Web of smart, structured information. Your broker sends your account data to Quicken using XML. Your "push" technology channel definitions are in XML. Everything from math to multimedia, chemistry to CommerceNet, is using XML or is preparing to start.
You should be too!
Welcome to the Brave New XML Web.
What about SGML?
This book is about XML. You won't find feature comparisons to SGML, or footnotes with nerdy observations like "the XML empty-element tag does not contradict the rule that every element has a start-tag and an end-tag because, in SGML terms, it is actually a start-tag followed immediately by a null end-tag".
Nevertheless, for readers who use SGML, it is worth addressing the question of how XML and SGML relate. There has been a lot of speculation about this.
Some claim that XML will replace SGML because there will be so much free and low-cost software. Others assert that XML users, like HTML users before them, will discover that they need more of SGML and will eventually migrate to the full standard.
Both assertions are nonsense ... XML and SGML don't even compete.
XML is a simplified subset of SGML. The subsetting was optimized for the Web environment, which implies data-processing-oriented (rather than publishing-oriented), short life-span (in fact, usually dynamically-generated) information. The vast majority of XML documents will be created by computer programs and processed by other programs, then destroyed. Humans will never see them.
Eliot Kimber, a member of both the XML and SGML standards committees, says:
There are certain use domains for which XML is simply not sufficient and where you need the additional features of SGML. These applications tend to be very large scale and of long term; e.g., aircraft maintenance information, government regulations, power plant documentation, etc.
Any one of them might involve a larger volume of information than the entire use of XML on the Web. A single model of commercial aircraft, for example, requires some four million unique pages of documentation that must be revised and republished quarterly. Multiply that by the number of models produced by companies like Airbus and Boeing and you get a feel for the scale involved.
I invented SGML, I'm proud of it, and I'm awed that such a staggering volume of the world's mission-critical information is represented in it.
I'm also proud of XML. I'm proud of my friend Jon Bosak who made it happen, and I'm excited that the World Wide Web is becoming XML-based.
If you are new to XML, don't worry about any of this. All you need to know is that the XML subset of SGML has been in use for a decade or more, so you can trust it.
I am writing this the day after a meeting of the ISO committee that develops the SGML standard. We had the largest attendance in our 20-year history at that meeting. Interest in SGML has never been higher. You should share that interest if you produce documents on the scale of an Airbus or Boeing. For the rest of us, there's XML.
About our sponsors:
With all the buzz surrounding a hot technology like XML, it can be tough for a newcomer to distinguish the solid projects and realistic applications from the fluff and the fantasies. Our solution was to seek out companies with real products and realistic applications and tell their stories in sufficient detail that readers can see for themselves what is believable.
The application chapters are about what can be done with XML, extrapolating from actual experience with one or more users or prototype implementations. The case studies describe the XML experiences of specific named enterprises.
Some applications and case studies were done with full SGML before XML had a formal existence, but are within XML's capabilities. These are described as having been done with XML. Part of the proof of XML's viability is that people have used its core functions for over a decade.
The primary purpose of the tool chapters is to provide the vicarious experience of using a variety of XML tools without the effort of obtaining evaluation copies and installing them. They also provide useful information about uses and benefits of XML in general, which supplements the application-oriented discussions in the earlier parts of the book.
There are also two sponsored chapters on new XML-related technologies.
All sponsored chapters are identified with the name of the sponsor, and sometimes with the names of the experts who prepared the original text. All of the chapters were edited by me, sometimes extensively, in order to integrate them into the book. The editing objectives were to establish consistency of terminology and style, and to eliminate unnecessary duplication among the chapters. I believe the result was faithful to the intentions of the expert preparers with regard to bringing out the important characteristics of their applications and products.
The sponsorship program was organized by Linda Burman, the president of L. A. Burman Associates, a consulting company that provides marketing and business development services to the XML and SGML industries.
We are grateful to our sponsors just as we are grateful to you, our readers. Both of you together make it possible for the XML Handbook to exist. In the interests of everyone, we make our own editorial decisions and we don't recommend or endorse any product or service offerings over any others.
Our fourteen sponsors are:
Adobe Systems Incorporated, http://www.adobe.com
ArborText, Inc., http://www.arbortext.com
C