In this article, Martin Bond discusses XML and its associated APIs and standards, and how XML can be used to create flexible structured data that is inherently portable. This excerpt is from chapter (Day) 16 of Teach Yourself J2EE in 21 Days, second edition, by Martin Bond et al. (Sams, ISBN: 0672325586).
In this section, you will explore the syntax of XML and understand what is meant by a well-formed document.
Note - You will often encounter the terms "well-formed" and "valid" applied to XML documents. These are not the same. A well-formed document is structurally and syntactically correct: the XML conforms to the XML language definition, meaning that every start tag has a corresponding, correctly nested end tag, all attribute values are quoted, only legal characters have been used, and so on. A valid document is well-formed and is also semantically correct: the XML conforms to an external definition stored in an XML Schema or Document Type Definition (DTD). A document can therefore be well-formed without being valid, but a valid document is always well-formed.
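The distinction in the note can be illustrated with a short Java sketch that uses the standard JAXP parser shipped with the JDK. The class name, the `problems` helper, and the sample document are hypothetical and chosen only for illustration. The sample is well-formed, but its inline DTD allows only a `<title>` inside `<book>`, so it is not valid.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedVersusValid {

    // Hypothetical sample: well-formed, but the inline DTD only allows
    // a <title> child inside <book>, so the <author> element is invalid.
    static final String SAMPLE =
          "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE book [\n"
        + "  <!ELEMENT book (title)>\n"
        + "  <!ELEMENT title (#PCDATA)>\n"
        + "]>\n"
        + "<book><author>Martin Bond</author></book>";

    // Returns the problems the parser reports; an empty list means none.
    static List<String> problems(String xml, boolean validate) {
        final List<String> found = new ArrayList<String>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(validate); // true = also check against the DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    found.add(e.getMessage()); // validity errors arrive here
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            found.add(e.getMessage()); // well-formedness errors are fatal
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println("Well-formed: " + problems(SAMPLE, false).isEmpty());
        System.out.println("Valid:       " + problems(SAMPLE, true).isEmpty());
    }
}
```

With validation switched off, the parser accepts the sample because it is well-formed. With validation switched on, the same document produces errors because `<author>` is not declared in the DTD.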
The best way to become familiar with the syntax of XML is to write an XML document. To check your XML, you will need access to an XML-aware browser or another XML validator. Either tool allows you to ensure that the XML is well-formed. If the XML references an XML Schema or Document Type Definition (more on these later), the validator can also check that the XML is valid.
An XML browser includes an XML parser. To get the browser to check the syntax and structure of your XML document, simply use the browser to open the XML file. Well-formed XML will be displayed in a structured way (with indentation). If the XML is not well-formed, an appropriate error message will be given.
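If no XML-aware browser is at hand, a similar check can be approximated in Java with the JAXP parser. The sketch below is an assumption about how such a check might look, not the browser's actual mechanism; the class name and the `check` method are hypothetical. It reports the line and column of the first well-formedness error, much as a browser's error message would.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class WellFormedReport {

    // Returns "well-formed", or a message locating the first fatal error.
    static String check(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return "well-formed";
        } catch (SAXParseException e) {
            return "not well-formed at line " + e.getLineNumber()
                 + ", column " + e.getColumnNumber() + ": " + e.getMessage();
        } catch (SAXException | ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(check("<note><to>Reader</to></note>"));
        System.out.println(check("<note priority=high/>")); // unquoted attribute
    }
}
```

To check a file rather than a string, `DocumentBuilder.parse` also accepts a `java.io.File`.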
At first glance, XML looks very similar to HTML. An XML document consists of elements that have a start and end tag, just like HTML. In fact, Listing 16.1 is both well-formed HTML and XML.
Listing 16.1 Example XML and HTML
<html>
<head><title>Web Page</title></head>
<body>
<h1>Teach Yourself J2EE in 21 Days</h1>
<p>Now you have seen the web page – buy the book</p>
</body>
</html>
An XML document is well-formed only if it contains no syntax errors. If you are familiar with HTML, you will be aware that many browsers are lenient with poorly formed HTML documents. Missing end tags, and even missing sections, are often ignored, so the problem goes unnoticed until the page is opened in a more rigorous browser and fails to display correctly.
XML differs from HTML in that a missing end tag will always cause an error.
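The strictness described above can be observed with a small Java sketch that parses the text of Listing 16.1 and then a copy with the `</p>` end tag removed. The class name and the `parses` helper are hypothetical; the parser is the standard JAXP one included with the JDK.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class MissingEndTagDemo {

    // The text of Listing 16.1 (\u2013 is the dash in the paragraph).
    static final String LISTING =
          "<html>\n"
        + "<head><title>Web Page</title></head>\n"
        + "<body>\n"
        + "<h1>Teach Yourself J2EE in 21 Days</h1>\n"
        + "<p>Now you have seen the web page \u2013 buy the book</p>\n"
        + "</body>\n"
        + "</html>";

    // True if the XML parser accepts the text as well-formed.
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler()); // rethrow, do not print
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // any well-formedness error ends up here
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String broken = LISTING.replace("</p>", ""); // drop one end tag
        System.out.println("Listing 16.1 parses: " + parses(LISTING));
        System.out.println("Without </p> parses: " + parses(broken));
    }
}
```

The first call succeeds and the second fails, whereas a lenient HTML browser would typically display both versions of the page.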
We will now look at XML syntax so you can understand what is going on.
This chapter is from Teach Yourself J2EE in 21 Days, second edition, by Martin Bond et al. (Sams, 2004, ISBN: 0-672-32558-6).