In this article, Martin Bond discusses XML and its associated APIs and standards, and how XML can be used to create flexible structured data that is inherently portable. This excerpt is from chapter (Day) 16 of Teach Yourself J2EE in 21 Days, second edition, by Martin Bond et al. (Sams, ISBN: 0672325586).
When designers define an XML structure for some data, they are free to choose tag names that are appropriate for the data. Consequently, there is nothing to stop two individuals from using the same tag name for different purposes or in different ways. Consider the job agency that deals with two contract companies, each of which uses a different form of job description (such as those in Listings 16.3 and 16.4). How can an application differentiate between these different types of job descriptions?
The answer is to use namespaces. XML provides namespaces that can be used to impose a hierarchical structure on XML tag names in the same way that Java packages provide a naming hierarchy for Java classes. You can define a unique namespace with which you can qualify your tags so that they are not confused with those from other XML authors.
An attribute called xmlns (XML Namespace) is added to an element tag in a document and is used to define the namespace. For example, the second line in Listing 16.5 indicates that the tags for the whole of this document are scoped within the agency namespace.
Listing 16.5 XML Document with Namespace
<?xml version="1.0"?>
<jobSummary xmlns="agency">
  <job customer="winston" reference="Cigar Trimmer">
    <location>London</location>
    <description>Must like to talk and smoke</description>
    <skill>Cigar maker</skill>
    <skill>Critic</skill>
  </job>
  <job customer="george" reference="Tree pruner">
    <location>Washington</location>
    <description>Must be honest</description>
    <skill>Tree surgeon</skill>
  </job>
</jobSummary>
The xmlns attribute can be added to any element in the document to enable scoping of elements, and multiple namespaces can be defined in the same document by binding each one to a prefix. For example, Listing 16.6 declares two namespaces, ADAgency and BEAgency, bound to the prefixes ad and be. All the tags inside the jobSummary root element have been prefixed with the appropriate namespace prefix, and now two different forms of the job tag (one with attributes and one without) can coexist in the same file.
Listing 16.6 XML Document with Namespaces
<?xml version="1.0"?>
<jobSummary xmlns:ad="ADAgency" xmlns:be="BEAgency">
  <ad:job customer="winston" reference="Cigar Trimmer">
    <ad:location>London</ad:location>
    <ad:description>Must like to talk and smoke</ad:description>
    <ad:skill>Cigar maker</ad:skill>
    <ad:skill>Critic</ad:skill>
  </ad:job>
  <be:job>
    <be:customer>george</be:customer>
    <be:reference>Tree pruner</be:reference>
    <be:location>Washington</be:location>
    <be:description>Must be honest</be:description>
    <be:skill>Tree surgeon</be:skill>
  </be:job>
</jobSummary>
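Because xmlns can appear on any element, the same separation can also be sketched without prefixes by declaring a default namespace on each job element. The example below is an illustration only: the namespace names http://example.com/ADAgency and http://example.com/BEAgency are hypothetical, chosen because namespace names are normally written as URIs.

```xml
<?xml version="1.0"?>
<jobSummary>
  <!-- The default namespace declared on this job element applies to
       the element itself and to its unprefixed child elements only -->
  <job xmlns="http://example.com/ADAgency"
       customer="winston" reference="Cigar Trimmer">
    <location>London</location>
    <description>Must like to talk and smoke</description>
    <skill>Cigar maker</skill>
    <skill>Critic</skill>
  </job>
  <!-- A different default namespace scopes the second form of job -->
  <job xmlns="http://example.com/BEAgency">
    <customer>george</customer>
    <reference>Tree pruner</reference>
    <location>Washington</location>
    <description>Must be honest</description>
    <skill>Tree surgeon</skill>
  </job>
</jobSummary>
```

In this sketch the jobSummary root element is in no namespace at all, because neither declaration is in scope at that level.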
Creating Valid XML
As you have seen, XML validators recognize well-formed XML, and this is very useful for picking up syntax errors in your document. Unfortunately, a well-formed, syntactically correct XML document may still have semantic errors in it. For example, a job in Listing 16.4 with no location or skills does not make sense. Without these elements the XML document is still well-formed, but it is not valid.
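The distinction can be illustrated with a short document. The sketch below assumes the element-based form of job description seen earlier in this section, and it omits both the location and the skill elements.

```xml
<?xml version="1.0"?>
<jobSummary>
  <!-- Well-formed: every element is closed and correctly nested.
       Semantically incomplete: there is no location and no skill. -->
  <job>
    <customer>george</customer>
    <reference>Tree pruner</reference>
    <description>Must be honest</description>
  </job>
</jobSummary>
```

A parser that only checks well-formedness accepts this document without complaint. Only a set of rules stating that a job requires a location and at least one skill would cause it to be rejected.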
What is required is a set of rules or constraints that define a valid structure for an XML document. There are two common methods for specifying XML rules—the Document Type Definition (DTD) and XML Schemas.
Document Type Definitions
A DTD provides a template that defines the occurrence and arrangement of elements and attributes in an XML document. Using a DTD, you can define:
Element ordering and hierarchy
Which attributes are associated with an element
Default values and enumeration values for attributes
Any entity references used in the document (internal constants, external files, and parameters)
Note - Entity references are covered in Appendix A, "An Overview of XML."
DTDs originated with SGML and have some disadvantages when compared with XML Schemas, which were developed explicitly for XML. One of these disadvantages is that a DTD is not written in XML, which means you have to learn another syntax to define a DTD. Another disadvantage is that DTDs are not as comprehensive as XML Schemas and therefore cannot constrain an XML document as tightly as an XML Schema.
DTD rules can be included in the XML document as document type declarations, or they can be stored in an external document. The syntax is the same in both cases.
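As an illustration of rules held inside the document itself, the following is a minimal sketch of a job summary with an internal DTD. The content model (one location, one description, one or more skills), the status attribute, and the agency entity are all assumptions made for this example and are not taken from the book's listings.

```xml
<?xml version="1.0"?>
<!DOCTYPE jobSummary [
  <!-- Internal entity: a named constant, referenced as &agency; -->
  <!ENTITY agency "Example Job Agency">

  <!-- Hierarchy: jobSummary holds zero or more job elements -->
  <!ELEMENT jobSummary (job*)>

  <!-- Ordering: location, then description, then one or more skills -->
  <!ELEMENT job (location, description, skill+)>
  <!ELEMENT location (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
  <!ELEMENT skill (#PCDATA)>

  <!-- Attributes: customer and reference are required; status is a
       hypothetical enumerated attribute that defaults to "open" -->
  <!ATTLIST job
    customer  CDATA #REQUIRED
    reference CDATA #REQUIRED
    status    (open | filled) "open">
]>
<jobSummary>
  <job customer="winston" reference="Cigar Trimmer">
    <location>London</location>
    <description>Must like to talk and smoke (via &agency;)</description>
    <skill>Cigar maker</skill>
    <skill>Critic</skill>
  </job>
</jobSummary>
```

The same declarations could equally be moved, unchanged, into an external file, which is the arrangement described next.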
If a DTD is being used, the XML document must include a document type declaration, which consists of the keyword DOCTYPE followed by the name of the root element for the XML document. If an external DTD is being used, the declaration also includes the word SYSTEM followed by a system identifier (the URI that identifies the location of the DTD file). For example:
<!DOCTYPE jobSummary SYSTEM "jobSummary.dtd">
specifies that the root element for this XML document is jobSummary and the remainder of the DTD rules are in the file called jobSummary.dtd in the same directory.
An external identifier can also include a public identifier. The public identifier precedes the system identifier and is denoted by the word PUBLIC. An XML processor can use the public identifier to try to generate an alternative URI. If the DTD cannot be obtained by this method, the system identifier is used instead. For example:
<!DOCTYPE web-app
PUBLIC '-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN'
'http://java.sun.com/dtd/web-app_2_3.dtd'>
Note - DOCTYPE, SYSTEM, and PUBLIC must appear in capitals to be recognized.
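To show where such a declaration sits in a complete document, here is a minimal sketch of a deployment descriptor that uses the public and system identifiers shown above. The display-name value is a placeholder chosen for this example.

```xml
<?xml version="1.0"?>
<!-- The document type declaration comes after the XML declaration
     and before the root element, whose name it must match -->
<!DOCTYPE web-app
  PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
  "http://java.sun.com/dtd/web-app_2_3.dtd">
<web-app>
  <!-- Placeholder application name -->
  <display-name>Agency</display-name>
</web-app>
```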
This chapter is from Teach Yourself J2EE in 21 Days, second edition, by Martin Bond et al. (Sams, 2004, ISBN: 0-672-32558-6).