Ever tried to read a DTD, and failed miserably? Ever wondered what all those symbols and weird language constructs meant? Well, fear not - this crash course will get you up to speed with the basics of DTD design in a hurry.
Now that you know the basics of linking and validating XML data against DTDs, let's focus on the different components that actually go into a DTD.
All XML documents consist of some combination of elements, attributes, entities and character data. In case you've forgotten what these are, here are some quick definitions:
An element, which is the basic unit of XML, consists of textual content (character data), enhanced with descriptive tags.
<dinosaur>Stegosaurus</dinosaur>
An attribute is a name-value pair which provides additional descriptive parameters or default values to an element.
<person sex="male">Spiderman</person>
An entity is an XML construct, referenced by name, which stores text, images and file references; it is primarily used as a mechanism to store and reuse content which appears in multiple places within an XML document.
<!ENTITY copyright "This material copyright Melonfire, 2001. All rights reserved.">
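To see how such an entity gets reused, here's a minimal sketch of a complete document that declares the entity in an internal DTD subset and then references it by name. The "page" root element is a hypothetical name chosen purely for illustration; it doesn't come from the earlier examples.

```xml
<?xml version="1.0"?>
<!DOCTYPE page [
  <!-- "page" is a hypothetical root element used only for this sketch -->
  <!ELEMENT page (#PCDATA)>
  <!ENTITY copyright "This material copyright Melonfire, 2001. All rights reserved.">
]>
<!-- The parser replaces the copyright reference with the text declared above -->
<page>&copyright;</page>
```

An entity reference is simply the entity name wrapped in an ampersand and a semicolon, so the same notice can be dropped into as many places as needed and updated in just one.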
Each of these basic constructs can be defined in a DTD. I'll begin with element declarations, which typically look like this:
<!ELEMENT elementName (contentType)>
For instance, consider the "forecast" element from the previous example:
<!ELEMENT forecast (#PCDATA)>
In English, this declares an element with name "forecast" and content of the form "parsed character data" (in case you're wondering, this means that the element holds text rather than child elements, and that the parser will parse this text, automatically expanding any entity references it finds).
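You might expect a counterpart for unparsed character data, but there isn't one for elements: a declaration like <!ELEMENT greeting (#CDATA)> is not valid DTD syntax. In a DTD, the CDATA keyword is only used as an attribute type. If you want the parser to treat a chunk of element content as literal text without any further processing, wrap that text in a CDATA section within the XML document itself. Here's an example:
<greeting><![CDATA[Hello & <welcome>]]></greeting>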
In case you don't want to specify a content type, you can escape without making a decision by allowing any content.
<!ELEMENT address ANY>
Of course, doing this kinda negates the purpose of having a DTD in the first place...
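For the curious, here's a rough sketch of what an ANY declaration lets through. The "street" and "city" child elements are hypothetical names invented for this illustration. Note that a validating parser still expects any child element used inside "address" to be declared somewhere in the DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE address [
  <!ELEMENT address ANY>
  <!-- "street" and "city" are hypothetical child elements -->
  <!ELEMENT street (#PCDATA)>
  <!ELEMENT city (#PCDATA)>
]>
<address>
  Flat 4, <street>12 Elm Road</street>
  <city>Springfield</city>
</address>
```

Text and declared child elements can be mixed in any order and any quantity here, which is exactly why ANY offers so little in the way of validation.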
If an element contains nested child elements, it's necessary to specify these element names within the declaration. In the following example,
<?xml version="1.0"?>
<book>
<author>Stephen King</author>
<title>Bag of Bones</title>
<price>$9.99</price>
</book>
the "book" element contains three child elements nested within it - which is why its element declaration in the DTD looks like this:
<!ELEMENT book (author, title, price)>
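Putting the pieces together, a self-contained version of this document might look something like the sketch below. It assumes that each of the three child elements holds nothing but text, and so declares them as parsed character data; those three declarations are my assumption and are not part of the original example.

```xml
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (author, title, price)>
  <!-- assumption: each child element holds plain text only -->
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
]>
<book>
<author>Stephen King</author>
<title>Bag of Bones</title>
<price>$9.99</price>
</book>
```

The commas in the "book" declaration indicate a sequence, so a validating parser expects "author", "title" and "price" to each appear exactly once, in that order.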
XML also allows for so-called empty elements - essentially, elements which have no content and therefore do not require a closing tag. Such elements are closed by adding a slash (/) to the end of their opening tag. Consider the following XML snippet
<?xml version="1.0"?>
<rule>Every sentence ends with a <period /></rule>
and then take a look at the corresponding empty element declaration: