Now that you know the basics, it's time to see how XML's more advanced constructs can be used to author complex XML documents. Entities, namespaces, CDATA blocks, processing instructions - they're all in here, together with aliens, idiots, secret agents and buried treasure.
XML entities are a bit like variables in programming languages - they're XML constructs which are referenced by a name and store text, images and file references. Once an entity has been defined, XML authors may call it by its name at different places within an XML document, and the XML parser will replace the entity name with its actual value.
XML entities come in particularly handy if you have a piece of text which recurs at different places within a document - examples would be a name, an email address or a standard header or footer. By defining an entity to hold this recurring data, XML allows document authors to make global alterations to a document by changing a single value.
Entities come in two parts. First comes the entity definition, which always appears within the document type declaration at the head of the document (after the XML declaration). In the following example, the entity "copyright" has been defined and mapped to the string "This material copyright Melonfire, 2001. All rights reserved."
<!ENTITY copyright "This material copyright Melonfire, 2001. All rights
reserved.">
Once an entity has been defined, the next step is to use it. This is accomplished via entity references, placeholders for entity data within the document markup. Typically, an entity reference contains the entity name prefixed with either an ampersand (&), for general entities like the one above, or a percentage (%) symbol, for parameter entities used inside the DTD itself, and suffixed with a semicolon (;), as below:
&copyright;
When a parser reads an XML document, it replaces the entity references with the actual values defined in the document type declaration. So a document whose <body> element opens and closes with the &copyright; reference would look like this once a parser was through with it:
<?xml version="1.0" ?>
<article>
<title>XML Basics (part 2)</title>
<abstract>A discussion of basic XML theory</abstract>
<body>This material copyright Melonfire, 2001. All rights reserved. Article body goes here This material copyright Melonfire, 2001. All rights reserved.</body>
</article>
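If you'd like to watch the substitution happen, the following is a minimal sketch using Python's standard xml.etree.ElementTree module. The source document in it is an assumption, reconstructed from the expanded output above rather than taken from the original listing, and the names SOURCE and body_text are purely illustrative.

```python
import xml.etree.ElementTree as ET

# Assumed source document: the "copyright" entity is declared in an
# internal DTD subset and referenced twice inside <body>.
SOURCE = (
    '<?xml version="1.0" ?>\n'
    '<!DOCTYPE article [\n'
    '<!ENTITY copyright "This material copyright Melonfire, 2001. All rights reserved.">\n'
    ']>\n'
    '<article>\n'
    '<title>XML Basics (part 2)</title>\n'
    '<abstract>A discussion of basic XML theory</abstract>\n'
    '<body>&copyright; Article body goes here &copyright;</body>\n'
    '</article>\n'
)

# The parser expands both references while building the tree.
root = ET.fromstring(SOURCE)
body_text = root.find("body").text
print(body_text)
```

The printed text contains the full copyright string twice and no trace of the &copyright; reference itself.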
Note that entities must be declared before they are referenced, and must appear
within the document type declaration. If a parser finds an entity reference without a corresponding entity declaration, it will barf and produce some nasty error messages.
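A quick way to see those error messages for yourself is sketched below, again assuming Python's xml.etree.ElementTree as the parser. The entity name "footer" is hypothetical and deliberately never declared; the exact wording of the error depends on the parser.

```python
import xml.etree.ElementTree as ET

# "&footer;" is a hypothetical entity that has no declaration anywhere.
BROKEN = "<article><body>&footer;</body></article>"

try:
    ET.fromstring(BROKEN)
    error = None
except ET.ParseError as exc:
    # The parser refuses to build a tree for this document.
    error = exc
    print("parser rejected the document:", exc)
```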
XML comes with the following five pre-defined entities:
&lt; represents the less-than (<) symbol
&gt; represents the greater-than (>) symbol
&apos; represents the single-quote (') symbol
&quot; represents the double-quote (") symbol
&amp; represents the ampersand (&) symbol
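Since these five need no declaration, they can be tried without any document type declaration at all. The sketch below assumes Python's standard library; the element name "sample" and the test strings are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Parsing: the five pre-defined entities are decoded without any DTD.
element = ET.fromstring("<sample>&lt; &gt; &apos; &quot; &amp;</sample>")
decoded = element.text
print(decoded)

# Writing: escape() replaces &, < and > (its defaults) so that raw text
# can be embedded safely in markup.
encoded = escape("1 < 2 & 3 > 2")
print(encoded)
```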
Entities can contain XML markup in addition to ordinary text - the following is a perfectly valid entity declaration:
<!ENTITY copyright "This material copyright <link>Melonfire</link>,
<publication_year>2001</publication_year>.
All rights reserved.">
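To check that the markup inside such an entity really ends up as elements, here is a small sketch, assuming Python's xml.etree.ElementTree. The surrounding <article> and <body> wrapper is an assumption added for the demonstration.

```python
import xml.etree.ElementTree as ET

# The entity value contains <link> and <publication_year> elements.
SOURCE = (
    '<!DOCTYPE article [\n'
    '<!ENTITY copyright "This material copyright <link>Melonfire</link>, '
    '<publication_year>2001</publication_year>. All rights reserved.">\n'
    ']>\n'
    '<article><body>&copyright;</body></article>'
)

root = ET.fromstring(SOURCE)
body = root.find("body")

# After expansion, the markup from the entity appears as child elements.
child_tags = [child.tag for child in body]
print(child_tags)
print(body.find("link").text)
```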
Entities may be "nested"; one entity can reference another. Consider the following
entity declaration, which illustrates this rather novel concept.
<!ENTITY company "Melonfire">
<!ENTITY year "2001">
<!ENTITY copyright
"This material copyright &company;, &year;. All rights
reserved.">
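The nested declaration above can be tried out with the sketch below, assuming Python's xml.etree.ElementTree. The <article> and <body> wrapper is an assumption, and the entity value is kept on one line to avoid an embedded line break.

```python
import xml.etree.ElementTree as ET

# "copyright" refers to "company" and "year"; the document refers
# only to "copyright".
SOURCE = (
    '<!DOCTYPE article [\n'
    '<!ENTITY company "Melonfire">\n'
    '<!ENTITY year "2001">\n'
    '<!ENTITY copyright "This material copyright &company;, &year;. All rights reserved.">\n'
    ']>\n'
    '<article><body>&copyright;</body></article>'
)

root = ET.fromstring(SOURCE)
expanded = root.find("body").text
print(expanded)
```

The inner references are resolved when &copyright; is expanded, so the resulting text contains neither &company; nor &year;.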
Note, however, that an entity cannot reference itself, either directly or indirectly, as this would result in an infinite loop (a conforming parser treats this as an error and stops). And now that I've said it, I just know that you're going to try it out to see how much damage it causes.
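For the curious, here is a low-risk way to try it, sketched with Python's xml.etree.ElementTree. The self-referencing declaration is a made-up example, and the exact error text varies from parser to parser.

```python
import xml.etree.ElementTree as ET

# "copyright" refers to itself, so expansion could never finish.
SOURCE = (
    '<!DOCTYPE article [\n'
    '<!ENTITY copyright "Copyright &copyright;">\n'
    ']>\n'
    '<article><body>&copyright;</body></article>'
)

try:
    ET.fromstring(SOURCE)
    error = None
except ET.ParseError as exc:
    # The parser detects the loop and stops instead of expanding forever.
    error = exc
    print("parser rejected the document:", exc)
```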
This article copyright Melonfire 2001. All rights reserved.