learn xml in 11.5 minutes by l.c. rees

To learn XML, open a text editor like SimpleText or Notepad and type:


<?xml version="1.0"?>
<PARENT>
<CHILD>
This is content.
</CHILD>
<EMPTY/>
</PARENT>

Save this as wellform.xml. Let's examine the XML you just created in detail.

<?xml version="1.0"?> is the XML declaration. It states that your document is XML version 1.0, so programs know to process it by the XML 1.0 rules.

<PARENT> is markup. XML is divided into markup and content. Markup is information about content, describing it to any level of detail desired. Correct markup must follow certain rules. First, markup containing content must have opening and closing tags. <PARENT> is an opening tag. </PARENT> is a closing tag. Opening tags start with < and close with >. Closing tags start with </ and close with >.

Second, markup must nest properly. Markup tags divide into parents and children. Parent markup encloses child markup. A child's opening and closing tags must be contained within its parent's opening and closing tags. You can have...

 
 
<PARENT> 
<CHILD> 
This is content. 
</CHILD> 
</PARENT> 
 
...but not...
 
 
<PARENT> 
<CHILD> 
This is content. 
</PARENT> 
</CHILD> 
 
...or...
 
 
<CHILD> 
<PARENT> 
This is content.
</CHILD> 
</PARENT> 
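
If you want to check nesting without reading every tag by eye, you can hand each variant to an XML parser. The following is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module are available; the helper name is_well_formed is made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised when tags are mismatched, unclosed, or badly nested.
        return False

# The three nesting variants shown above, written on one line each.
good = "<PARENT><CHILD>This is content.</CHILD></PARENT>"
crossed = "<PARENT><CHILD>This is content.</PARENT></CHILD>"
swapped = "<CHILD><PARENT>This is content.</CHILD></PARENT>"

print(is_well_formed(good))     # True
print(is_well_formed(crossed))  # False
print(is_well_formed(swapped))  # False
```

Only the first variant closes every child before its parent, so only the first one parses.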

Third, if markup contains no content, it can be written as a single tag that begins with < and ends with />, like <EMPTY/>.
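
A parser reads <EMPTY/> as shorthand for an opening tag followed at once by its closing tag. The sketch below, assuming Python's standard xml.etree.ElementTree module, parses both spellings and compares what comes out; the helper name describe is hypothetical.

```python
import xml.etree.ElementTree as ET

# The same empty markup written two ways.
short_form = ET.fromstring("<PARENT><EMPTY/></PARENT>")
long_form = ET.fromstring("<PARENT><EMPTY></EMPTY></PARENT>")

def describe(root):
    """Summarize the EMPTY child: its name, its text, and its child count."""
    empty = root.find("EMPTY")
    return (empty.tag, empty.text, len(empty))

print(describe(short_form))                         # ('EMPTY', None, 0)
print(describe(short_form) == describe(long_form))  # True
```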

So if you...

  1. Declare your XML version.
  2. Begin and end opening tags with < and > and closing tags with </ and >.
  3. Ensure child markup nests completely within parent markup.
  4. Start empty markup with < and end it with />.

...your XML is well-formed. wellform.xml is about as simple as XML gets. If you need more power, you need valid XML. Using your text editor, type in:

<?xml version="1.0"?>
<!DOCTYPE PARENT [
<!ELEMENT PARENT (CHILD*)>
<!ELEMENT CHILD (MARK?,NAME+)>
<!ELEMENT MARK EMPTY>
<!ELEMENT NAME (LASTNAME+,FIRSTNAME+)*>
<!ELEMENT LASTNAME (#PCDATA)>
<!ELEMENT FIRSTNAME (#PCDATA)>
<!ATTLIST MARK
            NUMBER ID #REQUIRED
            LISTED CDATA #FIXED "yes"
            TYPE (natural|adopted) "natural">
<!ENTITY STATEMENT "This is well-formed XML">
]>
<PARENT>
&STATEMENT; 
<CHILD> 
<MARK NUMBER="1" LISTED="yes" TYPE="natural"/>  
<NAME> 
<LASTNAME>child</LASTNAME> 
<FIRSTNAME>second</FIRSTNAME> 
</NAME>
</CHILD>
</PARENT>

Save it as valid.xml. Valid XML is more complex than the well-formed XML used in wellform.xml, so let's examine its parts in detail.

<?xml version="1.0"?> fills the same role it did in wellform.xml. The second line, <!DOCTYPE PARENT [, declares that this section is a document type definition, or DTD, and gives its name (PARENT), which matches the document's outermost element.

A DTD is the primary distinction between well-formed and valid XML. Well-formed XML can give you a vague idea of an XML document's purpose but leaves room for doubt. A DTD eliminates this by providing a stringent standard to measure a document against.

A DTD declares each part of an XML document and its proper form exactly. The declarations of this DTD are enclosed between an opening [ and a closing ]>. <!ELEMENT PARENT (CHILD*)> shows a DTD's most basic part, the element.

An element defines markup's name and form. <!ELEMENT PARENT (CHILD*)> declares:

  1. The markup tag's name (PARENT).
  2. The name of any child markup found within it (CHILD).
  3. How often the child markup is needed and can appear. (CHILD is optional and can appear more than once, as indicated by the *).

The next line, <!ELEMENT CHILD (MARK?,NAME+)>, declares the children of CHILD. It lists:

  1. The element's name (CHILD).
  2. The name of its children (MARK and NAME).
  3. That one of its children appears once or not at all (MARK, as indicated by the ?).
  4. That the other child must appear one or more times (NAME, as indicated by the +).

You can have...
 
 
<CHILD> 
<NAME> 
</NAME> 
</CHILD> 
 
...or...
 
 
<CHILD> 
<MARK/> 
<NAME> 
</NAME> 
</CHILD> 
 
...but never...
 
 
<CHILD> 
<MARK/> 
</CHILD> 
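
A content model such as (MARK?,NAME+) behaves much like a regular expression over the names of the children. The sketch below is only an illustration of that idea, not a real validator: it assumes Python's standard re and xml.etree.ElementTree modules, and the names CHILD_MODEL and children_match are invented here.

```python
import re
import xml.etree.ElementTree as ET

# (MARK?,NAME+) rewritten as a regular expression over child names,
# where each name is followed by a single space.
CHILD_MODEL = re.compile(r"(MARK )?(NAME )+")

def children_match(xml_text):
    """Return True if the root's children fit the (MARK?,NAME+) pattern."""
    root = ET.fromstring(xml_text)
    names = "".join(child.tag + " " for child in root)
    return CHILD_MODEL.fullmatch(names) is not None

print(children_match("<CHILD><NAME></NAME></CHILD>"))         # True
print(children_match("<CHILD><MARK/><NAME></NAME></CHILD>"))  # True
print(children_match("<CHILD><MARK/></CHILD>"))               # False
```

The last case fails because NAME+ demands at least one NAME, which is exactly why the third sample above is never allowed.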
 

<!ELEMENT MARK EMPTY> shows the element value EMPTY. This indicates markup containing no content, like...


<MARK/>

...or...

<EMPTY/>

...from wellform.xml.

The following line, <!ELEMENT NAME (LASTNAME+,FIRSTNAME+)*>, tells us:

  1. The element name (NAME).
  2. That the children of the element must appear in the order listed, LASTNAME before FIRSTNAME (as indicated by the comma).
  3. That this whole sequence may appear more than once or not at all (as indicated by the * after the parentheses).
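
To see which orders of children (LASTNAME+,FIRSTNAME+)* allows, you can again treat the content model as a regular expression over child names. This is a rough sketch assuming Python's standard re module; NAME_MODEL and fits_name_model are hypothetical names, and a real validating parser does considerably more.

```python
import re

# (LASTNAME+,FIRSTNAME+)* as a regular expression over child names,
# where each name is followed by a single space.
NAME_MODEL = re.compile(r"((LASTNAME )+(FIRSTNAME )+)*")

def fits_name_model(child_names):
    """Return True if the list of child names fits (LASTNAME+,FIRSTNAME+)*."""
    joined = "".join(name + " " for name in child_names)
    return NAME_MODEL.fullmatch(joined) is not None

print(fits_name_model([]))                                   # True
print(fits_name_model(["LASTNAME", "FIRSTNAME"]))            # True
print(fits_name_model(["LASTNAME", "LASTNAME", "FIRSTNAME"]))  # True
print(fits_name_model(["FIRSTNAME", "LASTNAME"]))            # False
print(fits_name_model(["LASTNAME"]))                         # False
```

An empty NAME passes because of the trailing *, while a lone LASTNAME fails because each repetition of the sequence needs at least one FIRSTNAME after it.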

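The next two lines contain #PCDATA. #PCDATA (parsed character data) indicates that markup contains text content. This text can be nearly anything and does not have to follow the same rules as markup, although a literal < or & must still be written as &lt; or &amp;.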

The following line contains another DTD fundamental, the attribute. An attribute is a description given to an element to further define it. Attributes are declared in attribute lists. The attribute list in valid.xml...

 
 
<!ATTLIST MARK 
            NUMBER ID #REQUIRED 
            LISTED CDATA #FIXED "yes" 
            TYPE (natural|adopted) "natural"> 
 
...tells you:
  1. The element the attribute list is attached to (MARK).
  2. That the first attribute (NUMBER) is unique text (as indicated by ID) and required (as indicated by #REQUIRED).
  3. That the second attribute (LISTED) is regular text (as indicated by CDATA) and fixed (as indicated by #FIXED).
  4. That the third attribute (TYPE) is a choice (as indicated by the |) between two values ("natural" and "adopted") and what the default choice is ("natural").
The result is...
 
 
<MARK NUMBER="1" LISTED="yes" TYPE="natural"/> 
 
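...where, of the attributes of MARK, NUMBER is required and its value must be unique, LISTED is regular text fixed at "yes", and TYPE is set to "natural", which is also the default. (Strictly, an ID value must be an XML name, which cannot begin with a digit, so a validating parser would reject "1" but accept a value such as "n1".)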
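
Default and fixed values mean you do not have to type every attribute yourself. The sketch below assumes Python's standard xml.etree.ElementTree module, which does not validate but does read attribute lists from a DTD written inside the document. The cut-down document and the ID value "m1" are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# A cut-down document: only MARK, with the same attribute list as valid.xml.
# Only NUMBER is typed in; LISTED and TYPE are left for the DTD to supply.
doc = """<!DOCTYPE MARK [
<!ELEMENT MARK EMPTY>
<!ATTLIST MARK
            NUMBER ID #REQUIRED
            LISTED CDATA #FIXED "yes"
            TYPE (natural|adopted) "natural">
]>
<MARK NUMBER="m1"/>"""

mark = ET.fromstring(doc)

# The parser fills in the fixed and default values from the attribute list.
print(mark.get("NUMBER"))  # m1
print(mark.get("LISTED"))  # yes
print(mark.get("TYPE"))    # natural
```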

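valid.xml next brings up a third DTD part, the entity. An entity points to something that can be inserted at any point in the XML document. The line <!ENTITY STATEMENT "This is well-formed XML"> declares an entity, so the text This is well-formed XML is inserted wherever &STATEMENT; appears. (Strictly, a validating parser would object to that text sitting directly inside PARENT, because (CHILD*) allows only CHILD elements there; declaring PARENT as (#PCDATA|CHILD)* would permit it.)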
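
You can watch an entity being replaced by parsing a small document and printing the result. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the document is a shortened stand-in for valid.xml that keeps only the entity.

```python
import xml.etree.ElementTree as ET

# A shortened document that declares and uses the STATEMENT entity.
doc = """<!DOCTYPE PARENT [
<!ENTITY STATEMENT "This is well-formed XML">
]>
<PARENT>&STATEMENT;</PARENT>"""

root = ET.fromstring(doc)

# The parser has already swapped &STATEMENT; for its replacement text.
print(root.text)  # This is well-formed XML
```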

The DTD then closes with ]> and the rest of the XML document follows it as outlined.

You have learned XML in 11.5 minutes.
