A Simple Overview of W3C XML Schema
Version 1.0
Copyright © 2007-2010 Codalogic Ltd.

Introduction

This tutorial is intended to give a brief introduction to W3C XML Schema. It is not intended to cover all aspects of XML Schema, although it is hoped that after reading this you will be able to do useful work using Schema.

The focus of this tutorial is on the data-oriented use of XML Schema, with the intention that it gives a grounding in XML Schema that will be applicable to using tools such as LMX. Further tutorials in this series may be added from time to time on the LMX Support Page.

For a fuller description of XML Schema, we recommend the XSD Primer and the book XML Schema by Eric van der Vlist, available at Amazon.com and Amazon.co.uk.

Contents

1 - Prerequisites
2 - Approach
3 - What does XML Schema Look Like?
4 - XML Schema Tools
5 - Schema Preamble
6 - Defining Elements (Part 1)
7 - Specifying How Many Times an Element Can Appear
8 - Specifying How Many Times an Attribute Can Appear
9 - Specifying Your Own Types
   9.1 - Adding Attributes to a Type with a Simple Type Body
   9.2 - Defining Types that Contain Multiple Elements
   9.3 - Defining Types that Contain One of a Selection of Elements
   9.4 - Specifying an Empty Element
10 - Defining More Restricted Simple Types
   10.1 - Same Simple Type, Different Name
11 - Defining Elements (Part 2) - Defining Elements using Local Types
12 - Documenting Your Schema
13 - Putting It All Together
14 - Simple Steps to Writing a Schema
15 - Experimenting with Schemas
16 - Things We've Not Covered
17 - Please Rate This Article

1 - Prerequisites

Before reading this guide you should have a basic understanding of how XML represents data. It is also advantageous to understand XML namespaces. For further information on these topics try looking for other tutorials in this series on the LMX Support Page.

2 - Approach

It can be very difficult to learn XML Schema completely. Hence, this guide only sets out to give you a basic working knowledge of XML Schema.

To make it easier to use XML Schema for your own purposes this guide is designed so that you can cut-and-paste snippets of it into your own document. As such, most examples are of the form, "If you want 'X' in your XML instance, do 'Y' in your Schema."

Some of the fields in the examples need to be customised to your application. Such fields are generally called "MyThing". For example, MyElement, MyType, MyDomain, MySchema etc. If there is more than one type of 'thing' in an example, then a number will be appended to them to differentiate them. For example, MyElement1, MyElement2, MyElement3 etc. You should change these to more appropriate names if you paste the snippets into your Schema.

Some snippets require additional content to make them complete. Where this is the case, we include an ellipsis (...), perhaps with a comment, to signify this.

We won't necessarily explain everything in detail. Where we have not explained something sufficiently, we suggest simply copying the un-described fragment to your application along with the other parts that are more fully described.

3 - What does XML Schema Look Like?

Before we move into the details, we first show you an example of an XML Schema. The reason for doing this is not so that you can immediately understand it, but so that you can get an idea of what this guide is aiming for you to be able to understand.

The schema definition we are going to show you is a hypothetical example of storing an application's configuration and is as follows:

    <?xml version="1.0" encoding="utf-8" ?> 
    <xs:schema targetNamespace="http://xml2cpp.com/config.xsd"
                    xmlns="http://xml2cpp.com/config.xsd"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    elementFormDefault="qualified" >

        <xs:element name="config" type="ConfigType"/>
        
        <xs:complexType name="ConfigType">
            <xs:sequence>
                <xs:element name="userName" type="xs:string" minOccurs="0"/>
                <xs:element name="maxRecentProjects" type="Int1To10Type"/>
                <xs:element name="recentProject" type="xs:string" 
                                            minOccurs="0" maxOccurs="10"/>
                <xs:element name="childWindow" type="WindowType" 
                                            minOccurs="0" maxOccurs="unbounded"/>
                <xs:element name="dictionary" type="DictionaryFileType" 
                                            minOccurs="0"/>
            </xs:sequence>
        </xs:complexType>
        
        <xs:simpleType name="Int1To10Type">
            <xs:restriction base="xs:int">
                <xs:minInclusive value="1"/>
                <xs:maxInclusive value="10"/>
            </xs:restriction>
        </xs:simpleType>

        <xs:complexType name="WindowType">
            <xs:attribute name="name" type="xs:string" use="required"/>
            <xs:attribute name="width" type="xs:unsignedInt" use="required"/>
            <xs:attribute name="height" type="xs:unsignedInt" use="required"/>
        </xs:complexType>
        
        <xs:complexType name="DictionaryFileType">
            <xs:simpleContent>
                <xs:annotation><xs:documentation>
                The base type is the name of the dictionary file.
                </xs:documentation></xs:annotation>
                <xs:extension base="xs:string">
                    <xs:attribute name="language" type="xs:string"/>
                </xs:extension>
            </xs:simpleContent>
        </xs:complexType>
        
    </xs:schema>

From the above definition you might be able to pick out that something named config is defined to be an element that has the type ConfigType. From that you might deduce that an instance of that element would look something like:
    <config ...>
    ...
    </config>
You may also infer that the entity called ConfigType defines a type that contains multiple child elements.

You may also be able to surmise other features about a valid XML instance of the Schema. But don't worry if you can't at this stage as it will hopefully become clearer soon.

For the record, an example XML instance conforming to the above Schema is as follows:

    <config xmlns="http://xml2cpp.com/config.xsd">
        <userName>John Smith</userName>
        <maxRecentProjects>10</maxRecentProjects>
        
        <recentProject>z:\projects\project1.prj</recentProject>
        <recentProject>z:\projects\project5.prj</recentProject>
        <recentProject>z:\projects\project3.prj</recentProject>
        <recentProject>z:\projects\project2.prj</recentProject>
        
        <childWindow name="Input" width="200" height="100"/>
        <childWindow name="Output" width="200" height="150"/>
        <childWindow name="Transformation" width="300" height="100"/>
        
        <dictionary language="en">z:\projects\dictionary.dic</dictionary>
    </config>
Now, let's dive in!

4 - XML Schema Tools

In addition to a basic text editor, a large number of tools can be used to help define W3C XML Schemas.

5 - Schema Preamble

An XML Schema contains some basic preamble and boilerplate material specifying global details about the schema.

If you are assigning your Schema to a namespace (recommended) then the basic outline of a Schema is:

    <xs:schema targetNamespace="http://MyDomain.com/MySchema.xsd"
                    xmlns="http://MyDomain.com/MySchema.xsd"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    elementFormDefault="qualified" >
    ...element definitions...
    ...type definitions...
    </xs:schema>
The first thing to note is that the schema definition is defined in an xs:schema element.

The targetNamespace attribute specifies the namespace to which the schema is being assigned. The xmlns attribute further assigns the target namespace to be the default namespace.

The xmlns:xs attribute binds the xs: namespace prefix to the XML Schema namespace. Thus, the schema language directives become xs:schema, xs:element, xs:attribute, xs:complexType, and so on.

If you are familiar with how XML namespaces work, you can assign the various namespaces to different namespace prefixes if you desire. (For example, the XML Schema namespace is also commonly assigned to the xsd: namespace prefix. We save ourselves the extra typing!)

The elementFormDefault="qualified" setting declares that all element names in an XML instance, including those of child elements declared inside types, must be qualified with (i.e. associated with) a namespace, either via a namespace prefix (e.g. MyNSPrefix:MyElement) or the defined default namespace (e.g. MyElement).

The resulting XML instance would be of the form:

    <MyElement xmlns="http://MyDomain.com/MySchema.xsd" ...>
    ...more to go here...
    </MyElement>
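
As mentioned above, the same instance can equally be written with a namespace prefix rather than the default namespace. A minimal sketch, where the prefix my is an arbitrary choice made for illustration:

```xml
<my:MyElement xmlns:my="http://MyDomain.com/MySchema.xsd">
    <!-- ...child elements, if any, also carry the my: prefix... -->
</my:MyElement>
```

Both forms associate MyElement with the same namespace, so a validator should treat them identically.
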
If you are not assigning your schema to a namespace, then the schema preamble is:
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" >
    ...element definitions...
    ...type definitions...
    </xs:schema>
This will result in an XML instance similar to:
    <MyElement ...>
    ...more to go here...
    </MyElement>

6 - Defining Elements (Part 1)

An element can be defined using an xs:element element of the form:
        <xs:element name="MyElement" type="...Type..."/>
The type part can be one of the (simple) types defined in the XML Schema specifications (such as xs:int, xs:string etc.) or a type defined elsewhere in your schema.

The types defined by XML Schema are:

Schema Type                 Description
xs:boolean                  A Boolean value.
xs:string                   A string; typically Unicode.
xs:byte                     A signed 8-bit number.
xs:short                    A signed 16-bit number.
xs:int                      A signed 32-bit number.
xs:long                     A signed 64-bit number.
xs:unsignedByte             An unsigned 8-bit number.
xs:unsignedShort            An unsigned 16-bit number.
xs:unsignedInt              An unsigned 32-bit number.
xs:unsignedLong             An unsigned 64-bit number.
xs:integer,                 Unbounded or partially bounded integers. It is
xs:nonPositiveInteger,      recommended that these types are avoided in
xs:negativeInteger,         schemas that are intended to be used with
xs:nonNegativeInteger,      binding tools.
xs:positiveInteger
xs:decimal                  A decimal number that includes a fractional part
                            but is not specified using an exponent; for
                            example, 123.45.
xs:float, xs:double         Single and double precision floating point
                            numbers.
xs:hexBinary,               Binary data.
xs:base64Binary
xs:duration                 For specifying elapsed time, particularly in
                            terms of days, months, years etc.
xs:dateTime, xs:time,       Date and time related types.
xs:date, xs:gYearMonth,
xs:gYear, xs:gMonthDay,
xs:gDay, xs:gMonth
xs:anyURI, xs:QName,        Other schema defined types.
xs:NOTATION,
xs:normalizedString,
xs:token, xs:language,
xs:ID, xs:IDREF,
xs:IDREFS, xs:ENTITY,
xs:ENTITIES, xs:NMTOKEN,
xs:NMTOKENS, xs:Name,
xs:NCName

Table 1: Simple Types defined by XML Schema

To use a schema defined simple type, you would do:

        <xs:element name="MyElement" type="xs:int"/>
This would appear in an XML instance as something like:
        <MyElement>123</MyElement>
A complete schema defining such an element might appear as:
    <xs:schema targetNamespace="http://MyDomain.com/MySchema.xsd"
                    xmlns="http://MyDomain.com/MySchema.xsd"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    elementFormDefault="qualified" >

        <xs:element name="MyElement" type="xs:int"/>

    </xs:schema>
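
The other types listed in Table 1 are used in the same way. The following sketch shows a few of them; the element names are invented for illustration, and the comments show what a matching instance might look like:

```xml
<xs:element name="MyFlag" type="xs:boolean"/>        <!-- <MyFlag>true</MyFlag> -->
<xs:element name="MyPrice" type="xs:decimal"/>       <!-- <MyPrice>123.45</MyPrice> -->
<xs:element name="MyTimestamp" type="xs:dateTime"/>  <!-- <MyTimestamp>2010-06-15T13:45:00Z</MyTimestamp> -->
<xs:element name="MyBlob" type="xs:hexBinary"/>      <!-- <MyBlob>0FB7</MyBlob> -->
```
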
To specify an element that uses a type defined in your schema, you would do:
        <xs:element name="MyElement" type="MyType"/>
You can also define your own types locally to the xs:element definition. We will return to this later in 11 - Defining Elements (Part 2) - Defining Elements using Local Types.

7 - Specifying How Many Times an Element Can Appear

Depending on your requirements, an element may be mandatory, optional, or be able to appear many times. The minOccurs and maxOccurs attributes in an xs:element definition specify this. (Not surprisingly, minOccurs specifies the minimum number of times an element can appear and maxOccurs specifies the maximum number of times an element can appear.)

minOccurs can be assigned any non-negative integer value (e.g. 0, 1, 2, 3... etc.), and maxOccurs can be assigned any non-negative integer value or the string constant "unbounded".

The default value of both minOccurs and maxOccurs is 1.

So if both the minOccurs and maxOccurs attributes are absent, as shown in the following snippet, the element can appear once and once only:

        <xs:element name="MyElement" type="MyType"/>

To specify that an element is optional (i.e. it may or may not be present in the XML instance), do:

        <xs:element name="MyElement" type="MyType" minOccurs="0"/>
To specify that an element can be absent, but can also appear an unlimited number of times, do:
        <xs:element name="MyElement" type="MyType" 
                    minOccurs="0" maxOccurs="unbounded"/>

To specify that an element must appear at least once, but may also appear many times do:

        <xs:element name="MyElement" type="MyType" maxOccurs="unbounded"/>
Naturally it is OK to include the minOccurs and maxOccurs attributes with their default values, e.g.:
        <xs:element name="MyElement" type="MyType" 
                    minOccurs="1" maxOccurs="1"/>
Or, with only minOccurs left at its default value:
        <xs:element name="MyElement" type="MyType" 
                    minOccurs="1" maxOccurs="10"/>

For more specific constraints on the occurrence, you can do things like:

        <xs:element name="MyElement" type="MyType" 
                    minOccurs="0" maxOccurs="10"/>
and:
        <xs:element name="MyElement" type="MyType" 
                    minOccurs="8" maxOccurs="27"/>
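
Note that minOccurs and maxOccurs are only permitted on element definitions nested inside a type (for example within an xs:sequence, as described in 9.2 - Defining Types that Contain Multiple Elements); they cannot be used on elements defined directly under xs:schema. As a rough sketch, with MyListType and MyItem being made-up names for illustration:

```xml
<xs:complexType name="MyListType">
    <xs:sequence>
        <!-- minOccurs defaults to 1, so one to three MyItem children are allowed -->
        <xs:element name="MyItem" type="xs:string" maxOccurs="3"/>
    </xs:sequence>
</xs:complexType>

<!-- Valid content for an element of type MyListType:
         <MyItem>a</MyItem>
         <MyItem>b</MyItem>
     Invalid: no MyItem at all, or four or more MyItem elements. -->
```
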

8 - Specifying How Many Times an Attribute Can Appear

We haven't discussed defining attributes yet, but while we're discussing how many times things can appear, it seems appropriate to address this in relation to attributes now. (We will discuss further details about attributes later.)

Attributes can either be present or absent. A particular attribute cannot appear multiple times within a single element.

The use attribute within an xs:attribute definition specifies whether an attribute may, must, or must not appear.

By default an attribute is optional, and may or may not appear in an XML instance. To specify this, use the form:

    <xs:attribute name="MyAttribute1" type="...simple type..."/>
This is the same as saying:
    <xs:attribute name="MyAttribute1" type="...simple type..."
                  use="optional"/>
To specify that an attribute must be present, use the form:
    <xs:attribute name="MyAttribute1" type="...simple type..."
                  use="required"/>
For completeness, in some cases (for complex type restriction, which we do not cover in this tutorial) it is useful to specify that an attribute must not be present. For this case, use the form:
    <xs:attribute name="MyAttribute1" type="...simple type..."
                  use="prohibited"/>
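
To illustrate the effect on instances, here is a small sketch. The type shown is an assumed example (defining types is covered in 9 - Specifying Your Own Types), and the comments indicate which instances a validator would be expected to accept:

```xml
<xs:complexType name="MyType">
    <xs:attribute name="MyAttribute1" type="xs:int" use="required"/>
    <xs:attribute name="MyAttribute2" type="xs:string"/>  <!-- optional by default -->
</xs:complexType>

<!-- Given <xs:element name="MyElement" type="MyType"/>:
     valid:   <MyElement MyAttribute1="1"/>
     valid:   <MyElement MyAttribute1="1" MyAttribute2="x"/>
     invalid: <MyElement MyAttribute2="x"/>   (required MyAttribute1 is missing) -->
```
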

9 - Specifying Your Own Types

We have already shown in 6 - Defining Elements (Part 1) how to define an element that only contains one of the simple types defined by XML Schema. Such elements appear like this in an XML instance:
        <MyElement>123</MyElement>
and are specified in your schema something like this:
        <xs:element name="MyElement" type="xs:int"/>
In this section we describe how to define your own types that combine the various Schema defined simple types in much the same way as you would combine the C++ built-in types (such as int) into classes, structs and unions.

9.1 - Adding Attributes to a Type with a Simple Type Body

If you want an element to appear in an XML instance as:
        <MyElement MyAttribute1="MyData">123</MyElement>
then you need a schema of the form:
        <xs:complexType name="MyType">
            <xs:simpleContent>
                <xs:extension base="...the simple base type to appear in the body...">
                    <xs:attribute name="MyAttribute1" type="...simple type..."/>
                    <xs:attribute name="MyAttribute2" type="...simple type..."/>
                </xs:extension>
            </xs:simpleContent>
        </xs:complexType>
For example:
        <xs:complexType name="MyType">
            <xs:simpleContent>
                <xs:extension base="xs:int">
                    <xs:attribute name="MyAttribute1" type="xs:string"/>
                    <xs:attribute name="MyAttribute2" type="xs:string"/>
                </xs:extension>
            </xs:simpleContent>
        </xs:complexType>
For the record, this is called a complex type with simple content. (Complex type because it contains multiple data values, and simple content because the content of the element body is a simple type.)

The above example can be interpreted as saying that a new type is to be created that has a body of type xs:int (specified by the base attribute) and that is extended by the addition of attributes to form a complex type with simple content.
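
As an illustration, the fragment below sketches instances for the xs:int example, assuming an element named MyElement has been defined with type MyType:

```xml
<!-- Assumes: <xs:element name="MyElement" type="MyType"/> -->
<MyElement MyAttribute1="abc" MyAttribute2="def">123</MyElement>
<MyElement>456</MyElement>  <!-- also valid: both attributes are optional -->
<!-- invalid: <MyElement MyAttribute1="abc">hello</MyElement>  (body is not an xs:int) -->
```
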

See 8 - Specifying How Many Times an Attribute Can Appear for further information on how to specify how often an attribute can occur.

9.2 - Defining Types that Contain Multiple Elements

To define a type that allows you to define elements of the form:
    <MyElement>
        <MyChildElement1>123</MyChildElement1>
        <MyChildElement2>Text</MyChildElement2>
        <MyChildElement3>123.45</MyChildElement3>
    </MyElement>
Your schema should look like:
        <xs:complexType name="MyType">
            <xs:sequence>
                <xs:element name="MyChildElement1" type="...type..."/>
                <xs:element name="MyChildElement2" type="...type..."/>
                <xs:element name="MyChildElement3" type="...type..."/>
            </xs:sequence>
        </xs:complexType>
Here, the xs:sequence element is the key to specifying that all the elements can appear together. As such, this is often described as a sequence.

This is similar to a struct in C/C++ and is called a complex type with complex content (because the body of the element has multiple elements).

Note that, subject to occurrence constraints, within a sequence the elements must appear in an XML instance in the same order that they appear within the schema definition.

Each of the above elements can also have min and max occurrences defined as described in 7 - Specifying How Many Times an Element Can Appear. For example:

        <xs:complexType name="MyType">
            <xs:sequence>
                <xs:element name="MyChildElement1" type="xs:int" minOccurs="0"/>
                <xs:element name="MyChildElement2" type="xs:string" maxOccurs="10"/>
                <xs:element name="MyChildElement3" type="xs:float"/>
            </xs:sequence>
        </xs:complexType>
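
For instance, assuming an element MyElement of this type, the following instance would be expected to validate. It omits the optional MyChildElement1 and repeats MyChildElement2:

```xml
<MyElement>
    <!-- MyChildElement1 omitted (minOccurs="0") -->
    <MyChildElement2>Text A</MyChildElement2>
    <MyChildElement2>Text B</MyChildElement2>
    <MyChildElement3>123.45</MyChildElement3>
</MyElement>
<!-- Placing MyChildElement3 before MyChildElement2 would be invalid,
     because a sequence fixes the order of its child elements. -->
```
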
If you want to add attributes to your type, so that elements can appear as:
    <MyElement MyAttribute1="123" MyAttribute2="data">
        <MyChildElement1>123</MyChildElement1>
        <MyChildElement2>Text</MyChildElement2>
        <MyChildElement3>123.45</MyChildElement3>
    </MyElement>
use the following style of schema definition:
        <xs:complexType name="MyType">
            <xs:sequence>
                <xs:element name="MyChildElement1" type="...type..."/>
                <xs:element name="MyChildElement2" type="...type..."/>
                <xs:element name="MyChildElement3" type="...type..."/>
            </xs:sequence>
            <xs:attribute name="MyAttribute1" type="...type..."/>
            <xs:attribute name="MyAttribute2" type="...type..."/>
        </xs:complexType>
The number of times each attribute can appear can be specified as described in 8 - Specifying How Many Times an Attribute Can Appear. This might yield something like:
        <xs:complexType name="MyType">
            <xs:sequence>
                <xs:element name="MyChildElement1" type="xs:int" minOccurs="0"/>
                <xs:element name="MyChildElement2" type="xs:string" maxOccurs="10"/>
                <xs:element name="MyChildElement3" type="xs:float"/>
            </xs:sequence>
            <xs:attribute name="MyAttribute1" type="xs:int" use="required"/>
            <xs:attribute name="MyAttribute2" type="xs:string"/>
        </xs:complexType>

9.3 - Defining Types that Contain One of a Selection of Elements

You may wish to define a type that has multiple possible child elements defined, but where only one of the specified child elements is allowed to be used at a time in any one instance of the type. This is similar to a union in C/C++ (although in the case of schema, the name of the selected value is recorded as well as the value itself; thus it is sometimes called a distinguished union).

For example, you may want your XML instance to appear as:

    <MyElement>
        <MyChildElement1>123</MyChildElement1>
    </MyElement>
Or:
    <MyElement>
        <MyChildElement2>Text</MyChildElement2>
    </MyElement>
but NEVER:
    <MyElement>
        <MyChildElement1>123</MyChildElement1>
        <MyChildElement2>Text</MyChildElement2>
    </MyElement>
In this case your schema should look like:
        <xs:complexType name="MyType">
            <xs:choice>
                <xs:element name="MyChildElement1" type="...type..."/>
                <xs:element name="MyChildElement2" type="...type..."/>
            </xs:choice>
        </xs:complexType>
The main difference between this and the case where multiple elements may appear is the presence of the xs:choice element. As you might expect, this construct is often called a choice.

As with the multiple element form, the child elements can have their occurrences specified as described in 7 - Specifying How Many Times an Element Can Appear, to give you a schema snippet of the form:

        <xs:complexType name="MyType">
            <xs:choice>
                <xs:element name="MyChildElement1" type="xs:int" minOccurs="0"/>
                <xs:element name="MyChildElement2" type="xs:string" maxOccurs="10"/>
            </xs:choice>
        </xs:complexType>
If this is done, you might end up with an instance that looks like:
    <MyElement>
        <MyChildElement2>ABC</MyChildElement2>
        <MyChildElement2>DEF</MyChildElement2>
        <MyChildElement2>GHI</MyChildElement2>
    </MyElement>
Or even (when an absent MyChildElement1 is chosen):
    <MyElement>
    </MyElement>
Again, you can add attributes to this, to get XML instances of the form:
    <MyElement MyAttribute1="123" MyAttribute2="data">
        <MyChildElement1>123</MyChildElement1>
    </MyElement>
by using the following style of schema definition:
        <xs:complexType name="MyType">
            <xs:choice>
                <xs:element name="MyChildElement1" type="...type..."/>
                <xs:element name="MyChildElement2" type="...type..."/>
                <xs:element name="MyChildElement3" type="...type..."/>
            </xs:choice>
            <xs:attribute name="MyAttribute1" type="...type..."/>
            <xs:attribute name="MyAttribute2" type="...type..."/>
        </xs:complexType>

9.4 - Specifying an Empty Element

You may wish to specify an empty element of the form:
    <MyElement/>
There is no explicit way to do this in XML Schema. Instead, the usual idiom is to define a complex type that has no child elements. Thus a schema definition for this is:
        <xs:complexType name="MyEmptyType">
        </xs:complexType>
Or equivalently:
        <xs:complexType name="MyEmptyType"/>
This is basically a stripped down version of the sequence definition described in 9.2 - Defining Types that Contain Multiple Elements. Observe however that even the xs:sequence part of the sequence definition is not required.
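
A minimal sketch of putting this type to use, assuming an element named MyElement:

```xml
<xs:element name="MyElement" type="MyEmptyType"/>

<xs:complexType name="MyEmptyType"/>

<!-- valid:   <MyElement/>   or   <MyElement></MyElement>
     invalid: <MyElement>text</MyElement>
     invalid: <MyElement><MyChild/></MyElement> -->
```
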

You can of course add attributes to this type, that would allow instances of the form:

    <MyElement MyAttribute1="123" MyAttribute2="data"/>
As you might expect from the previous material, this can be done as follows:
        <xs:complexType name="MyType">
            <xs:attribute name="MyAttribute1" type="...type..."/>
            <xs:attribute name="MyAttribute2" type="...type..."/>
        </xs:complexType>

10 - Defining More Restricted Simple Types

The XML Schema specification defines a number of simple types. However, in a number of cases you may wish to use more restricted simple types. For example, you may wish to limit the range of an integer to specific values, and restrict the length of a string. This section briefly describes how to do that.

The simple types defined in XML Schema can be further restricted using what are called facets. The XML Schema specification defines which facets can be applied to each simple type.

The XML Schema facets are: xs:length, xs:minLength, xs:maxLength, xs:pattern, xs:enumeration, xs:whiteSpace, xs:maxInclusive, xs:maxExclusive, xs:minExclusive, xs:minInclusive, xs:totalDigits, xs:fractionDigits. We will only explain a few of these facets here. The general form for restricting a simple type is:

        <xs:simpleType name="MyNewType">
            <xs:restriction base="...type being restricted...">
                ...restricting facets...
            </xs:restriction>
        </xs:simpleType>
Here, the type specified by the base attribute is the initial type from which the new type is being derived (by restriction).

Perhaps the most useful types to restrict are the integers and xs:string.

The most common restriction of an integer is to limit the range of valid values it can take. This can be done using the xs:maxInclusive, xs:maxExclusive, xs:minExclusive and xs:minInclusive facets. For example, to limit an integer to the range 0 <= x <= 100, you can do:

        <xs:simpleType name="My0to100IntType">
            <xs:restriction base="xs:int">
                <xs:minInclusive value="0"/>
                <xs:maxInclusive value="100"/>
            </xs:restriction>
        </xs:simpleType>
Or, to limit an integer to the range 0 < x < 100, you can do:
        <xs:simpleType name="MyEx0to100IntType">
            <xs:restriction base="xs:int">
                <xs:minExclusive value="0"/>
                <xs:maxExclusive value="100"/>
            </xs:restriction>
        </xs:simpleType>

Note that it is legal to restrict a type that has already been restricted. For example, you may choose to define a type that is a restriction of xs:int to the range 0 to 100. You could then define a further type that restricts this new type to the range 0 to 10. This could be done as follows:

        <xs:simpleType name="My0to100IntType">
            <xs:restriction base="xs:int">
                <xs:minInclusive value="0"/>
                <xs:maxInclusive value="100"/>
            </xs:restriction>
        </xs:simpleType>

        <xs:simpleType name="My0to10IntType">
            <xs:restriction base="My0to100IntType">
                <xs:maxInclusive value="10"/>
            </xs:restriction>
        </xs:simpleType>
Note that the order of definitions in XML schema is not important, so this could equally be written as:
        <xs:simpleType name="My0to10IntType">
            <xs:restriction base="My0to100IntType">
                <xs:maxInclusive value="10"/>
            </xs:restriction>
        </xs:simpleType>

        <xs:simpleType name="My0to100IntType">
            <xs:restriction base="xs:int">
                <xs:minInclusive value="0"/>
                <xs:maxInclusive value="100"/>
            </xs:restriction>
        </xs:simpleType>
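
To illustrate, the fragment below sketches how instance values would be expected to fare against My0to10IntType, assuming an element named MyElement of that type. Note that the lower bound is inherited from My0to100IntType:

```xml
<!-- Assumes: <xs:element name="MyElement" type="My0to10IntType"/> -->
<MyElement>0</MyElement>    <!-- valid: minInclusive of 0 is inherited -->
<MyElement>10</MyElement>   <!-- valid -->
<!-- invalid: <MyElement>11</MyElement>  (exceeds maxInclusive of 10) -->
<!-- invalid: <MyElement>-1</MyElement>  (below the inherited minInclusive of 0) -->
```
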
The most common restrictions applied to strings are on their length and on the patterns they are allowed to match. Restricting a string may also be used to define an enumerated value.

To restrict the length of a string, you can do:

        <xs:simpleType name="My0to100CharStringType">
            <xs:restriction base="xs:string">
                <xs:minLength value="0"/>
                <xs:maxLength value="100"/>
            </xs:restriction>
        </xs:simpleType>
To limit the valid patterns of a string, you can do:
        <xs:simpleType name="MyPatternCharStringType">
            <xs:restriction base="xs:string">
                <xs:pattern value="ABC\d+"/>
            </xs:restriction>
        </xs:simpleType>
Here, the value of the value attribute contains a Perl-like regular expression. The main difference is that the specified regular expression is implicitly anchored to the beginning and end of the string being matched, so that the whole XML instance value must match the pattern, not just a fragment of it. Put another way, if converted to Perl, the pattern above would become ^ABC\d+$.
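
The implicit anchoring can be seen in the following sketch of instance values, assuming an element named MyElement of type MyPatternCharStringType:

```xml
<!-- Assumes: <xs:element name="MyElement" type="MyPatternCharStringType"/> -->
<MyElement>ABC1</MyElement>      <!-- valid -->
<MyElement>ABC2010</MyElement>   <!-- valid -->
<!-- invalid: <MyElement>ABC</MyElement>    (\d+ needs at least one digit) -->
<!-- invalid: <MyElement>xABC1</MyElement>  (the whole value must match) -->
```
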

Multiple facets can be applied to a type at the same time, so you can do:

        <xs:simpleType name="MyPattern0to10CharStringType">
            <xs:restriction base="xs:string">
                <xs:minLength value="0"/>
                <xs:maxLength value="10"/>
                <xs:pattern value="ABC\d+"/>
            </xs:restriction>
        </xs:simpleType>
In this case, a value in an XML instance must satisfy all of the constraints that the facets apply to the type.
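
As a sketch of how the facets combine, assuming an element named MyElement of type MyPattern0to10CharStringType:

```xml
<!-- Assumes: <xs:element name="MyElement" type="MyPattern0to10CharStringType"/> -->
<MyElement>ABC1234567</MyElement>   <!-- valid: 10 characters and matches ABC\d+ -->
<!-- invalid: <MyElement>ABC12345678</MyElement>  (matches the pattern, but has 11 characters) -->
<!-- invalid: <MyElement>ABCDEFG</MyElement>      (short enough, but no digits after ABC) -->
```
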

Another common case of applying restrictions to strings is forming enumerations. This can be done as:

        <xs:simpleType name="MyFavoriteColorType">
            <xs:restriction base="xs:string">
                <xs:enumeration value="Red"/>
                <xs:enumeration value="Blue"/>
            </xs:restriction>
        </xs:simpleType>
With the above example, the string is only allowed to be "Red" or "Blue". For example, if you have an element definition of:
         <xs:element name="MyElement" type="MyFavoriteColorType"/>
The only valid instances of this are:
         <MyElement>Red</MyElement>
or:
         <MyElement>Blue</MyElement>
Note that with enumerations it is very important to consider what will happen when you want to upgrade your schema to the next version. Using the form above, it would be illegal to add an additional enumeration value in a subsequent version of the schema. (Check for other tutorials in our series for how this might be made possible.) Thus it is important to make sure that when using the definition method described here, the enumerated values are a closed set that will provably not need extending at a later date.

10.1 - Same Simple Type, Different Name

Occasionally it is desirable to have types that have the same properties, but have different names. For example, a session identifier type may just be a regular xs:int. There is no special method for declaring this in Schema. Instead, you need to use the restriction method described above, but not actually add any restrictions! For example:
         <xs:simpleType name="SessionIdentifierType">
             <xs:restriction base="xs:int"/>
         </xs:simpleType>
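
The renamed type is then used like any other. A brief sketch, in which the element name MySessionId is invented for illustration:

```xml
<xs:element name="MySessionId" type="SessionIdentifierType"/>
<!-- instance: <MySessionId>42</MySessionId> -->
```
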

11 - Defining Elements (Part 2) - Defining Elements using Local Types

For simplicity we have chosen to separate definition of elements from the definition of types. Element definitions are then linked to type definitions using the form:
        <xs:element name="MyElement" type="...Type..."/>
In general, this is a good approach.

However, you may sometimes wish to combine a type definition with an element definition. A type defined in this way is called a local type.

To define a local type you basically remove the type attribute from the element definition, remove the name attribute from the type definition, and move the type definition into the body of the element definition. For example, instead of:

        <xs:element name="MyElement" type="MyFavoriteColorType"/>

        <xs:simpleType name="MyFavoriteColorType">
            <xs:restriction base="xs:string">
                <xs:enumeration value="Red"/>
                <xs:enumeration value="Blue"/>
            </xs:restriction>
        </xs:simpleType>
You use:
        <xs:element name="MyElement">
            <xs:simpleType>
                <xs:restriction base="xs:string">
                    <xs:enumeration value="Red"/>
                    <xs:enumeration value="Blue"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:element>
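Both forms should accept the same XML instances. Only the organization of the schema differs. As a rough illustration, and ignoring namespace declarations, the following instance would be valid against either form:
        <MyElement>Red</MyElement>
whereas the following would be rejected, because Green is not one of the enumerated values:
        <MyElement>Green</MyElement>
One consequence of the local form is that the type no longer has a name, so other element definitions cannot refer to it.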

[Top] [Contents]

12 - Documenting Your Schema

Typically you'll want to add some documentation to your Schema. XML Schema allows a couple of ways to do this. The first method you can use is standard XML comments, such as:
         <!-- This Element is important! -->
         <xs:element name="MyElement" type="MyType"/>
Such comments are easy to add, but are not strictly associated with the items they document.

XML Schema includes another way of documenting schema components by using the xs:annotation and xs:documentation elements. These elements can only appear at certain places within a schema, typically as the first element after a main 'keyword' such as xs:schema, xs:element, xs:attribute, xs:complexType, and xs:simpleType. For example:

        <xs:element name="MyElement" type="MyType">
            <xs:annotation><xs:documentation>
                Some useful information.
            </xs:documentation></xs:annotation>
        </xs:element>
        
        <xs:attribute name="MyAttribute" type="MyType">
            <xs:annotation><xs:documentation>
                Some useful information.
            </xs:documentation></xs:annotation>
        </xs:attribute>
        
        <xs:complexType name="MyType">
            <xs:annotation><xs:documentation>
                Some useful information.
            </xs:documentation></xs:annotation>
            ...
        </xs:complexType>
        
        <xs:simpleType name="MyType">
            <xs:annotation><xs:documentation>
                Some useful information.
            </xs:documentation></xs:annotation>
            ...
        </xs:simpleType>
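The examples above do not show an annotation on xs:schema itself. A minimal sketch of one follows, assuming the preamble from 5 - Schema Preamble. The element and the documentation text are placeholders. The optional xml:lang attribute on xs:documentation records the language the text is written in:
        <xs:schema targetNamespace="http://MyDomain.com/MySchema.xsd"
                        xmlns="http://MyDomain.com/MySchema.xsd"
                        xmlns:xs="http://www.w3.org/2001/XMLSchema"
                        elementFormDefault="qualified" >
            <!-- Documents the schema as a whole -->
            <xs:annotation><xs:documentation xml:lang="en">
                Some useful information about the whole schema.
            </xs:documentation></xs:annotation>

            <!-- Placeholder element so the sketch is complete -->
            <xs:element name="MyElement" type="xs:string"/>
        </xs:schema>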

[Top] [Contents]

13 - Putting It All Together

We mentioned in 5 - Schema Preamble that a schema has the form:
    <xs:schema targetNamespace="http://MyDomain.com/MySchema.xsd"
                    xmlns="http://MyDomain.com/MySchema.xsd"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    elementFormDefault="qualified" >
    ...element definitions...
    ...type definitions...
    </xs:schema>
Now that more about the nature of element and type definitions has been covered, we can present a more complete example of a schema. For this we reproduce the example given earlier.
    <?xml version="1.0" encoding="utf-8" ?> 
    <xs:schema targetNamespace="http://xml2cpp.com/config.xsd"
                    xmlns="http://xml2cpp.com/config.xsd"
                    xmlns:xs="http://www.w3.org/2001/XMLSchema"
                    elementFormDefault="qualified" >

        <!-- The basic element definition -->
        <xs:element name="config" type="ConfigType"/>
        
        <!-- A complex type with complex content                 -->
        <!--     e.g. XML instance of the form                   -->
        <!--          <MyElement>                                -->
        <!--              <MyChildElement1>123</MyChildElement1> -->
        <!--              <MyChildElement2>ABC</MyChildElement2> -->
        <!--          </MyElement>                               -->
        <xs:complexType name="ConfigType">
            <xs:sequence>
                <xs:element name="userName" type="xs:string" minOccurs="0"/>
                <xs:element name="maxRecentProjects" type="Int1To10Type"/>
                <xs:element name="recentProject" type="xs:string" 
                                            minOccurs="0" maxOccurs="10"/>
                <xs:element name="childWindow" type="WindowType" 
                                            minOccurs="0" maxOccurs="unbounded"/>
                <xs:element name="dictionary" type="DictionaryFileType" 
                                            minOccurs="0"/>
            </xs:sequence>
        </xs:complexType>
        
        <!-- Definition of a restricted simple type -->
        <xs:simpleType name="Int1To10Type">
            <xs:restriction base="xs:int">
                <xs:minInclusive value="1"/>
                <xs:maxInclusive value="10"/>
            </xs:restriction>
        </xs:simpleType>

        <!-- Definition of an empty element with attributes -->
        <xs:complexType name="WindowType">
            <xs:attribute name="name" type="xs:string" use="required"/>
            <xs:attribute name="width" type="xs:unsignedInt" use="required"/>
            <xs:attribute name="height" type="xs:unsignedInt" use="required"/>
        </xs:complexType>
        
        <!-- A complex type with simple content                 -->
        <!--    e.g. XML instance of the form                   -->
        <!--         <MyElement MyAttribute="1">ABC</MyElement> -->
        <xs:complexType name="DictionaryFileType">
            <xs:simpleContent>
                <xs:annotation><xs:documentation>
                The base type is the name of the dictionary file.
                </xs:documentation></xs:annotation>
                <xs:extension base="xs:string">
                    <xs:attribute name="language" type="xs:string"/>
                </xs:extension>
            </xs:simpleContent>
        </xs:complexType>
        
    </xs:schema>
As you can see, we've made the element definitions and type definitions section more complete here.

Note that we have only included a single global element here. A global element corresponds to an xs:element definition that is a direct child (not a grandchild, etc.) of the xs:schema element. Any element defined by a global element is allowed to be the top-level element in a valid XML instance. Normally only one possible top-level element is wanted in an XML instance, and this constraint is imposed in the schema by having only one global element definition.
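
To make the schema above more concrete, here is a sketch of an XML instance that should validate against it. All of the values (the user name, file names and window names) are invented for illustration. The config element is the only permitted top-level element, and the child elements appear in the order given by the xs:sequence:
    <?xml version="1.0" encoding="utf-8" ?>
    <config xmlns="http://xml2cpp.com/config.xsd">
        <!-- Optional (minOccurs="0") -->
        <userName>Alice</userName>
        <!-- Must be in the range 1 to 10 (Int1To10Type) -->
        <maxRecentProjects>5</maxRecentProjects>
        <!-- Between 0 and 10 occurrences -->
        <recentProject>first.prj</recentProject>
        <recentProject>second.prj</recentProject>
        <!-- Empty elements with three required attributes; any number allowed -->
        <childWindow name="editor" width="800" height="600"/>
        <childWindow name="output" width="800" height="200"/>
        <!-- Simple content (the file name) plus an optional attribute -->
        <dictionary language="en">english.dic</dictionary>
    </config>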

[Top] [Contents]

14 - Simple Steps to Writing a Schema

Creating your first schema can be a bit daunting. It helps to have a schema-aware editor, but any text editor will suffice. If you have access to it, the text mode of the XSD Schema editor included as part of Microsoft Visual Studio should suffice for most of your initial needs. The IntelliSense of this schema editor is schema-aware, which can help you arrive at a valid schema.

Having selected an editor, follow these steps to create your first schema:

  1. If the outermost element of your XML instance is to be a sequence (as described in 9.2 - Defining Types that Contain Multiple Elements), copy the following into your editor:
        <xs:schema targetNamespace="http://MyDomain.com/MySchema.xsd"
                        xmlns="http://MyDomain.com/MySchema.xsd"
                        xmlns:xs="http://www.w3.org/2001/XMLSchema"
                        elementFormDefault="qualified" >
    
            <xs:element name="MyGlobalElement" type="MyGlobalElementType"/>
    
            <xs:complexType name="MyGlobalElementType">
                <xs:sequence>
                    <xs:element name="MyFirstElement" 
                                type="...TODO: add the type..."/>
                    
                    <!-- ...TODO: Add additional elements as required... -->
                    
                </xs:sequence>
            </xs:complexType>
            
            <!-- ...TODO: Add additional referenced type definitions... -->
    
        </xs:schema>
    
  2. If the outermost element of your XML instance is to be a choice (as described in 9.3 - Defining Types that Contain One of a Selection of Elements), copy the following into your editor:
        <xs:schema targetNamespace="http://MyDomain.com/MySchema.xsd"
                        xmlns="http://MyDomain.com/MySchema.xsd"
                        xmlns:xs="http://www.w3.org/2001/XMLSchema"
                        elementFormDefault="qualified" >
    
            <xs:element name="MyGlobalElement" type="MyGlobalElementType"/>
    
            <xs:complexType name="MyGlobalElementType">
                <xs:choice>
                    <xs:element name="MyFirstElement" 
                                type="...TODO: add the type..."/>
                    
                    <!-- ...TODO: Add additional elements as required... -->
                    
                </xs:choice>
            </xs:complexType>
            
            <!-- ...TODO: Add additional referenced type definitions... -->
    
        </xs:schema>
    
  3. Customize the names beginning with 'My...' to your needs.
  4. Add additional element definitions to the 'MyGlobalElementType' as required.
  5. Add additional types to the referenced types section as required.
  6. Save the schema to a file.
  7. Compile the schema using LMX.
  8. Integrate the generated code into your project!
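
As a rough illustration of steps 1 and 3 to 5, the sequence template might end up looking something like the following once it has been customized. The namespace, the element names and the ContactMethodType type are all hypothetical, chosen only for this sketch:
        <?xml version="1.0" encoding="utf-8" ?>
        <xs:schema targetNamespace="http://example.com/contact.xsd"
                        xmlns="http://example.com/contact.xsd"
                        xmlns:xs="http://www.w3.org/2001/XMLSchema"
                        elementFormDefault="qualified" >

            <!-- Step 3: 'MyGlobalElement' and its type renamed -->
            <xs:element name="contact" type="ContactType"/>

            <xs:complexType name="ContactType">
                <xs:sequence>
                    <xs:element name="name" type="xs:string"/>

                    <!-- Step 4: additional elements -->
                    <xs:element name="email" type="xs:string"
                                minOccurs="0" maxOccurs="unbounded"/>
                    <xs:element name="preferredContact"
                                type="ContactMethodType"/>
                </xs:sequence>
            </xs:complexType>

            <!-- Step 5: an additional referenced type -->
            <xs:simpleType name="ContactMethodType">
                <xs:restriction base="xs:string">
                    <xs:enumeration value="Email"/>
                    <xs:enumeration value="Phone"/>
                </xs:restriction>
            </xs:simpleType>

        </xs:schema>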

[Top] [Contents]

15 - Experimenting with Schemas

One of the best ways to learn something is to play around with it. For this reason you may find the LMX evaluation support suite a useful tool to help you learn more about XML Schema. This tool contains a number of example schemas and corresponding XML instances. You can also modify the schemas and instances to see how this affects the results.

The LMX evaluation support suite is included as part of the main LMX XML to C++ code generator download (which it needs to operate). This can be downloaded from:

The suite operates by converting the various schemas into C++ code and using the XML instances as input to the programs built from that C++ code.

To use the suite fully you will need access to a C++ compiler. To interface to the C++ compiler correctly, you may need to modify some of the batch files that the evaluation support suite uses, as described in the section titled "Before You Run the Evaluation" in the tool's help file.

[Top] [Contents]

16 - Things We've Not Covered

We have so far presented the basics of XML Schema. It is hoped that the subset chosen will serve the bulk of your initial schema requirements. We also hope that by first getting familiar with the techniques described above, you will have a strong foundation for understanding the more advanced aspects of XML Schema.

In case you're wondering, what we haven't covered includes using xs:import to import elements from other namespaces (using ref=), xs:include to include one schema within another, xs:redefine to redefine types from one schema to another, extending and restricting complex types, specifying mixed content, xs:union, xs:list, anonymous complex types, groups and attributeGroups, substitution groups, and abstract elements and abstract types, to name a few.

[Top] [Contents]
