XML Syntax 2
Rajesh Math
10/17/2012
All elements have a parent element except for the root element.
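All elements have a start and an end tag (except for empty elements), e.g.: <friend>George</friend>
Element names must start with a letter or underscore, and may also contain digits, underscore, period (.) or hyphen (-). Element names cannot start with the string xml in any case combination (names beginning with xml are reserved).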
Attributes: Elements may have attributes associated with them. Attribute names follow
the element naming rules. Attribute values must be enclosed in quotes; XML accepts either double quotes (") or single quotes (').
Nested elements: Elements can be nested within other elements. <A><B></B></A> is allowed
No overlapping tags: XML elements must be properly nested. <A><B></A></B> is not allowed.
The XML declaration is the first line of the document. It identifies the document as an XML document and specifies the XML version being used and the character encoding:
<?xml version="1.0" encoding="UTF-8"?>
Built-in entity references:
&lt; < left angle bracket
&gt; > right angle bracket
&apos; ' apostrophe
&quot; " double quotation mark
&amp; & ampersand
Attributes: the attribute value must be supplied if the attribute is used, and the value must be quoted.
<myTag MyAttrib="x"> ... </myTag>
<document versionNo="1.4"> ... </document>
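The nesting rule and the built-in entity references can be checked with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the <note> element are made up for illustration and are not part of the lecture.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts `text` as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Properly nested elements are accepted; overlapping tags are rejected.
nested = is_well_formed("<A><B></B></A>")
overlapping = is_well_formed("<A><B></A></B>")

# The five built-in entity references are expanded by the parser.
# The <note> element is a hypothetical example.
note = ET.fromstring(
    "<note>&lt;tag&gt; &amp; &quot;text&quot; &apos;more&apos;</note>"
)
decoded = note.text
```

Here nested is True, overlapping is False, and decoded holds the text with the entity references replaced by the characters they stand for.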
Schema: Another method for defining the structure and rules for an XML document. A schema gives a tighter definition of the elements and their allowed values, as well as the order in which tags may be nested.
XSL (eXtensible Stylesheet Language): A markup language that allows you to describe a set of rules for translating one XML document into another XML document.
XSLT (XSL Transformations): The part of XSL used to accomplish the transformation. XSLT processors are available as APIs and as utility programs that take an XML document and an XSL file and write the transformed result to a new file.
DOM: Used by browsers for HTML. DOM parsers are available as libraries for Java, Perl, C++ and many other languages. Uses a tree representation of documents, see above. Very memory and CPU intensive for large documents.
documents. Sequentially processes the document from start to end. Useful for extracting single items from the document.
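The contrast between a tree-building parser and a sequential one can be sketched in Python. This example assumes that xml.sax from the standard library is a fair stand-in for the sequential style described above; the sample document, the name Asha and the FirstFriend handler are invented for illustration.

```python
import xml.dom.minidom
import xml.sax

# Hypothetical sample document.
DOC = "<friends><friend>George</friend><friend>Asha</friend></friends>"

# Tree approach: the whole document is loaded into memory, then queried.
dom = xml.dom.minidom.parseString(DOC)
dom_names = [node.firstChild.data for node in dom.getElementsByTagName("friend")]

class FirstFriend(xml.sax.ContentHandler):
    """Sequential approach: react to events and keep only the first <friend>."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.name = None
        self._parts = []

    def startElement(self, tag, attrs):
        # Only start collecting text if no name has been captured yet.
        if tag == "friend" and self.name is None:
            self.inside = True

    def characters(self, content):
        if self.inside:
            self._parts.append(content)

    def endElement(self, tag):
        if tag == "friend" and self.inside:
            self.name = "".join(self._parts)
            self.inside = False

handler = FirstFriend()
xml.sax.parseString(DOC.encode("utf-8"), handler)
first_name = handler.name
```

The DOM version holds every node in memory at once, while the handler keeps only the single item it is looking for, which is why the sequential style suits single-item extraction.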
XML syntax
XML consists of ELEMENTS
Elements have content (except for special empty elements): <friend>George</friend>
Elements are represented using tags, and each start tag has a corresponding end tag unless the element is empty, such as: <student id="40123721" grade="A" />
Attributes, Comments
Attribute values must be present, and must be quoted. For example, in HTML we could get away with: <hr noshade>. This is not legal in XHTML, where it must be written <hr noshade="noshade"/>. Similarly, in XML:
<myTag myAttrib="1.6"> ..... </myTag>
NB: An open issue in XML design is whether a particular entity should be modelled as an element of its own or as an attribute of an existing element. The general rule of thumb is that elements should be thought of as containers (which are understood to have contents) and attributes as characteristics of the element.
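The element-versus-attribute choice can be illustrated by modelling the same data both ways. This is only a sketch using Python's xml.etree.ElementTree; the element-style layout with <id> and <grade> children is an assumption made for comparison, and the parses helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute style: id and grade are characteristics of the student.
as_attributes = ET.fromstring('<student id="40123721" grade="A" />')
attr_id = as_attributes.get("id")
attr_grade = as_attributes.attrib["grade"]

# Element style (hypothetical layout): the student is a container with contents.
as_elements = ET.fromstring(
    "<student><id>40123721</id><grade>A</grade></student>"
)
elem_id = as_elements.findtext("id")
elem_grade = as_elements.findtext("grade")

def parses(text):
    """Return True if `text` is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# HTML-style shortcuts are rejected: the value is missing or unquoted.
minimized_ok = parses("<hr noshade/>")
unquoted_ok = parses("<myTag myAttrib=1.6></myTag>")
quoted_ok = parses('<hr noshade="noshade"/>')
```

Both styles carry the same information; the attribute form is more compact, while the element form leaves room for nested content later.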
7 SICSR XML - Lecture 1 10/17/2012
CData
The content of a CDATA section is not treated as markup. Typically used to include data that will be used by another application, e.g. JavaScript. The syntax is a little messy, and looks like:
<![CDATA[
<script language="javascript" type="text/javascript">
var name="fred"; var x = 3.0; var y = 4.0;
if ( x < y ) document.write("x is less than y");
</script>
]]>
Note the strange use of the square brackets, <![CDATA[ ... ]]>; no spaces are allowed inside the opening <![CDATA[ delimiter. In XHTML the above example would be written:
<script language="javascript" type="text/javascript">
<![CDATA[
var name="fred"; var x = 3.0; var y = 4.0;
if ( x < y ) document.write("x is less than y");
]]>
</script>
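A short Python sketch shows the effect of a CDATA section on parsing. It reuses the XHTML-style script from the slide (shortened), and assumes Python's xml.etree.ElementTree as the parser; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# With CDATA, the "<" in the JavaScript comparison is plain character data.
with_cdata = """<script type="text/javascript">
<![CDATA[
var x = 3.0; var y = 4.0;
if ( x < y ) document.write("x is less than y");
]]>
</script>"""
script = ET.fromstring(with_cdata)
code = script.text  # the CDATA markers themselves are not part of the text

# Without CDATA, the same "<" is read as the start of a tag and parsing fails.
without_cdata = "<script>if ( x < y ) document.write('x is less than y');</script>"
try:
    ET.fromstring(without_cdata)
    bare_parse_ok = True
except ET.ParseError:
    bare_parse_ok = False
```

The parser hands the JavaScript through untouched in the first case and rejects the document in the second.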
Processing Instructions
An XML file can also contain processing instructions that give commands or information to an application that is processing the XML data. Processing instructions look rather like the lines in the prolog:
<?target instructions?>
where target is the name of the application and instructions is a string of text which is passed to it, e.g.:
<?xml-stylesheet type="text/xsl" href="weather.xsl"?>
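A parser exposes each processing instruction as a target plus an instruction string. The sketch below uses Python's xml.dom.minidom to read the stylesheet instruction from the slide; the <weather/> root element is an assumption added so that the document is complete.

```python
import xml.dom.minidom
from xml.dom import Node

# The <weather/> root element is hypothetical; the PI is the one from the slide.
doc = xml.dom.minidom.parseString(
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" href="weather.xsl"?>'
    "<weather/>"
)

# Collect (target, instructions) for every processing instruction at document level.
instructions = [
    (node.target, node.data)
    for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]
```

Note that the XML declaration on the first line is not reported as a processing instruction, even though it looks similar.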
Namespace
Similar concept to scope rules for variables in programming. For example, in Java, the keyword "this" is used as a prefix to refer to an instance variable, to avoid confusion with a local variable of the same name. Another example: in Perl, the keywords "my" and "local" provide fine-grained control over variable scope. Uniform Resource Identifiers (URIs) are used to uniquely identify a namespace.
URI
Uniform Resource Identifier -- can be a
URL or a URN
URL
Uniform Resource Locator
URN
Uniform Resource Name
URN Syntax
The syntax is similar to that of a URL. All URNs have the following syntax:
<URN> ::= "urn:" <NID> ":" <NSS>
where <NID> is the Namespace Identifier, and <NSS> is the Namespace Specific String.
A namespace can be declared for any XML document type (custom markup language). The namespace is identified using a unique URN or URL.
IMPORTANT: The URI need not physically exist. It is only being used as a means of uniquely identifying a document definition.
A namespace is a conceptual zone in which all names are unique.
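The URN grammar and the role of the URI as a pure identifier can be sketched in Python. The namespace urn:example:weather, the w: prefix, the element names and the split_urn helper are all invented for illustration; the parser never tries to fetch the URI.

```python
import xml.etree.ElementTree as ET

def split_urn(urn):
    """Split "urn:<NID>:<NSS>" into (NID, NSS); hypothetical helper."""
    prefix, nid, nss = urn.split(":", 2)
    if prefix.lower() != "urn" or not nid or not nss:
        raise ValueError("not a URN: " + urn)
    return nid, nss

# Hypothetical namespace URN; it does not need to exist anywhere.
WEATHER_NS = "urn:example:weather"
nid, nss = split_urn(WEATHER_NS)

# The prefix w: is bound to the namespace with an xmlns declaration.
doc = ET.fromstring(
    '<w:report xmlns:w="urn:example:weather">'
    "<w:city>Pune</w:city>"
    "</w:report>"
)

# ElementTree identifies each element by namespace URI plus local name.
root_tag = doc.tag
city = doc.find("{" + WEATHER_NS + "}city").text
```

The parser qualifies every name with the URI rather than the prefix, so two vocabularies can both use an element called city without clashing.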