Last week, I had a lecture about the fundamentals of XML and JSON. I expected another long explanation of things I was already well acquainted with. However, I learned a few things that others may find useful. Let’s talk about it! ☕

What is XML?

XML stands for Extensible Markup Language. It’s a markup language that specifies a set of rules for encoding documents in a format that is both human-readable and machine-readable. As the name suggests, it’s in the same vein as HTML, but the two aren’t interchangeable: HTML tolerates things XML forbids, such as unclosed tags and unquoted attribute values, so an HTML file is not automatically a valid XML file (XHTML is the variant of HTML that follows XML’s rules). You might reach for this type of file when looking for a way to store configs, transmit data or simply write “html-like” documents. Nowadays, some might say that JSON took its place, but I think it’s still important to know some basics, which could come in handy when you least expect it. As long as XML-based technologies such as RSS feeds and XHTML are around, XML is here to stay. 😉

Why validate?

By validating, we ensure our document conforms to a set of rules describing its structure and content. Simply put, we make sure it’s valid. Seems obvious, right? But what is “valid”? First of all, a valid XML document must be well formed.

It must respect these syntax rules:

  • Tags are case sensitive
  • Elements must be properly nested
  • Documents must have a root element
  • Attribute values must be quoted (single or double)
  • Elements must have a closing tag, e.g. <tag></tag>, or be self-closing, e.g. <tag/>

How to validate?

Well-formedness only covers syntax. To validate the structure and content of our XML document, we’ll also need a file that describes what the document is allowed to contain. There are two common options, and you only need one of them. You can find plenty of docs and tools online to help you with this task.

The first option is a DTD (Document Type Definition) file. It defines the valid building blocks of our XML document and its structure by declaring the allowed elements & attributes. The second option is an XML Schema (XSD) file. XML Schema is a World Wide Web Consortium (W3C) recommendation for formally describing the elements in an XML document. Unlike a DTD, an XSD file is itself written in XML and can also constrain data types.
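
To make this concrete, here’s a minimal sketch of what such files could look like. The note vocabulary and the file names (note.xml, note.dtd, note.xsd) are made up for illustration; the snippet simply writes the three files into the current directory so you can experiment with them.

```shell
# Hypothetical sample document: one <note> with a required id attribute.
cat > note.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<note id="n1">
  <to>Alice</to>
  <from>Bob</from>
  <body>Coffee at ten?</body>
</note>
EOF

# DTD: <note> holds to, from, body (in that order) and requires an id.
cat > note.dtd <<'EOF'
<!ELEMENT note (to, from, body)>
<!ATTLIST note id CDATA #REQUIRED>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT body (#PCDATA)>
EOF

# XSD: the same structure, expressed in XML and with explicit data types.
cat > note.xsd <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="id" type="xs:string" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
EOF
```

The DTD and the XSD describe the same document on purpose, so you can compare how each one expresses the rules.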


Now, let’s test our XML document, shall we? 🧑‍🔬

Many online resources and IDEs (e.g. Eclipse) provide ways to validate. However, I’ll show you a quick & simple way to do it locally. We’ll need the xmllint program, which ships with the libxml2 library.

If you’re using Arch Linux, you can install the requirements with the following command:

pacman -S libxml2
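
Once xmllint is installed, a quick sanity check is to test well-formedness alone, with no DTD or schema involved. The sketch below assumes two throwaway files, good.xml and bad.xml, and a small helper named check_wf; all three names are made up for this example.

```shell
# Hypothetical sample files, written to the current directory.
cat > good.xml <<'EOF'
<?xml version="1.0"?>
<note id="n1"><to>Alice</to></note>
EOF

# Breaks two rules: unquoted attribute value and mismatched tag case.
cat > bad.xml <<'EOF'
<?xml version="1.0"?>
<note id=n1><to>Alice</To></note>
EOF

# check_wf FILE: report whether FILE parses as well-formed XML.
# --noout stops xmllint from echoing the document; the exit status
# is 0 only when parsing succeeded.
check_wf() {
    if xmllint --noout "$1" 2>/dev/null; then
        echo "$1: well formed"
    else
        echo "$1: NOT well formed"
    fi
}

check_wf good.xml
check_wf bad.xml
```

Drop the 2>/dev/null redirection if you want to see xmllint’s own error messages, which point at the offending line.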

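Validate XML against an external DTD file (the --valid flag is not needed here; it validates against the DTD named in the document’s own DOCTYPE declaration instead):

xmllint --noout --dtdvalid <dtd> <xml>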

Validate XML against an XML schema (XSD):

xmllint --noout --schema <xsd> <xml>
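
If you validate often, the two commands can be wrapped in a small helper that picks the right flag from the file extension. This is only a sketch: the function name validate_xml and the return code 64 for an unsupported file type are my own choices, not something xmllint defines.

```shell
# validate_xml SCHEMA_FILE XML_FILE
# Chooses the xmllint flag based on the schema file's extension and
# passes xmllint's exit status through (0 means the document is valid).
validate_xml() {
    schema=$1
    xml=$2
    case "$schema" in
        *.dtd)
            # External DTD; --valid is left out on purpose because it
            # checks against the document's own DOCTYPE instead.
            xmllint --noout --dtdvalid "$schema" "$xml"
            ;;
        *.xsd)
            xmllint --noout --schema "$schema" "$xml"
            ;;
        *)
            echo "unsupported schema type: $schema" >&2
            return 64   # arbitrary "usage error" code for this sketch
            ;;
    esac
}
```

Usage would look like validate_xml config.dtd config.xml or validate_xml config.xsd config.xml, where the config.* names are placeholders for your own files.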