Put simply, the quality of the HTML behind your website is as important as how the site looks.
Sure, you’ll have been told that your “HTML must be semantic”, but what’s the underlying reason, and who does it affect if it isn’t?
And, for that matter, what does that even mean?
What are semantics?
To understand why semantic HTML is so crucial, we first need to define what the term ‘semantics’ means in the context of a web application.
In the general sense, semantics can be described as:
…the branch of linguistics and logic concerned with meaning.
When using a website, humans associate meaning with parts of the page through a variety of senses – visually (via the browser interface), aurally (if you use screen readers for example), or through touch (Braille output devices).
How do they apply to my website?
Since you have visitors who use assistive technologies, providing a quality interface for them to consume your content should be a default position. We will discuss how to do that in a later post.
In terms of website interaction, two distinct interfaces are used to convey meaning to the user.
The portion of your user demographic who use browsers such as Chrome or Firefox and rely solely on visual cues will be inferring the meaning of the page purely from its visual design language.
However, users of some assistive devices such as screen readers will be reliant on the semantic nature of the HTML to deduce the same meaning.
Assistive devices and machine readability
Assistive devices can use a number of mechanisms that infer meaning from HTML. These include:
- A hierarchy-inferred representation of the page referred to as the document outline. It relies on structural elements such as headings and sectioning elements to build a model of the document that can be presented in many different ways.
A typical example would be OS X VoiceOver’s ‘web rotor’ feature, which presents the user with a list of ‘landmarks’ within the document to allow quick navigation.
- A high-level composition that includes an enriched document object model (DOM) representation, as well as objects from the user interface – referred to as the accessibility tree.
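To make the first of these mechanisms more concrete, here is a rough TypeScript sketch of how a nested outline could be derived from the headings on a page. The `Heading` and `OutlineNode` types, the `buildOutline` function and the sample heading text are all hypothetical, invented for illustration; real browsers and assistive technologies use their own, more involved algorithms.

```typescript
// Hypothetical types for illustration; not part of any browser API.
interface Heading {
  level: number; // 1 for <h1> through 6 for <h6>
  text: string;
}

interface OutlineNode {
  text: string;
  level: number;
  children: OutlineNode[];
}

// Build a nested outline from headings listed in document order.
function buildOutline(headings: Heading[]): OutlineNode[] {
  const roots: OutlineNode[] = [];
  const stack: OutlineNode[] = []; // currently open sections, shallowest first

  for (const h of headings) {
    const node: OutlineNode = { text: h.text, level: h.level, children: [] };

    // Close every open section at the same or a deeper level.
    let top = stack[stack.length - 1];
    while (top !== undefined && top.level >= h.level) {
      stack.pop();
      top = stack[stack.length - 1];
    }

    // Attach to the nearest shallower heading, or to the top level.
    if (top !== undefined) {
      top.children.push(node);
    } else {
      roots.push(node);
    }
    stack.push(node);
  }
  return roots;
}

// Sample headings for an imaginary page.
const pageOutline = buildOutline([
  { level: 1, text: "Why semantic HTML matters" },
  { level: 2, text: "What are semantics?" },
  { level: 2, text: "Assistive devices" },
  { level: 3, text: "Document outline" },
]);
```

Under these assumptions, headings chosen for their visual size rather than their rank would produce a misleading tree, which is exactly what a screen reader user would then have to navigate.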
With great power comes great responsibility
In essence, conveying the meaning of an application’s composition regardless of device has always been a fundamental concept of the web.
A common pitfall that results in an inaccessible website arises from developers not understanding the distinction between describing meaning (the role of HTML) and describing the presentation of that meaning (the role of CSS).
This separation between presentation and content is incredibly powerful, as demonstrated by CSS Zen Garden – a showcase of a variety of visual designs that use the same HTML structure.
Conversely, however, this also allows a fixed aesthetic to be applied to virtually any number of HTML structures, each with a different level of semantic quality.
This can result in a disparity between what is visually displayed and what is conveyed to assistive devices.
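As a rough illustration of that disparity, the TypeScript sketch below looks up the role an element exposes by default. The table is a deliberately small subset of the implicit roles defined for HTML elements, and `implicitRole` is a hypothetical helper, not a browser API.

```typescript
// A small, simplified subset of the roles HTML elements expose by default.
const IMPLICIT_ROLES = new Map<string, string>([
  ["button", "button"],
  ["nav", "navigation"],
  ["main", "main"],
  ["ul", "list"],
  ["li", "listitem"],
  ["h1", "heading"],
  ["h2", "heading"],
  ["div", "generic"],
  ["span", "generic"],
]);

// Hypothetical helper: returns undefined for tags this sketch does not cover.
function implicitRole(tagName: string): string | undefined {
  return IMPLICIT_ROLES.get(tagName.toLowerCase());
}

// Two structures that CSS could make look identical:
//   <div class="btn">Save</div>
//   <button type="button">Save</button>
const styledDivRole = implicitRole("div");     // "generic"
const realButtonRole = implicitRole("button"); // "button"
```

The styled `div` may say "button" to the eye, but it says nothing of the sort to an assistive device.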
So next time you’re creating a new component, start by describing what you mean, not what you see.
How does this still work if HTML is so old?
The web sphere has evolved quickly due to technological enhancements, and rich internet applications that have many of the characteristics of desktop applications are commonplace nowadays.
However, since those user interfaces comprised more complex HTML structures, it became difficult for assistive technologies to infer their intent without using some level of heuristic analysis.
For example, how can you signify to a screen reader that a portion of your page needs to be re-read due to some content having been dynamically updated?
Or that a value you’ve typed in a form field is of an invalid format?
Moreover, the HTML specifications aren’t flexible enough to keep pace with current user interface trends. The answer is to mirror the accessibility model of conventional desktop-based applications.
Copying the desktop application model
Modern operating systems include an accessibility API whose job it is to represent entities within a user interface and expose information about each to consumers such as assistive devices.
For example, a menu button might be registered with a name (‘Edit’), a state (is it pressed in or not?), and a role (a menu item).
In order to construct this representation, accessibility APIs interface with an accessibility tree, which is a composition of the current user interface.
Accessible Rich Internet Applications (ARIA) is a specification that deals with how to make dynamic applications more accessible.
It co-exists alongside HTML and other markup languages as an entirely separate standard, allowing it to evolve independently.
We can now do something that was previously impossible: directly manipulate the accessibility tree through a series of HTML attributes, resulting in a direct connection between HTML components and their equivalent types on the operating system.
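As a minimal sketch of what those attributes look like, the TypeScript below builds the ARIA attributes for three cases: a toggle button with a name, role and state; a region whose dynamic updates should be announced; and a form field holding an invalid value. The function names are hypothetical and the toggle button is an invented example, but `role`, `aria-label`, `aria-pressed`, `aria-live` and `aria-invalid` are attributes from the ARIA specification.

```typescript
type AriaAttributes = Record<string, string>;

// Name, role and state of a hypothetical toggle button.
function toggleButtonAttributes(label: string, pressed: boolean): AriaAttributes {
  return {
    role: "button",
    "aria-label": label,
    "aria-pressed": String(pressed),
  };
}

// A region whose content changes should be announced when the user is idle.
function liveRegionAttributes(): AriaAttributes {
  return { "aria-live": "polite" };
}

// Flag a form field whose current value failed validation.
function fieldValidityAttributes(isValid: boolean): AriaAttributes {
  return { "aria-invalid": String(!isValid) };
}

// Serialise attributes as they would appear in markup.
// Assumes values contain no double quotes; a real serialiser would escape them.
function toAttributeString(attrs: AriaAttributes): string {
  return Object.entries(attrs)
    .map(([key, value]) => `${key}="${value}"`)
    .join(" ");
}

const editButtonMarkup = toAttributeString(toggleButtonAttributes("Edit", true));
// role="button" aria-label="Edit" aria-pressed="true"
```

Where a native element already carries the right role and state, such as a real `button`, that element is usually the simpler choice; attributes like these are for cases HTML alone cannot describe.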
This allows for the potential of complete parity between the experiences of web and desktop applications.
A demonstration is available that shows how easy it is to compose two HTML structures that look the same visually.
Putting it to the test
To fully understand the importance of the subject, I’d implore you to choose a screen reader (VoiceOver, JAWS, NVDA) that is supported by your OS, and spend half an hour learning some basic navigation skills.
After that, try reading your favourite website; if you’re a user who is able to use visual cues, then blindfold yourself and read a portion of the page, and compare the experience with the visual semantics.
How do they relate to one another?
When composing HTML structures, you must ensure that you describe their meaning using the correct HTML elements, and ARIA state and role attributes where necessary.
Then, and only then, should you pay attention to the presentation of those structures.