On HTML element identifiers

Published 17:11 on 27 September, 2011

Back in April 2007, I wrote a short article on HTML element identifiers for the .net magazine "Expert Advice" section. I have never republished it online—despite it being 400 words, nicely to the point and suitably succinct—and have long been meaning to reexamine the subject in more detail.

By total coincidence, a discussion in a recent front-end code review and a discussion over the past couple of days on Twitter both related to the use of HTML element identifiers. The former was a discussion on semantic value and remembering that HTML should not prognosticate styling; the latter, a discussion on the validity of using IDs as targets for CSS. With both of these considerations in mind, I’m going to study the fundamentals of writing good element identifiers in this article.

What’s an element identifier?

HTML has two element identifiers: the id attribute and the class attribute. These attributes can be applied to most of the elements within a modern HTML document.

What do they do?

The id attribute assigns a name to an element. This name must be unique within the document and each element can have only one.

The class attribute assigns a class or set of classes to an element. Any number of elements may be assigned the same class name or names and each element can have many.

In relationship terms this means IDs and elements have a one-to-one relationship, and classes and elements have a many-to-many relationship. IDs are very much an absolute index system and classes are more of a tag-style classification aid.
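
As a rough sketch of those two relationships, the JavaScript below works on plain objects rather than real DOM nodes. The elements array and the findDuplicateIds and groupByClass helpers are hypothetical names invented for illustration; nothing here is a browser API.

```javascript
// Hypothetical element descriptors standing in for parsed HTML elements.
const elements = [
  { tag: "ul", id: "site-navigation", className: "navigation" },
  { tag: "p", id: "breadcrumb", className: "navigation" },
  { tag: "a", id: null, className: "current-page external" },
];

// IDs are one-to-one: each name should identify exactly one element,
// so any name seen twice is reported as a duplicate.
function findDuplicateIds(els) {
  const seen = new Set();
  const duplicates = new Set();
  for (const el of els) {
    if (!el.id) continue;
    if (seen.has(el.id)) duplicates.add(el.id);
    seen.add(el.id);
  }
  return [...duplicates];
}

// Classes are many-to-many: one class can cover many elements,
// and one element can carry many space-separated classes.
function groupByClass(els) {
  const groups = new Map();
  for (const el of els) {
    const names = (el.className || "").split(/\s+/).filter(Boolean);
    for (const name of names) {
      if (!groups.has(name)) groups.set(name, []);
      groups.get(name).push(el);
    }
  }
  return groups;
}
```

Run against the sample data, findDuplicateIds reports no duplicates, while groupByClass files two elements under "navigation" and places the link under both of its classes.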

Make note of those words: "name" and "classification". I’ll come back to them in a bit.

Mislaid mastery

Put in those simple terms it’s all pretty straightforward, and I think that most front-end developers have a good general grasp of their usage. However, it’s the detail that introduces the issues, and if you take time to step back it’s easy to see why:

Element identifiers are the perfect answer for cross-linking other front-end technologies to your HTML. They are the hooks on which to hang your styling and behaviours, and because of this they take on that personality in your mind. This is where so many developers go wrong.

When front-end developers are first introduced to the id and class attributes, it’s often so the identifiers can be targeted in CSS selectors as a means of styling the developers’ documents. They will also inevitably start using them to hook progressive enhancement into their documents via DOM scripting. These are the two most common uses of element identifiers:

<ul id="site-navigation" class="navigation"></ul>

#site-navigation {
    border-bottom: 2px solid red;
}
.navigation a:focus,
.navigation a:hover,
.navigation a:active {
    outline: 2px solid yellow;
}
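
// invokeOarsumNavWidget() stands in for whatever behaviour you attach
var siteNav = document.getElementById("site-navigation");
siteNav.invokeOarsumNavWidget();

// Use some modern browser native DOM + JS win
var navCollection = document.getElementsByClassName("navigation");
// An HTMLCollection has no forEach of its own, so borrow Array's
Array.prototype.forEach.call(navCollection, function(element, index, array) {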
    // Do something to every element
    // with a class of "navigation"
});

Also, if you’ve ever wanted to make use of the URI fragment identifier in a non-destructive way, you may have used the ID element identifier instead of the old-skool named anchor (<a name>) approach:

<a href="#site-navigation">Skip to navigation</a>

In each of the above examples, the element identifiers were added specifically to be used by some other language: for styling it is CSS, for dynamic behaviour it is JavaScript, and for linking it is the URI syntax (RFC 3986). However, that association also takes hold in our minds as developers and we often have trouble seeing past it.
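
For the linking case, here is a minimal sketch of how a fragment identifier maps back to an ID. The pageElements array, the fragmentTarget helper and the example.com base URL are all placeholders invented for illustration; a browser does this lookup for you when it follows the link.

```javascript
// Hypothetical stand-ins for elements in a parsed document.
const pageElements = [
  { tag: "ul", id: "site-navigation" },
  { tag: "div", id: "page-footer" },
];

// Resolve a link's fragment to the element whose ID matches it.
// The base URL is a placeholder used only to parse relative hrefs.
function fragmentTarget(href, els, base = "https://example.com/") {
  const fragment = new URL(href, base).hash.slice(1); // drop the leading "#"
  if (!fragment) return null;
  const id = decodeURIComponent(fragment);
  return els.find((el) => el.id === id) || null;
}
```

A link to "#site-navigation" resolves to the list, while a link with no fragment, or with a fragment that matches no ID, resolves to nothing.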

Names and classifications

Back to those "name" and "classification" descriptions:

It seems we have carelessly forgotten the fact that element identifiers are a feature of HTML and not the other languages that use them as targets. We should be using them as they were originally intended: to name and classify the content within our mark-up.

We already use HTML elements to describe the content which they contain. This is the very heart of semantic HTML and should result in content like heading text being wrapped in h1-h6 tags; lists in ul, ol, or dl tags; and tabular data in table elements. Since IDs and classes are attributes of these elements, we should use them to improve the semantic value of their hosts and provide further descriptiveness or context.

If we return to our previous list example, the following mark-up is reasonably semantic:

<ul>
    <li><a href="/index.html">Home</a></li>
    <li><a href="/about/">About</a></li>
    <li><a href="/blog/">Blog</a></li>
    <li><a href="/blog/archive/">Archive</a></li>
    <li><a href="/contact/">Contact</a></li>
</ul>

However, we can improve it further by giving it some context with an ID:

<ul id="site-navigation">
    <li><a href="/index.html">Home</a></li>
    <li><a href="/about/">About</a></li>
    <li><a href="/blog/">Blog</a></li>
    <li><a href="/blog/archive/">Archive</a></li>
    <li><a href="/contact/">Contact</a></li>
</ul>

Now it’s more obvious what this list of links actually is without seeing it styled up and rendered in the browser.

We could provide even more information with a single class:

<ul id="site-navigation">
    <li><a href="/index.html" class="current-page">Home</a></li>
    <li><a href="/about/">About</a></li>
    <li><a href="/blog/">Blog</a></li>
    <li><a href="/blog/archive/">Archive</a></li>
    <li><a href="/contact/">Contact</a></li>
</ul>

I’m sure most web developers have done something similar in the past, but it’s a very good example of applying context with a class the right way. The HTML now gives us a very good description of the content it contains. In fact, you now only need glance at this markup to understand its function.

Let’s take it a little further. Imagine we have several types of navigation in the page; we can better describe these elements by classifying them as navigation:

<ul id="site-navigation" class="navigation">
    <li><a href="/index.html" class="current-page">Home</a></li>
    <li><a href="/about/">About</a></li>
    <li><a href="/blog/">Blog</a></li>
    <li><a href="/blog/archive/">Archive</a></li>
    <li><a href="/contact/">Contact</a></li>
</ul>

<p id="breadcrumb" class="navigation">
    <a href="/index.html">Home</a>
    &gt; <a href="/blog/">Blog</a>
    &gt; My Blog Post
</p>

<div id="page-footer">
    <p>© 2011 Oarsum Corp, all rights reserved.</p>
    <ul class="navigation">
        <li><a href="terms.html">Terms and conditions</a></li>
        <li><a href="privacy.html">Privacy policy</a></li>
    </ul>
</div>

Simple, right? We’re using the element identifiers to improve the semantics of the markup and gaining improved readability as a side-effect. What’s more, the added context can be targeted by whatever technology you so desire.

Anti-patterns

I’m pretty confident that most web developers will end up with markup similar to this without necessarily understanding my point. To that end, let’s take a look at some common anti-patterns (or misuses) that may go some way towards furthering that understanding.

The "clearfix" class

Way back in 2004 [ya rly!], Tony Aslett and Position is Everything published a method for containing floats that involved using the :after pseudo-element to insert some content and using that as an anchor for clear: both. It was a clever technique and was quickly adopted by the community, often incorrectly bound to a class entitled "clearfix":

.clearfix:after {
    content: ".";
    display: block;
    height: 0;
    clear: both;
    visibility: hidden;
}

/* IE targeted property */
.ie .clearfix {
    zoom: 1;
}

Quite apart from the somewhat erroneous name (it should really be "contain-float") this is absolutely the wrong place to be addressing this sort of styling issue. A styling-specific meta-class such as this is no different to writing a class entitled "red-text" or "big-blue-heading". In my opinion, the small amount of repetition it saves you in the CSS is not worth the non-semantic clutter in your HTML.

There are several ways to contain floats in CSS and the method you should use absolutely depends on the context. What’s more, some of those methods are a single CSS property declaration. For this reason, declaring a class of "clearfix" and implementing a single float clearing solution within it is actually potentially detrimental to the overall efficiency of your CSS.

It’s also worth noting that providing a meta-class tool like this results in a disassociation between the problem and the "fix". You’re likely to experience maintenance issues when you change any float behaviours and the containing mechanism is abstracted from the target’s styles. In addition to this, you’re likely to find developers overusing the solution to address problems they’ve introduced themselves.

The "hide-offscreen" class

When developers are aware of accessibility issues, and are attempting to provide textual description for otherwise potentially inaccessible content, you will often find a class that looks something like this:

.hide-offscreen {
    text-indent: -999em;
    font-size: 0;
}

/* or */

.hide-offscreen {
    position: absolute;
    left: -999em;
    font-size: 0;
}

This moves text off the screen, making it invisible to those not using a screen reader. This is preferable to display: none (which you might use to hide content from all users) because the content is only hidden within the browser viewport and is not removed from the rendering of the document altogether.

However, what we’ve done here is define our CSS first and then apply that classification to our content. We’re supposed to be doing it the other way around. This class is unhelpful in describing the content that it wraps because it describes the behaviour the CSS should apply. It says little about the developer’s intent, and results in HTML like:

<ul class="hide-offscreen">
    <li><a href="#navigation">Skip to navigation</a></li>
</ul>

Not a good class at all. This is an improvement:

<ul class="accessibility navigation">
    <li><a href="#navigation">Skip to navigation</a></li>
</ul>

I’ve added the accessibility class here to signal intent: it’s navigation for improving accessibility. I could have been more specific here, but I expect I’ll want to signal a similar intent on other features within my document.

However, the double class will cause problems if you need to target both classes at once (e.g. .accessibility.navigation {}) in IE6. That browser will only target the final class in the selector. To fix this, you could combine the classes to create a new one:

<ul class="accessibility-navigation accessibility navigation">
    <li><a href="#navigation">Skip to navigation</a></li>
</ul>

Or a less ugly solution would be to include an ID on the navigation:

<ul id="header-skip-links" class="accessibility navigation">
    <li><a href="#navigation">Skip to navigation</a></li>
</ul>

Or to use a wrapping element and target with the descendant selector:

<div class="accessibility">
    <ul class="navigation">
        <li><a href="#navigation">Skip to navigation</a></li>
    </ul>
</div>

Here we’ve named the content and classified it well. It would be difficult to argue that we haven’t provided any targets for styling or behaviours now too. Unfortunately we’ve had to jump through hoops for IE6 and have broken our rule of not changing the HTML for the CSS, but I just wanted to illustrate that there are more options available given a little thought.

Avoiding repetition

It’s entirely fair to argue that both the anti-patterns above avoid repetition within the CSS itself. However, that shouldn’t be the only reason to adopt one of these solutions. There are other options available to us for abstracting away this kind of anti-DRY maintenance nightmare.

Most CSS preprocessors will let you define a block of declarations once and reuse it wherever it’s needed, as this is clearly a "missing feature" of CSS as a whole. The alternative, of course, is to make your "fix" a part of the relevant cascade, so that all selectors beneath inherit that behaviour.
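
To illustrate the idea without tying it to any particular preprocessor, here is a small sketch of a build-step helper in plain JavaScript. The containFloats function is a hypothetical name, not part of any real tool; it simply attaches the float-containing declarations from earlier to a list of semantic selectors, so the repetition is handled in the stylesheet rather than by a "clearfix" class in the HTML.

```javascript
// Hypothetical build-step helper: emit one float-containing rule for a
// list of semantic selectors, leaving the HTML free of styling classes.
function containFloats(selectors) {
  const targets = selectors.map((selector) => selector + ":after").join(",\n");
  const declarations = [
    'content: ".";',
    "display: block;",
    "height: 0;",
    "clear: both;",
    "visibility: hidden;",
  ];
  return targets + " {\n    " + declarations.join("\n    ") + "\n}";
}

// Generate the rule for two of the IDs used earlier in this article.
const css = containFloats(["#page-footer", "#site-navigation"]);
console.log(css);
```

The output is a single grouped rule, so changing the containing technique later means editing one place in the CSS and nothing in the markup.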

A nice example

A really nice example of classification is this simple design pattern for pagination:

<p id="paginglabel" class="audible">Pagination</p>
<ul role="navigation" aria-labelledby="paginglabel">
    <li><a href="#"><span class="audible">%TYPE% Page</span>1</a></li>
    <li><a href="#" rel="prev"><span class="prev">Previous<span class="audible">: %TYPE% Page</span></span>2</a></li>
    <li><p><span class="audible">You're currently reading %TYPE% page </span>3</p></li>
    <li><a href="#" rel="next"><span class="next">Next<span class="audible">: %TYPE% Page</span></span>4</a></li>
    <li><a href="#"><span class="audible">%TYPE% Page </span>5</a></li>
</ul>

Here we see the "audible" class describing content that should only be exposed to non-visual devices, such as screen readers and content scrapers (like search engine spiders). This is a good classification of a subset of the content.

It’s also worth noting the good use of WAI-ARIA roles and the rel attributes on the next and previous links. These are two other ways you can improve the semantics of your HTML and abstract away some of the classification to a more suitable annotation system.

Clean Code

Markup languages can often add to the obfuscation of code simply by their structure. If we use the element identifiers as they are intended, we should reduce the overhead of grokking the code and end up with something relatively clean.

Clean code is, amongst other things:

  • Readable
  • Understandable
  • Predictable
  • Maintainable

Use of element identifiers as semantic descriptors improves each one of these points. This is especially obvious when dealing with somebody else’s code, and in a team environment I personally feel it is imperative to include this in coding standards and best practices.

Good use of names and classification is just as effective as good commenting. When the code is effectively self documenting through its element identifier annotation, the need for commenting is significantly reduced.

On CSS best practice proposals

Now that I’ve covered the "hows" and "whys", I’d like to highlight why I think certain recommendations for CSS best practice directly contravene the correct use of element identifiers:

There has been some discussion within the front-end community regarding the use of IDs when targeting CSS. The id attribute has particularly strong specificity in CSS, due to its uniqueness. When used incorrectly, it can cause maintenance problems and specificity bugs. However, the idea that targeting the id attribute with CSS should be considered harmful (as seen in CSSLint, for example) is extreme to say the least.

It is reasonably simple to clobber other rules on an element by incorporating an ID in the selector, but this should really be an expected outcome. The use of an ID in a selector should denote the intent that the rule targets only a single occurrence of something within the page. If that is not what the developer intended, then they’ve wildly misunderstood the need for that selector. Certainly it makes sense that targeting an ID should be considered harmful when the developer actually meant to target something less specific, but then that’s just using CSS properly, isn’t it?
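
To see why an ID clobbers other rules, here is a deliberately naive sketch of CSS specificity in JavaScript. The specificity and compareSpecificity functions are hypothetical names, and the parsing assumes selectors built only from IDs, classes and type names; attribute selectors, pseudo-classes, pseudo-elements and inline styles are ignored.

```javascript
// Simplified specificity as [ids, classes, types].
// Assumes only #id, .class and type names, joined by whitespace
// or the >, + and ~ combinators.
function specificity(selector) {
  const ids = (selector.match(/#[\w-]+/g) || []).length;
  const classes = (selector.match(/\.[\w-]+/g) || []).length;
  const types = (selector.match(/(^|[\s>+~])[a-zA-Z][\w-]*/g) || []).length;
  return [ids, classes, types];
}

// Positive when selector a outranks b, negative when b outranks a,
// zero when they tie. Columns are compared left to right.
function compareSpecificity(a, b) {
  const sa = specificity(a);
  const sb = specificity(b);
  for (let i = 0; i < 3; i++) {
    if (sa[i] !== sb[i]) return sa[i] - sb[i];
  }
  return 0;
}
```

Under these assumptions, "#site-navigation" outranks ".navigation.current-page a" however many classes the latter piles up, because the ID column is compared first. That is the expected outcome when the rule is meant for a single instance.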

The argument that it is less valid to target IDs when working in a truly modular environment is even worse. A truly modular front-end environment will always require instance targeting alongside classification targeting. Sure, it’s likely to be the minority of situations, but there will always be cases in your system where it makes much more sense to target an ID than a class, and since they’re already in place on your HTML by this point [they are, right? We’ve already talked about this. If they’re not, go back and read it again.] you don’t have to worry about your CSS enforcing your HTML.

The developers that make these sorts of recommendations must be aware that they, as leaders in their field, have a responsibility not to mis-sell their solutions. Problems born of misuse do not indicate an inherent flaw in the technology, but perhaps a gap in the expertise of the user. By way of analogy, it’s worth noting that chefs do not veto the use of sharp knives because someone once cut their fingertip off with one.

Summary

HTML, CSS, and JavaScript are separate technologies. They are quite tightly related in the world of client-side development, but they should always remain decoupled. The element identifier is an artefact of HTML and should therefore not attempt to predict the hooks required for other technologies; it should simply be a property of effective semantic markup.

Further to this, avoiding the use of an element identifier, simply because it is so often misused by otherwise able front-end developers, is—at the risk of being blunt—ridiculous. Be proud of your HTML, spend as much time perfecting it as you would on any other language, and, most of all, don’t pollute it with the litter produced by a solution, in a related language, that lacks finesse.

Final note

I realise, however, that—despite my banging on about it—there are some developers you just can’t reach. As Jared Spool so eloquently puts it in his blog post, "Beans and noses":

No matter how much you try, you can’t stop people from sticking beans up their nose.