SEO for Web Developers: Keywords and Links

Published 06:47 on 23 March, 2011

Since the dawn of the web, developers and content editors have sought insight and enlightenment into the arcane art of search engine optimisation (SEO). This relentless battle to rise above the competition and feature at the top of the search engine results page (SERP), for a chosen keyword or search term, has forced a constant, clandestine evolution upon the search engines themselves whilst they adapt to out-trick the trickster.

What’s more, the proliferation of false knowledge, the equally charlatan-populated worlds of SEO consultancy and web development, and the sheer abundance of SEO false positives have meant that snippets of true SEO wisdom are drowning in a sea of irrelevant balderdash.

It is no wonder then that your average web developer is not armed with the erudition required to fight the good fight; knowledge that will allow them to articulate the message of their content to the search engines and therefore be exposed to a wider audience.

Fear ye not, weary web devs, for here is your codex of power. I hereby commit to blog my personal experiences and knowledge of Getting SEO To Actually Work™. Everything I discuss in this series of articles has been learned through experimentation, careful analysis, and search-bot honey-pot test sites. Oh, and mummyfrakkin’ science, bitches.

Keywords

One term you will hear more than others when researching or improving SEO is “keyword”.

A keyword is a word or phrase that contains high relevance to your content’s subject matter. A keyword is best imagined as a term a user might enter into a search engine in the hope of finding content like yours. For example, on a travel website generalised keywords might include “travel”, “vacation”, and “holiday”, whereas more specific keywords might be “5 star holidays” or “Greek holidays”.

As the search engines crawl and index the content on your pages they keep track of those pages in keyword-based indices. Thus, rather than storing 25 billion web pages all in one database, the engines maintain a vast array of smaller databases, each centred on a particular keyword term or phrase. This denormalisation makes data retrieval much faster, reducing it to mere fractions of a second.

Use of keywords

The search engines analyse the embedded keywords within your content to ascertain their relevance to the subject matter. This means a good way to optimise your content is to make sure the keywords are situated in key content structures, such as URLs, page titles, headings, and areas of emphasis (<strong> or <b>; whether your mark-up is semantic or presentational bears little relevance here). I’ll cover this in more detail when I talk about on-page optimisation later in this series.
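
For illustration, here is a minimal sketch of a page targeting the keyword “Greek holidays” (the domain, copy, and structure are invented), seating the keyword in the URL, the page title, the main heading, and an area of emphasis:

<!-- hypothetical page served at http://example.com/greek-holidays/ -->
<html>
<head>
  <title>Greek holidays | Example Travel</title>
</head>
<body>
  <h1>Greek holidays</h1>
  <p>Browse our range of <strong>Greek holidays</strong>, from island-hopping
  breaks to all-inclusive five star resorts.</p>
</body>
</html>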

It is also of note that a generalised keyword—such as “car insurance”, for example—will have a broader use, and therefore considerably more competition in terms of rankings. Something more specific like “women’s small car insurance” will limit your competition but might not have such a large search density. There is a clear balance to be found in your market and it’s important to identify the keywords you wish to target for a given page as early as possible.

A word on keyword density

The term “keyword density” describes how frequently a keyword occurs within your content. The term is a veritable minefield of misinformation and should set off alarm bells as soon as it is mentioned.

The density of your keywords bears no relation to the quality, semantics, and relevancy of your content; and it is these characteristics that yield the highest return. Do not be tempted into thinking that a page saturated with keywords will perform better in search rankings than a page that seats fewer keywords in higher quality, higher relevancy content.

Keyword targeting

As I said earlier, it’s important to identify the keywords you wish to target before you do anything else. To be effective, SEO requires that keywords are associated with and contained in your content at as many levels as possible.

Most SEO consultancies will invest significant amounts of time researching keywords for a given market, and also the competition against those keywords. What’s more, that analysis is ongoing, as the value of your chosen keywords will fluctuate wildly due to changes in the market, target saturation (i.e. too many sites targeting the same keyword), and other such external factors (including black-hat devaluation, which I will cover later).

Let’s make sure we’re not under any illusions before we get too far in; the two most important things about any page—in terms of SEO—are the links into it, and the keywords associated with those links. You could serve up the most hideous pile of non-web-standards, non-semantic, tables-for-layout, text-in-images content, and if the links into it are plentiful, and also contain the right keywords and/or link juice (more on that later), it will perform well in the rankings. Links are how the search engines will begin to calculate the relevancy of your page. Once they have a relevancy rating, it will then be refined by their analysis of the content of that page.

That’s not to say that there isn’t some artistry in the placement, design, and variance of those links, but since hyperlinking is the very essence of the world wide web, it is unsurprising that it is also the very essence of SEO. A high quality link will be featured on a page containing high relevance to the link’s subject matter, will be situated in an area of highly relevant content, and will have link text containing relevant keywords. It’s all about relevance.
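
To illustrate the point (the destination URL here is invented), a link whose text carries the target keyword, sitting in relevant surrounding copy, tells the engines far more than a generic “click here”:

<!-- weak: the link text says nothing about the destination page -->
<p>For cheap Greek holidays <a href="http://example.com/greek-holidays/">click here</a>.</p>

<!-- better: relevant keywords in the link text, surrounded by relevant content -->
<p>We compared a dozen providers of <a href="http://example.com/greek-holidays/">cheap Greek holidays</a> departing this summer.</p>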

Search engines apply a numeric score, known as PageRank, to each spidered page. This is often called “Link Juice” in SEO circles and is a calculated number based on the links—and the quality of those links—into, and out of, your page. This score is then used as a factor in calculating the ranking of your page against keyword relevancy.

As the search-bots spider the links within a page they pass a proportion of the host page’s score to each link. The page at the other end of the link is therefore afforded a total based on the score flowing into it from links elsewhere and how it passes that score onward. The best way of imagining this is as a complex network of pipes and sluice gates (links), with the score being the liquid within. High quality pages will have higher page rank due to incoming links with a higher score.
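
For the curious, the simplified formula from the original PageRank paper captures this flow reasonably well (the live algorithms have long since grown more elaborate, so treat this as a sketch of the idea rather than what any engine runs today):

PR(A) = (1 - d) + d \left( \frac{PR(T_1)}{C(T_1)} + \cdots + \frac{PR(T_n)}{C(T_n)} \right)

Here T_1 … T_n are the pages linking to page A, C(T_i) is the number of outbound links on T_i, and d is a damping factor (commonly quoted as 0.85) representing the proportion of a page’s score that flows onward through its links rather than dissipating.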

Bear in mind, however, that “link juice” is calculated differently based on “internal” and “external” links. Most search engines appear to deem links within a domain (internal) of less importance than those from a different domain entirely (external). For the most part, my linking tests seem to imply that cross-sub-domain links are also deemed internal. However, that sort of testing is fairly difficult to quantify, so I’d recommend investigating for yourself if you’re intending to do something like that.

As an aside, it’s worth noting that a high link score (juice, PageRank, call it what you will) does not necessarily mean high SERP rank.

The “nofollow” rel attribute value

Following the dawn of the “blogosphere” and other such UGC (user-generated content), such as forums and guestbooks, there was an explosion of link-spam. These are links that have been placed on as many different sites as possible—either manually, or through automated processes known as spam-bots—with little relevance to the intended page content or conversation. This link spam is a black-hat method of rustling link juice into a given target page.

To combat this, the search engines implemented a scheme that allows a developer to mark a link as non-pertinent in the flow of link juice. To do this, you apply a rel attribute of nofollow:

<a href="http://somewhere.com/" rel="nofollow">Link</a>

nofollow sculpting

The rel="nofollow" attribute used to allow you to sculpt the score that would pass between links on your site, by preventing score flowing through no-followed links. This is no longer the case.

Before Google changed their algorithm, rel="nofollow" used to be handled like this:

Four links, none of them no-followed:

Link 1 - Link score/4
Link 2 - Link score/4
Link 3 - Link score/4
Link 4 - Link score/4

Four links, one of them no-followed:

Link 1 - Link score/3
Link 2 - No follow
Link 3 - Link score/3
Link 4 - Link score/3

Now, with one of the four links no-followed, it works like this:

Link 1 - Link score/4
Link 2 - No follow
Link 3 - Link score/4
Link 4 - Link score/4

This means that you get significantly less benefit from no-following a link than you used to. However, all the major search engines (with Ask being the exception) state that they do take rel="nofollow" into account when weighting rankings:

Google, Yahoo!, and Bing say no-followed links are treated as plain text and therefore have no added emphasis when calculating keyword relevancy. Y! and Bing, however, do state that they use no-followed links for discovery and may follow them without affecting your rankings.

It’s all very complicated but, suffice it to say, it’s probably better to mark unimportant links on your pages as rel="nofollow" and let the search engines sort out amongst themselves what they want to do with them.
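
As a sketch (the URLs are invented), the boilerplate links that say nothing about your content are the obvious candidates, whilst the content links you actually want to rank are left followed:

<!-- unimportant boilerplate: no need for these to carry any weight -->
<a href="http://example.com/login" rel="nofollow">Log in</a>
<a href="http://example.com/terms" rel="nofollow">Terms and conditions</a>

<!-- content links: leave these followed so score and keywords flow normally -->
<a href="http://example.com/greek-holidays/">Greek holidays</a>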

Improving “crawlability”

To improve the efficiency of spidering (i.e. how the search bots crawl your site) it is advisable to maintain a flat site architecture. In this context, site architecture is simply a means to describe the link depth of your pages. “Link depth” is the number of links it takes to navigate from the home page of your site to any given page within it. The shorter the maximum link depth, the better the “crawlability”.
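
As a rough sketch (the paths are hypothetical), a home page that links directly to its category pages, each of which links directly to its articles, keeps every article at a link depth of two:

<!-- on the home page: every category page is one link away -->
<ul>
  <li><a href="/greek-holidays/">Greek holidays</a></li>
  <li><a href="/5-star-holidays/">5 star holidays</a></li>
</ul>

<!-- on /greek-holidays/: every article is two links from the home page -->
<ul>
  <li><a href="/greek-holidays/island-hopping">Island hopping on a budget</a></li>
  <li><a href="/greek-holidays/crete">A first-timer’s guide to Crete</a></li>
</ul>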

For the most part, search engines understand a “site” to be a confined graph of internal links as mapped by the search bot when it spiders your site. Each pass through the site will give the bot a better understanding of your site-map. There are several ways to inform this process, such as sitemap XML files and the robots.txt file. I’ll cover those later in this series.

Most search engines recommend hosting fewer than 100 links on any page. This is primarily because, above a certain number, link score will deteriorate due to a search-engine-defined degenerative quality equation, eventually resulting in links that are not spidered at all. Secondly, a large number of links on any page is a signal of diminished content quality and will directly affect SERP rankings as a result.

The arbitrary value of 100 is misleading, though, as a high quality landing page, such as your homepage, can easily maintain 250-300 good quality links without penalty. The pages at the next level can then host ~200 quality links and so on. With this architecture, a site with a link depth of 4 or 5 can easily maintain millions of pages within its structure with little or no penalty.

Link building

Link building is the art of obtaining external links back to your content from high quality (i.e. high relevance to your content subject matter) pages elsewhere on the web. There is a fine line to walk between link building and creating link spam; the difference lies in the relevance of the surrounding content on the link-hosting page.

Many SEO companies employ link building specialists who will inundate the web with links, either posted manually or accrued through link sharing contracts, registration with pay-to-link directories, and other such socially manufactured solutions.

Third-party linking

To build effective links it is also a good idea to offset the effort. In some cases, placing a link on a site with high traffic, but that isn’t particularly high quality—or isn’t even spidered effectively—can work in your favour. A good example of this is Facebook, which hides a large majority of its content from search engine spiders. A link placed on Facebook has a high likelihood of being reposted elsewhere, where the link quality will be vastly improved.

In today’s internet companies you will find many Social Media “Experts” whose sole responsibility is to use social networking to market the brand, the company, and, more importantly, links to the website. Whilst the quality of links on social networks is debatable, the benefit of reposts elsewhere cannot be ignored, regardless of quality.

Another technique when link building is to create content that can be classed as “link bait”. This is content that has a high likelihood of being reposted elsewhere and is therefore subject to significant third-party linking.

In my personal opinion the concept of link bait is highly subjective; one man’s link bait is another man’s quality content. In web development particularly there are any number of cynical parties waiting in the wings to dismiss a link as link bait. This may, however, have something to do with the inherent cynicism in the development world in general.

Summary

Understanding the concept of keywords, and then applying that concept to link building and your internal link structure, are really the two foundations of SEO. Once you have mastered them you should be well on the way to improving your performance in search engine rankings.

The next article in this series will cover SEO in page construction, including information about on-page optimisation, designing your URLs, correct use of HTTP, and other such technical information.