Our Semantic Search Guide continually evolves. You are welcome to suggest edits and submit content for consideration.

Internal Link Audits

A well-crafted internal link structure improves the chances of your content being seen at the right place and at the right time. Your internal link structure is also known as your “Link Graph”, and it has three core elements:

  1. Your navigational or menu link structure,
  2. Your breadcrumb link structure, and
  3. The most important in today’s search algorithms: the internal link structure in the body of the text.

The way in which your main content connects to other content on your website can have a profound effect on both Google and users. In this internal link audit “how-to” guide, we’ll dive straight into internal “body text” links first, because auditing these is the most complex problem to solve. We will also cover navigational menus and breadcrumbs at the end for more basic SEO.

A recent study found that website owners missed more than 80% of the link opportunities.

Even before Google migrated its ideas towards semantic search, links acted as important signposts for search algorithms. You can read about the importance of links for Google’s PageRank algorithm here. There are a few points about PageRank that are worth noting, though.

First, PageRank was calculated at the page level, not the domain level. This means that internal links play a big part in determining the strength of a page in terms of PageRank. Second, PageRank in its purest form has no context. A link should only have an effect on a search algorithm if it adds to the context in which it exists. Google did talk about “Topical PageRank”, although it was not explicit at the time about the way it implemented it. One paper on Topical PageRank from Cambridge University shows how this works.
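To make the page-level point concrete, here is a minimal sketch of the classic, context-free PageRank iteration run over internal links only. The site map, the damping factor of 0.85 and the function name are illustrative assumptions. This is not how Google or any audit tool computes its scores today.

```python
def internal_pagerank(links, damping=0.85, iterations=50):
    """Classic PageRank by power iteration over an internal link graph.

    links maps each page URL to the list of internal pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "random jump" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly across its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank everywhere.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: every page links back to the home page.
site = {
    "/": ["/seo/", "/blog/"],
    "/seo/": ["/", "/seo/on-page/"],
    "/seo/on-page/": ["/", "/seo/"],
    "/blog/": ["/", "/seo/on-page/"],
}
scores = internal_pagerank(site)
strongest_page = max(scores, key=scores.get)
```

In this toy graph the home page ends up with the highest score simply because every other page links to it. Note that nothing in the calculation looks at what the pages are about, which is the "no context" limitation described above.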

For search, the presence of links in a document collection adds valuable information over that contained in the text of the documents alone.

Jardine & Teufel

There are also important reasons why internal links are relevant in the world of semantic search. By linking text closely to content about the entity, you are making it much easier for a reader to understand the meaning of an article and, just as importantly, you are helping Google and other search engines derive the meaning of your content. For example, if you talk about “Queen” on a page, are you talking about a band, a monarch or a lifestyle choice? By linking to a piece of content that has schema around this context, machines can readily identify the nature of the relationship between the two pieces of content.

In “15 advantages of using Internal Link Building for SEO”, Fred Laurent makes a compelling argument for internal links.

Once your site relies on content, internal links are as essential to your visibility as external links, to:

Increase the number of long-tail keywords
Better respond to users’ queries
And ultimately, increase your visibility and your organic traffic

Fred Laurent

Combined, these points mean that ranking pages in the SERPs is much more effective for any search engine if internal links are taken into account as a major ranking factor. The challenge is to be able to see what the machines see as they run algorithms across all the pages, with pages that link directly to a given target page having more ranking and contextual relevance than pages several clicks away. This is really the only reason that the home page carries the most weight in search. Usually, the home page is accessible from every other page on the site and therefore becomes the most important.
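One simple way to see "how many clicks away" a page sits is a breadth-first walk of the link graph starting at the home page. The sketch below assumes you already have a crawl export in the form of a page-to-links mapping. The URLs and the function name are made up for illustration.

```python
from collections import deque


def click_depths(links, home="/"):
    """Return the minimum number of clicks from the home page to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                # First time we reach this page, so this is its shortest path.
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical crawl data: page -> internal pages it links to.
crawl_links = {
    "/": ["/seo/", "/blog/"],
    "/seo/": ["/", "/seo/on-page/"],
    "/blog/": ["/"],
    "/seo/on-page/": ["/"],
    "/orphan/": ["/"],
}
depths = click_depths(crawl_links)
unreachable = sorted(set(crawl_links) - set(depths))
```

Pages that never appear in the result (here the hypothetical "/orphan/") cannot be reached by following internal links from the home page at all, which is usually worth flagging in the audit.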

The Principal Strategies behind an Internal Linking Audit

The main idea behind an internal link audit is to increase the “Contextual Relevance” of all internal links, such that it is abundantly clear to any human and search engine alike where the authority for any important topic is on your site. That is to say, for every “Head term” there should be a clear and agreed target page, and all other significant mentions of that topic should always link to that headline page. These pages are generally called “Pillar pages”, “Target Pages” or “Cornerstone Content”, depending on the SEO you speak to or the technology used. This means that when a blog post mentions an important idea, a link should exist within the text that talks about that idea.

Traditionally, SEOs have only focussed on the anchor text, but the surrounding text also gives a link context. It is the underlying meaning that is important. Try not to rely on tools that only generate exact match anchor text, and audit the site to ensure this has not happened to an extreme in the past.

An exception to the idea that topics should link to their Pillar Page is when you have a more targeted page with a long tail concept that is more appropriate. Links should be as specific as possible, and often it is the wording of the outgoing page that will reflect the most targeted link candidate. This is perhaps the hardest part of a link audit. The general strategy is to create a hierarchy of pages around a topic or idea. For example, you may have a headline target of “SEO” but then “On-Page SEO” could be a major pillar of the SEO group (sometimes called a silo). Another might be Site speed and another might be Backlinks. Grouping these pages makes sense, but strict silos are not always a good idea.

A strict silo means that you ONLY link pages within a silo to other pages within the same silo. This is rarely the best way to freely share concepts and topics around a site, although hard walls like pages in different languages, or hotels in different cities, may well be a good reason to recommend strict silos.

In addition to strategic considerations, there are also a few other elements that need to be taken into account when conducting an internal link audit. They are:

  1. Listing internal links that result in 404 pages,
  2. Listing links that redirect and, in a similar vein,
  3. Listing links that do not point to the correct canonical for a URL.
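The three checks above can be sketched as a single pass over crawl data. The example assumes your crawler can export, for each URL, the HTTP status, any redirect target and the declared canonical. The `CrawledPage` structure, the URLs and the labels are hypothetical and not the export format of any particular tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrawledPage:
    status: int                        # HTTP status returned for the URL
    redirect_to: Optional[str] = None  # target if the URL redirects
    canonical: Optional[str] = None    # canonical URL declared on the page


def classify_link(target_url, crawl):
    """Label one internal link target with the audit problem it has, if any."""
    page = crawl.get(target_url)
    if page is None:
        return "not crawled"
    if page.status == 404:
        return "broken"
    if page.status in (301, 302, 307, 308):
        return "redirect"
    if page.canonical and page.canonical != target_url:
        return "non-canonical"
    return "ok"


def audit_links(links, crawl):
    """Return (source, target, problem) for every link that needs attention."""
    report = []
    for source, target in links:
        problem = classify_link(target, crawl)
        if problem != "ok":
            report.append((source, target, problem))
    return report


# Hypothetical crawl export and internal links (source page, target URL).
crawl = {
    "/old-page/": CrawledPage(status=404),
    "/seo": CrawledPage(status=301, redirect_to="/seo/"),
    "/seo/?ref=nav": CrawledPage(status=200, canonical="/seo/"),
    "/seo/": CrawledPage(status=200, canonical="/seo/"),
}
links = [
    ("/blog/post-1/", "/old-page/"),
    ("/blog/post-1/", "/seo"),
    ("/blog/post-2/", "/seo/?ref=nav"),
    ("/blog/post-2/", "/seo/"),
]
report = audit_links(links, crawl)
```

Each row in the report is something to fix at the source page: repoint the link to a live, final, canonical URL.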

Whatever strategy suits your audit, the hardest task is viewing your link graph. So we will cover tools to do this next.

Viewing your Internal links structure can be achieved with a number of tools. Without prejudice or preference, here are a few.

1: OnCrawl

OnCrawl is an enterprise-level crawler that offers some sophisticated internal link analyses. In particular, OnCrawl has a metric called “InRank”, which it uses as a proxy for Google’s PageRank measurement, specifically for the internal links within the site.

OnCrawl shows Internal Link flows

2: Sitebulb

I find that Sitebulb has a great many ways to visualize Internal links. These make it very easy to not only visualize internal links but also to see where questionable internal links are diluting focus or (more likely) where internal pages are not linking to each other and should be.

A Crawl Tree visualization by Sitebulb
Alternative visualizations in Sitebulb

3: Majestic

Recently, my previous company, Majestic, came out with a brand new way to visualize links on a page.

This new visualization shows how links are balanced on a web page. Internal links are in blue and external links are in orange. Each page is segmented into 40 sections, allowing you to see where the links are on the page.

You can see the overall look of the page and see which links are in the body and which are in the navigation. On the downside, Majestic does not render JavaScript links at the moment. Also, this visualization really looks at the links out of a page, rather than the links into the page.

4: Screaming Frog

Every SEO’s go-to tool. Screaming Frog lets you crawl any website. In doing so, it tracks all the internal links that it finds and allows you to sort them. The graphic above shows you how to see all the internal links into a given page in one place. (The next tab also shows the outbound links from the same page.) Unfortunately, this does not separate out the body text links.

Define and Agree the “Pillar Pages”

Your internal link audit will only be strategic if you first agree on the most important topics for your business: the ones that you want your business to rank for or be seen as an expert in. You should then select ONE pillar page for each of these main topics. If you can get to a site with one pillar page per topic, your internal link strategy will be cleaner. Internal link audits are a measure of how effectively the site is linking topics in the text through to the related pillar pages.

One mistake that sites make is to create automated content which tries to create a page for almost every keyword variation. This is common when trying to cover off (say) a trade for every town and city in a country. In this event, each city is not really a pillar page! You could set the target for each of these pages to be the town or city in question, but you may not mention that town anywhere else on the website. On the other hand, clever use of maps might be able to show “nearby stores” for each town. This kind of tactic, though, is not considered further in this guide.

Once the pillar pages have been agreed, the contextual link audit has to centre around what percentage of possible internal links have already been created and how many are still to do. Because of this, we have developed an easy to understand new metric, specifically designed for Internal Link Audits, called “Internal Linking Score”.

Introducing “Internal Linking Score”

Given the 2022 study finding that the vast majority of internal linking opportunities are missed, we have developed an Internal Linking Score algorithm:

Mathematical equation for internal linking
Internal linking score equation

For the link audit, the challenge is to find the link opportunities in the first place. This is the “gold” within a link audit. Whilst removing links or checking for broken links is interesting as part of an audit, it is a list of link recommendations that will have an immediate and actionable use.

Now that we have a methodology for scoring a site’s Internal Linking, and tools for visualizing internal links, the missing part of the audit is finding a list of internal links that could be added to the website’s content. These are known as “link opportunities”.

A good tactic for finding link opportunities is to use Google search itself. Before starting, it is assumed that there is already an agreed list of key topics the client wishes to rank for and that these topics have clearly defined target pages. Then for each phrase, do the following:

Search for [Keyword] site:sitename.com on Google. The site: command restricts the search to your client site. The target page really SHOULD already come to the top of the list, but if it doesn’t, then make the client aware that the target page should either be a different page or that the content on the target page needs to become more relevant to the topic.

Look at the remaining pages, in the order they appear on the screen, and find the keyword on each page (I use CTRL-F in my browser in Windows). Do not just link the keyword. Instead, find suitable anchor text, and only add the link if it seems appropriate from the perspective of a user reading that page. As this is an audit, I would suggest making a spreadsheet with the following columns:

  • Target URL
  • Title of Target Page
  • Target Topic [keyword]
  • Source URL
  • Title of Source Page
  • Anchor text you propose to use for the link
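If you would rather build that spreadsheet programmatically, the sketch below writes the same six columns as a CSV file that opens in Excel or Google Sheets. The column labels follow the list above. The example row and the function name are made up for illustration.

```python
import csv
import io

# Column headings taken from the list above.
COLUMNS = [
    "Target URL",
    "Title of Target Page",
    "Target Topic",
    "Source URL",
    "Title of Source Page",
    "Proposed Anchor Text",
]


def write_opportunities(rows, handle):
    """Write link recommendations (a list of dicts) as CSV to an open file."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical recommendation found during the audit.
recommendations = [
    {
        "Target URL": "https://sitename.com/internal-links/",
        "Title of Target Page": "Internal Linking. The Guide.",
        "Target Topic": "internal links",
        "Source URL": "https://sitename.com/blog/site-structure/",
        "Title of Source Page": "Planning your site structure",
        "Proposed Anchor Text": "linking related pages together",
    }
]

# An in-memory buffer is used here; with a real file, open it with newline="".
buffer = io.StringIO()
write_opportunities(recommendations, buffer)
csv_text = buffer.getvalue()
```

Keeping one row per proposed link makes it easy for the client to approve or reject each recommendation individually.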

Other ways to find Link Opportunities

Tools do exist to help find link opportunities at scale. Many, however, try to look up exact anchor text or keyword text matches. These can prove very one-sided, as they tend to only find links with no context. They miss synonyms and tend to lack nuance. It is only more recently that semantic-based missing link tools have come into being.

Of course, it is not necessary to use a tool to create internal links. You can easily create internal links within your content to other pages on your site. However, you will not achieve scale and will not be able to easily recalculate and redistribute these internal links when content is updated. This will mean that you are likely to miss many internal link opportunities that may be open to you. That said, here is a simple step-by-step process for creating internal links within WordPress.

Step 1: Find pages on the site that discuss a particular topic

The best way to do this is to type a keyword into Google followed by “site:yourdomain.com”. So, to find the pages on this site that might be appropriate for internal links for the term “Internal links”, I would type this into Google.
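If you have a long list of topics to check, you can generate the search URLs rather than typing each one. This small sketch only builds the query string described above (keyword followed by the site: operator). It does not fetch or scrape results, and the helper name and domain are placeholders.

```python
from urllib.parse import quote_plus


def site_search_url(keyword, domain):
    """Build a Google search URL for a keyword restricted to one domain."""
    query = f"{keyword} site:{domain}"
    return "https://www.google.com/search?q=" + quote_plus(query)


# Placeholder domain; swap in your own.
url = site_search_url("Internal links", "yourdomain.com")
```

Open each generated URL in your browser and work through the results by hand, as described in the steps that follow.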

Step 2: Select your target page for your search term

Usually, the top result will be the page that you would want to have as your target page for the search term you have chosen because this is the one that Google already believes to be the most relevant. If you are writing new content, then, of course, you may choose the new page instead.

Step 3: Identify where other pages should link to the target page

You should insert the link somewhere around where the term is highlighted by Google in the search results. You do this by…

Step 4: Opening the page in edit mode in WordPress

In the introduction, we stated that we would also look at the Navigational menus and Breadcrumbs. Whilst these are very important, I left them to last because they are easier to visualize and understand.

Audit & Minimize Menu Blocks

Most of us tend to think of a website having one menu structure, but most sites tend to have multiple menu blocks. One along the top is common, but there are usually several other menu blocks, as you can see from this example from the BBC News page.

There are at least 7 separate menus on this page

The question for the SEO audit is when it is appropriate to include any particular menu block and when it should be omitted. In general, “Pillar” pages should seek to reduce the number of menu blocks, whilst “generic” content should be more liberal. This has the effect of making the pillar pages more focussed around their main topic, because menus on other pages will link into the pillar page, giving context, but the pillar page does not reciprocate the menu link back.

Create Template page styles in your CMS

Since different pages may have different menu blocks, the internal link audit should recommend a number of templates that a page can use. These templates will use different menu structures, to help promote this strategy without the content writer needing to be overly concerned about the menu structure. For example:

A Default Post page structure can contain all the relevant menu bars. Unless there is an active reason to make the page a “target” page for SEO, then the more freely the ideas flow through a website, the better.

A Pillar-Post page structure would normally strip out most of the least important menus. A menu for other related pages to the main target topic may still be appropriate and a top-level menu is always useful, but perhaps relegate all other content to a search box. This keeps outbound links more related to the content on the page.

A Vanilla Post Structure with an absolute minimum of menu blocks may be useful if there is absolutely only one desirable call to action for the user.

Two approaches to Breadcrumbs

Breadcrumbs are a special menu type that helps a user to easily navigate up and down a topic funnel. Not all websites use them, but they can be helpful for SEO as they naturally link connected ideas together if constructed sensibly. The two main methodologies are “by category” and “by tagging”. If you have logical areas in your website, then category-based breadcrumbs are often simpler. Tags mean that the content writer (or you) will have to give every page a tag or set of tags that group the content into natural topics. Clicking on the tag (or crawling it with a bot) reveals a list of pages with that tag.

Whichever approach is used, the best practice is to ensure that you give each post only one category or tag and that you ALWAYS give a post a category or tag. Look for all the pages on the site which have been assigned the “default” category and make a list of any that have been incorrectly categorized or tagged. I bet there are a few! These should have the category changed, but if you use WordPress, this can also change the URL when your permalinks include the category! (Check this for any other CMS you may encounter.)

You can use a plugin like “Yoast” or “Redirection” to manage 301 redirects when these are changed, or you can manually force the redirects using .htaccess, Cloudflare or several other ways. The important result is that the old URL does not return a 404 after changing the category or tag. It should 301 redirect to the new page, and there should not be two pages with the same content.
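The category check described above is easy to script once you have an export of posts and their categories. The sketch below assumes a simple mapping of post URL to category names. “Uncategorized” is WordPress’s out-of-the-box default category name, but your site may have renamed it, so treat that value and the sample data as assumptions.

```python
def category_issues(posts, default_category="Uncategorized"):
    """Flag posts that break the 'exactly one meaningful category' rule.

    posts maps each post URL to the list of category names assigned to it.
    """
    issues = []
    for url, categories in posts.items():
        if not categories:
            issues.append((url, "no category"))
        elif len(categories) > 1:
            issues.append((url, "more than one category"))
        elif categories[0] == default_category:
            issues.append((url, "default category"))
    return issues


# Hypothetical export of posts and their categories.
posts = {
    "/blog/what-is-an-entity/": ["Entity SEO"],
    "/blog/hello-world/": ["Uncategorized"],
    "/blog/link-audits/": ["Internal Linking", "Audits"],
    "/blog/draft-ideas/": [],
}
flagged = category_issues(posts)
```

Any post you recategorize as a result should then be checked for a changed URL and a working 301 redirect from the old one.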

In Summary

We have looked at the reasons why Internal Link Audits can help the performance of your website. We have proposed a number of tools you can use to conduct internal link audits and listed some pros and cons of each. Your audit should cover:

Low hanging fruit, including:

  • Removing dead links to 404 pages
  • Minimizing redirects and links to non-canonical versions of pages.

Looking at a gap analysis between existing internal links to pillar pages and potential (missing) links to pillar pages.

In order to quantify and evaluate this, we have introduced the Internal Linking Score.

Providing an overview of the navigational link structure seeing that it plays into natural product or service area groupings.

Providing an overview of the effect of any Breadcrumb links and whether they help to guide the users to the pillar pages in a consistent manner.

Next, read How to automate your internal linking
Or read the full Internal Links guide

 

Internal Linking. The Guide.

Internal Linking is a skill that can lift your site to new heights. This is a comprehensive guide on how to use links to optimize websites. It combines decades of expertise from the InLinks team in developing internal link roadmaps and strategies.

Cover page for guide showing topics linked together

95% of websites fail at internal linking.

A Study of over 6,000 websites.

This is the result of an analysis that we have conducted on more than 5,000 websites across the globe and the reason why we’ve built this guide.

Definition

Internal linking, as opposed to backlinking, is the art and science of interconnecting content within your web site for SEO.

Our definition of Internal Linking at Inlinks.

Even if you have doubts about internal links, you might want to consider that internal linking within Wikipedia is cited as one reason that it does so well in the search engines, according to Lewandowski, D. and Spree, U., 2011. (Ranking of Wikipedia articles in search engines revisited: Fair ranking for reasonable quality? Journal of the American Society for Information Science and Technology, 62(1), pp. 117-132.)

Internal Linking Guide Contents

First, let’s have a look at the main benefits.

Benefits of setting up an internal linking strategy

If you rely on content marketing to support the growth of your website traffic and rankings, then you will know that you need to produce well-written, high-quality posts to please Google.

And as the competition increases, the chances of ranking at the top of Google SERPs reduce. Good content is not enough. This is why you also need backlinks.

Implementing an internal linking strategy will help search engines understand the structure of your site. It will also show what your important pages are.

Here’s a summary of the benefits that internal linking provides to SEO:

SEO benefit | Explainer
User experience | Internal links help your visitors to navigate through your content.
Link juice distribution | If you manage to get backlinks from other websites, internal links will help you distribute the link juice to your important pages.
Time on site | Adding relevant internal links in your content will make your users more likely to discover other related content, improving session duration.
Pageviews | Since relevant inlinks will encourage your visitors to continue their visit, it will mechanically increase the number of page views per session.
Crawl and indexing | A good internal link profile will help Googlebot and search engines better understand your site architecture and help them discover new pages.
Long-tail keywords | Using keyword-rich anchor texts with synonyms and phrases will improve the number of keywords your website is ranking for.
Rankings | As a consequence of the above benefits, your overall rankings will increase when you engage in internal linking optimization.

Featured Resource: 15 benefits of internal linking

How do you build internal links?

There are mainly two types of internal links: navigational and contextual.

Navigational links are the website’s primary navigational structure, the ones you’ll find in the website’s main menu, in the sidebars and footer. You’ll find the same navigational links structure throughout the website most of the time. They’re mainly used to help users navigate to category pages or company information pages.

Contextual links (or editorial links) are embedded in a page’s body text. These links are very useful for SEO as they help PageRank circulate between your pages. Semantically relevant phrases around the link will convey better SEO juice to the target page.

As soon as you’ve set up your navigational links, you need to start building your contextual links. Here is the process to follow:

Step 1: Define your cornerstone content for a given keyword.

This is probably the most critical step. Often, webmasters think that by having lots of web content on or about the same topic, they will rise to the top of search engines. Nothing could be further from the truth if you do not give all that content hierarchy through the links. Some SEOs call a lack of hierarchy “cannibalization”. The search engines see multiple pages on the site that COULD all rank for a given topic. If the MAIN page is not defined, no page has enough clarity or confidence to rank.

Actively decide which page should be the master page for a given phrase or topic. You also decide that the other pages should NOT rank for that topic. Link to the cornerstone content when the topic is mentioned elsewhere.

For more information: How to associate target entities to web pages

Step 2: Find anchor text opportunities

Use your site’s search functionality to find other mentions of those keywords. You can also use the popular Google hack to do this. Search in Google for “Your keyword site:yoursite.com”. (That is to say, the SITE: command within a Google search will limit Google’s search results to the site you specify).

This latter approach is not practical if Google has not yet adequately indexed all the content on your website, so do use your site’s search function if it has one.

Wherever you find your keyword mentioned on the site, link that keyword through to the cornerstone page. This is not as straightforward as it sounds. If your keyword is too specific, you may not find all the mentions in a search. Worse, you may use an increasingly unnatural “anchor text”. (Anchor text is the text that the reader sees when looking at the link on the web page.) Avoid this.

Try to make sense to humans. For example, you may have a cornerstone page about “The Ritz Hotel, London”. You might have the text “Tea at the Ritz” on the page about afternoon tea. You need to decide whether to use the words “The Ritz” or the whole phrase “Tea at the Ritz” in the anchor text. This should depend on whether there is another page about the concept of “Tea” or “Tea at Hotels”. If not, then use the whole phrase.

Step 4: Repeat with varying keywords and synonyms.

Google often understands variations on a theme. For example, “Site, Website, and Domain” may (or may not) mean the same thing. It will depend on the context of their use.

Assuming you are not talking about another meaning for “site” and “domain”, let’s say you have a cornerstone page about “Websites”. You may also want to link mentions of “sites” and “Domains” to the same cornerstone content. Doing so should help Google see that these are similar concepts.
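As a rough illustration of searching for variants rather than one exact keyword, the sketch below scans a page’s text for any of the words you have decided belong to the same cornerstone topic. It is plain string matching, so it cannot tell whether “site” means a website or a building site. That judgement still has to be made by a person (or a semantic tool). The variant list, sample text and function name are assumptions for illustration.

```python
import re


def find_mentions(text, variants):
    """Return every whole-word mention of any variant, in page order."""
    # Longest variants first so "websites" is preferred over "website".
    ordered = sorted(variants, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [match.group(0) for match in pattern.finditer(text)]


# Hypothetical variants that should all link to the "Websites" cornerstone page.
website_variants = ["website", "websites", "site", "sites", "domain", "domains"]
page_text = "Our site is fast. Register Domains here, or build websites."
mentions = find_mentions(page_text, website_variants)
```

Each mention found is only a candidate. Review it in context before turning it into a link.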

Always follow best practices

There are several best practices to follow to make the most of your effort.

Here are the main ones you definitely should follow:

  • Serve the interests of your visitors: link to pages talking about similar subjects
  • Use relevant anchor texts and mix keywords with synonyms
  • Always use dofollow internal links (that is, do not add rel="nofollow" to them)

Featured Resource: Internal Linking Best Practices

To properly audit your internal link structure, you should break down your audit into three main steps:

  1. Diagnose and fix problems
  2. Get an estimate of your internal linking score
  3. Identify your opportunities

1. Diagnose & Fix Issues

To properly optimize an existing internal link structure, you first need to go through potential issues and fix them.

Identify and fix broken links

Broken links go to non-existent resources, typically a 404 error page. You can get a clear overview of these broken links by running a crawl of your website using a tool like Screaming Frog or Sitebulb.

Once you’ve listed all your broken links, you have a few options:

  • Change the link destination to an existing page
  • or add a redirect from the non-existent page to a relevant one
  • or simply delete the link

Make sure your links do not cause content duplication. When done inconsistently, they can create duplicate versions of your pages.

This may happen when some of your links to a page end with slashes while others don’t, or when some start with the www version of the URL and others don’t.

Search engines might consider these pages duplicated if you didn’t set up redirect rules.

The best way to handle this issue is to set up the required redirect rules and fix internal links to ensure the consistency of your internal link structure.
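To find where your own internal links are inconsistent, you can group the link targets from a crawl by a normalized form and list any group with more than one spelling. The sketch below only handles the two cases mentioned above (trailing slash and www). The URLs are examples and the normalization rules are an assumption, not a complete canonicalization scheme.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host + path, ignoring 'www.' and a trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    return host + path


def duplicate_variants(urls):
    """Group link targets that differ only by 'www.' or a trailing slash."""
    groups = defaultdict(set)
    for url in urls:
        groups[normalize(url)].add(url)
    return {key: sorted(found) for key, found in groups.items() if len(found) > 1}


# Hypothetical internal link targets collected from a crawl.
link_targets = [
    "https://www.example.com/seo/",
    "https://example.com/seo",
    "https://example.com/blog/",
]
variants = duplicate_variants(link_targets)
```

For each group reported, pick the one version you want indexed, redirect the others to it, and update the internal links to use that version.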

Optimize your anchor texts

Any contextual link (embedded in a text paragraph) should incorporate meaningful anchor text. Get rid of any “click here” anchor text. Use keywords and synonyms instead.

If some of your images are used for internal linking, as may happen for calls to action, then make sure that you’ve added a meaningful ALT attribute to these images.
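Both checks can be run over a page’s HTML with Python’s built-in parser. The sketch below flags links whose visible text is a generic phrase and image-only links whose image has no ALT text. The list of generic phrases and the sample HTML are assumptions. Extend the list to suit your site.

```python
from html.parser import HTMLParser

# Assumed list of anchor texts that carry no meaning; adjust as needed.
GENERIC_ANCHORS = {"click here", "here", "read more", "learn more"}


class AnchorAudit(HTMLParser):
    """Collect links with generic anchor text or image links lacking ALT."""

    def __init__(self):
        super().__init__()
        self.issues = []    # (href, problem) tuples
        self._href = None   # href of the link currently open, if any
        self._text = []     # text fragments seen inside the open link
        self._images = []   # alt values of images inside the open link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self._href, self._text, self._images = attrs["href"], [], []
        elif tag == "img" and self._href is not None:
            self._images.append((attrs.get("alt") or "").strip())

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._href is None:
            return
        # Collapse whitespace so "Click   here" still matches.
        text = " ".join("".join(self._text).split()).lower()
        if text in GENERIC_ANCHORS:
            self.issues.append((self._href, "generic anchor text"))
        if not text and any(alt == "" for alt in self._images):
            self.issues.append((self._href, "image link without ALT text"))
        self._href = None


# Hypothetical page fragment.
html = (
    '<p><a href="/seo/">click here</a> '
    '<a href="/pricing/"><img src="cta.png"></a> '
    '<a href="/audit/">internal link audit</a> '
    '<a href="/demo/"><img src="demo.png" alt="Book a demo"></a></p>'
)
audit = AnchorAudit()
audit.feed(html)
audit.close()
anchor_issues = audit.issues
```

This looks at every link in the HTML you feed it, so pass in only the main content area if you want to exclude menus and footers.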

Featured Resource: How to do an internal links audit

2. Compute your internal linking score

Whether you have a blog with hundreds of pages or a website incorporating a lot of text content, there is a simple way to know if you fall within the 5% of websites with a perfectly optimized internal link structure.

We’ve developed a simple way to assess this with the internal linking score.

Here is the process to follow:

  1. Sign up for a free account on InLinks, then create a project.
  2. Import your pages to identify the named entities they contain.
  3. Associate your essential pages with the entities they relate to.

Then, in the links tab, you’ll see a bunch of statistics, including your overall score (for the selected pages), and a breakdown of this score topic by topic.

Internal linking score computation

You’ll find more details on this internal links score computation in our study about the state of internal linking, but basically, this score is the ratio between existing, hard-coded internal links and internal links opportunities detected by Natural Language Processing.
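As a rough illustration of that ratio, the sketch below expresses existing links as a percentage of all detected opportunities. It assumes that "opportunities" means existing links plus missing links and that the score is reported on a 0 to 100 scale. The exact formula used by InLinks is described in the study and may differ from this.

```python
def internal_linking_score(existing_links, missing_links):
    """Existing links as a percentage of all detected link opportunities.

    Assumption: total opportunities = existing links + missing links.
    """
    total = existing_links + missing_links
    if total == 0:
        return 0.0  # nothing detected, so there is nothing to score
    return 100.0 * existing_links / total


# Hypothetical topic: 20 links already in place, 80 opportunities still missing.
score = internal_linking_score(existing_links=20, missing_links=80)
```

Computing the same figure per topic, rather than only site-wide, shows which pillar pages are the most under-linked.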

If you manage a website having hundreds of editorial pages, you probably have tons of link opportunities sleeping in your content. Building them out manually will take you days or weeks. Using a plugin to automate internal link building is not a solution, as your links will suffer from exact match anchor texts and lack of context.

Do you want to optimize your internal linking? Turn your words into actionable data.

Think entities, not keywords, and you’ll be able to interlink between posts in different languages. (BTW, if you’re not sure what an entity is, have a look at the Entity SEO guide.)

Now, if you manage to compute your internal links score using InLinks, you also have a list of available link opportunities.

This list is obtained first by extracting named entities from your pages to build a Topic Map of your website, listing all topics (aka entities) mentioned in your content.

Example of Topic Map of an SEO agency website

The second step in identifying your link opportunities is associating your target pages with related entities. A tool like InLinks will tell you exactly where you’ve talked about these corresponding “target” entities and if there are links built to your target pages.

If no link has been made, InLinks will show this and build these missing opportunities by selecting the best anchor text on a given page.

Example of a list of internal links opportunities with proposed anchor texts

The benefits of this entity-based approach are twofold:

  1. Missing link opportunities will be detected using entity synonyms. A keyword-based approach will only bring you a list of opportunities based on an exact match.
  2. Anchor text suggestions will again use entity synonyms, context, and all the knowledge detected in your content to automatically build the missing links, enhancing your topical authority.

Moreover, you can define specific rules, such as “link only to entity A if entity B is also contained in the text”, to sculpt your internal links profile.
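To show the shape of such a rule, here is a minimal sketch that only proposes a link when both the linked entity and a required companion entity were detected on the page. The rule structure, entity names and URL are hypothetical. This is not the InLinks rule syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkRule:
    entity: str      # entity A: the mention we would like to link
    requires: str    # entity B: must also appear on the page
    target_url: str  # page to link entity A to when the rule matches


def allowed_links(page_entities, rules):
    """Return the target URLs whose rule conditions are met on this page."""
    return [
        rule.target_url
        for rule in rules
        if rule.entity in page_entities and rule.requires in page_entities
    ]


# Hypothetical rule: only link "Queen" to the band page if
# "Freddie Mercury" is also mentioned on the page.
rules = [LinkRule("Queen", "Freddie Mercury", "/music/queen-band/")]

rock_page = {"Queen", "Freddie Mercury", "Wembley Stadium"}
royalty_page = {"Queen", "Buckingham Palace"}

rock_links = allowed_links(rock_page, rules)
royalty_links = allowed_links(royalty_page, rules)
```

The companion entity acts as a simple disambiguation check, so a page about the monarchy does not end up linking to the band.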

More information: How to automate your internal linking

4: Check out our Internal Linking FAQs

We have collected a list of the most frequently asked questions (with answers) about internal linking.

Some Case studies

Finally, in case you’re still not sure if internal links are a key factor for SEO success, here are some case studies showing the impact internal links may have on Google rankings:

Next: Read Internal linking: 15 benefits for your website’s SEO

You can define your own entities on your own web pages. When a search engine such as Google comes to see your site, they will see the underlying structured data on the page, which will allow them to easily categorize the content. Webpage schema is a sensible approach to doing this. You do not need to be listed in other RDFs to have entities recorded around the content you create.

Here is full documentation on structured data from Schema.org. However, as this is the beginner’s guide, here are some quick and easy ways to understand, organize and add structured data to your pages.

Looking at your structured data

Unless you are used to working with code, it can be very difficult to understand what structured data is already on each page of your website. Fortunately, Google provides a simple to use structured data testing tool. You do not have to own a website to be able to use the tool. It works on any site.

Google’s Structured Data Tool Output

Tip: Use the structured data tools on your competition

There are many structured data tools. It is likely, though, that you have a few serious competitors in your niche. Some of these may have done a much better job than you at becoming an entity in their own right or at least becoming an expert on entities that you feel you should own. You can take two steps to view a data structure that might work for you:

1: Use Google’s Knowledge Graph Search Tool to establish whether your competitor’s brand or name is already properly defined in the Google Knowledge Graph.

2: Then use the Structured data tool on the best-represented competitors’ web pages to understand how they used structured data.

Many tools claim to automate schema, but very few beyond InLinks will create webpage schema for you automatically. Most schema tools create other types of schema, which is valuable of course, but does not necessarily help Google understand the content on your site. Other schema types may help turn content into a recipe or an event, or describe the author or the organization behind the content. InLinks, on the other hand, describes the meaning of the content itself. It takes the most important entities and tells search engines through schema that the page is ABOUT these main entities, then takes the secondary ideas and tells search engines through schema that the page MENTIONS these secondary entities.

Webpage Schema Example
This JSON-LD Webpage Schema was automatically generated by inlinks.net
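To give a feel for the ABOUT / MENTIONS idea, here is a minimal hand-rolled sketch. It is not the InLinks output or method: it simply assumes the schema.org `about`, `mentions` and `sameAs` properties on a `WebPage`, and the page URL and the choice of entities are invented for illustration.

```python
import json


def webpage_schema(url, about, mentions):
    """Build WebPage JSON-LD from (name, reference_url) pairs."""

    def thing(name, same_as):
        # sameAs points at a page that unambiguously identifies the entity.
        return {"@type": "Thing", "name": name, "sameAs": same_as}

    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "url": url,
        "about": [thing(n, s) for n, s in about],        # main entities
        "mentions": [thing(n, s) for n, s in mentions],  # secondary entities
    }


schema = webpage_schema(
    "https://www.example.com/coronation-chicken-recipe",  # hypothetical page
    about=[("Coronation chicken", "https://en.wikipedia.org/wiki/Coronation_chicken")],
    mentions=[("Mayonnaise", "https://en.wikipedia.org/wiki/Mayonnaise")],
)

# The printed JSON would sit inside a <script type="application/ld+json"> tag.
print(json.dumps(schema, indent=2))
```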

Inlinks can automate this very effectively because it manages its own knowledge base.

  1. Add the URL to inLinks
  2. Associate the URL with one or more primary topics
  3. There is no 3 (if you have already set up inLinks code)

Using Plugins for WordPress

Yoast plugin for WordPress: Yoast’s plugin is one of several that will create your structured data for you. All you have to do is to decide whether your blog should be set up as a person or as an organisation. The plugin then uses your other setup configurations, such as your user profile and social media profiles to build out the structured data.

There are many other plugins that also allow you to manage your structured data on WordPress. A self-updating list is here. You should only integrate plugins with a large (10,000+) user base and one that is being regularly updated to work with your versions of WordPress. It is also probably good to only have ONE plugin trying to manage your structured data at any one time. You can make plugins inactive at the click of a button in WordPress if one plugin is clashing with another.


Whether you have decided that your strategy is to be the entity, be an authority on the entity or play on the edge, your next step is to start marshalling or making your digital assets. Here the thought process is rather different to the old school idea of “content marketing” where you just carry on writing content about a subject and hope it generates organic traffic. The best way to understand this is to return to Google and look more closely at how many different ways digital assets affect something. Let’s choose a very different theme this time… something that might be seen as a bit of a free-for-all “entity”. I’m feeling hungry, so let’s do food:

As with so many entities, Google chooses to have a snippet from Wikipedia in the knowledge box here. There is a very interesting section on the structure of a Wikipedia page in the book referenced earlier, “Entity-Oriented Search”. Wikipedia is surprisingly exact and consistent, making it extremely easy for a knowledge base to create structure out of the content in Wikipedia. There are also many other RDFs (Resource Description Frameworks) based on the Wikimedia organisation. We’ll talk a little about RDFs in general and Wikimedia properties in particular separately.

The point I wanted to make here is that there are many other digital assets on this page other than just recipes for Coronation Chicken. There are Youtube videos, for example. Youtube is an extremely large structured data source, so why would you not try to have a video on how to make Coronation Chicken if you wanted to influence this page? Putting your brand of Mayonnaise in the video is part of the optimisation.

Then there are multiple images on the knowledge box. These can come from anywhere on the web, including your website. Do you see that one for “Curry Ketchup”? Now THAT is finding a niche :) My point is that you cannot optimise for Entity search unless you create all the digital assets that Google is choosing to represent on the page. Images are important. There is a renowned case of one brand taking this too far, by changing all the images on Wikipedia for ones that had their brand on. Unfortunately, Wikipedia did not see the funny side and now the case study makes up the majority of their brand page. Ask someone on Twitter if you want to find the case study.

We now also see ratings on the search results. Ratings are another form of structured data, helping Google to assess the quality of the coronation chicken recipes that it might choose from.
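As an illustration of ratings as structured data, a recipe page might carry markup roughly like the sketch below. It assumes the schema.org `Recipe` and `AggregateRating` types; the author name and the rating figures are invented, not taken from any real page.

```python
import json


def recipe_with_rating(name, author, rating_value, rating_count):
    """Build Recipe JSON-LD carrying an aggregate rating."""
    return {
        "@context": "https://schema.org",
        "@type": "Recipe",
        "name": name,
        "author": {"@type": "Person", "name": author},
        "aggregateRating": {
            "@type": "AggregateRating",
            "ratingValue": rating_value,  # average score
            "ratingCount": rating_count,  # number of ratings behind it
        },
    }


# Hypothetical values for illustration only.
recipe = recipe_with_rating("Coronation Chicken", "Example Cook", 4.6, 128)
print(json.dumps(recipe, indent=2))
```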

Lastly – I notice that Wikipedia thinks coronation chicken was invented by Constance Spry and Rosemary Hume and links TO their entries, which in turn link back. Look at how Wikipedia continually cross-references these facts through internal links (inlinks):

Rosemary Hume’s Wikipedia entry links back to the Coronation Chicken entry

Twitter Content

Once Google has associated an entity with its Twitter profile, a direct search hit on the entity will also return live Twitter posts in the search results! It is therefore important that IF you use Twitter, you properly link to it through structured mark-up and website (and complete the loop by linking back from your profile). On top of this, it is important to make sure the Twitter “tone of voice” is consistent with the rest of your brand story.
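One common way to make that link in mark-up is the schema.org `sameAs` property on your organisation (or person) entry. The sketch below assumes that approach; the brand name, website and profile URLs are placeholders.

```python
import json


def organization_schema(name, website, profiles):
    """Build Organization JSON-LD linking the website to its social profiles."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": website,
        # Each sameAs URL is a profile that represents the same entity.
        "sameAs": list(profiles),
    }


org = organization_schema(
    "Example Brand",                          # hypothetical brand
    "https://www.example.com/",
    ["https://twitter.com/examplebrand"],     # hypothetical profile
)
print(json.dumps(org, indent=2))
```

The mark-up covers only one direction. The loop is completed when the Twitter profile itself lists `https://www.example.com/` as its website.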

Video Content

Whilst posting your own videos on YouTube is a great idea, it is very possible that other people create videos that talk about you, your product or your entity, for example when your staff talk at events. These are also powerful assets, and you can harness them by including them in your video channel if they are on YouTube, or by embedding them in your blog content. In doing so, you help to connect the dots for the knowledge graph.


This is much harder than it sounds, mostly because businesses rarely agree internally, or succinctly enough, on the message they want to portray and the niche they want to dominate. Mary Bowling, a long-time SEO from Ignitor, recommends looking at your own website as if it were your own personal knowledge graph:

Figure 2 Make your site a Knowledge Base of your brand. Reproduced with permission from Mary Bowling.

This approach was also proposed by Jarno Van Driel, known as “@SEOSkeptic” on Twitter, several years ago. However, I think we can step one level beyond this approach. In a modern marketing strategy, you need to communicate with your audience on their terms, not yours. That is to say, some will engage on Twitter, others on Instagram and others on Youtube. Increasingly few will engage directly via your website and this should be factored into your personal knowledge graph.

This means that the relationships (links) should not solely be on your website, but should connect all your digital assets. In addition, Entities in your own personal Knowledge Graph should be extended to other digital assets beyond the website.

This leads to discussing the creation of Digital Assets.


Being the Entity

If you are a business or organisation, then you ARE an entity. Google may not have enough confidence, yet, to know this. Every person on the planet is an entity, but Google does not yet try to distinguish between every version of “Purna Patel” or “Sally Stokes” on the planet… at least not in the search results. In the end, though, Google is collecting large amounts of this data. Very few of us in Western society can avoid having some form of Google login. Google is currently having to address privacy concerns, however. This will mean that being represented in search as an entity will increasingly require you to actively opt in and request such representation. Google announced the closure of Google+ in late 2018 and shut the consumer service down in April 2019, no doubt largely in response to the GDPR regulations in Europe and increased concerns in the US over privacy.

This suggests that Google is being careful to ensure that if you as an individual are represented in Google’s Knowledge Graph (or on the knowledge box in the SERPs), that they are confident this is a result which is not only accurate but also in the public domain and public interest to show. There are many ways to approach becoming a named person or entity, some of which are highlighted in this guide under “RDFs and how to find relevant ones”.

Google My Business (GMB)

As an organisation, your entity can live and flourish in Google, initially through Google My Business. GMB is itself an RDF and a great place for an organisation to start. Being listed in GMB will usually give you the ability to show up as a knowledge box, but this might be only in tight searches. Nevertheless, it acts as a useful launchpad for most organisations.

Becoming connected to an Entity

If you cannot BE the entity, you can still become an entity by association. It is very possible that nobody can own the entity or thing in question. This work is an example. It hopes to show authority in the field of SEO. SEO (or more accurately Search Engine Optimization) is an entity that Google understands. You can see from the knowledge box that writing a book on SEO is probably a great way for Google’s knowledge graph to link you closely with SEO.

Damn! The excellent book by my old sparring partner, Rand Fishkin (co-authored with Eric Enge, Stephan Spencer and Jessie Stricchiola), is right there: “The Art of SEO: Mastering Search Engine Optimization”. The very fact that four authors all known for their SEO are listed on the cover makes them all semantically close to each other. Do you see how these close associations can easily start to create bubbles in a Knowledge Graph? You might understand Entity Search from the ground up and may have built your own knowledge graph as Inlinks has… but unless you are associated closely with the subject matter, the bubble that already exists will cut you out. Don’t get angry… it is simply Google’s equivalent of the echo chambers we see in society and on social media. These echo chambers in themselves are not good or bad, they just are. You simply need to find another way in…

Other RDFs

Wikipedia is by no means the only data source that Google can extract data from…

Write a book and get it published by a reputable publisher

This will get you associated with the book ontologies. If your book has an ISBN number, then this can be independently referenced. (The USA has a similar book referencing system).

Act in a film or Direct a Play

IMDb is a powerful RDF database that is believed to be respected by Google as an authoritative (and therefore trusted) source of information about actors and directors. If you are in a film and listed in the credits, you can get into IMDb and then claim your listing, much like you can with Google My Business. Having this listing will either help you to become an entity in your own right or will give a neutral and verifiable link for the creation of a Wikipedia entry.

Stand for something!

If you are a Congresswoman or a Member of Parliament, it is almost impossible not to be considered as an entity, because all the other people will also be considered entities.

If you are a band, get on the Festival Circuit

Your band may not be an entity, but Reading Rock Festival, Glastonbury and Burning Man certainly are. By getting onto the bill of these established entities, you create independently verifiable information about the band.

A few music festivals likely to be listed in Google’s knowledge graph



Become an entity or an expert on an entity

Your first strategic decision is whether you want to try to BE a fully defined entity in your own right. There has been a move in recent years away from optimizing for keywords and instead simply trying to make your brand stand out from the crowd online. One reason this works well is that your brand can become an entity that you more or less can control (although not always). Once you have an entity on Google’s knowledge graph, what that entity gets up to will be continuously updated in the knowledge graph. If you are a band, for example, then marketing your new album organically becomes MUCH easier than it would be for a record store to market the same album. The knowledge base will simply update, showing the new album. This immediately creates a short vector between the album and the band. The relationship is defined… but the record shop may have a harder time and will need an edge strategy.

Strategies covered in more detail

Below are several competing ideas for semantic SEO. The SEO industry rarely agrees on anything and tends to use the phrase “it depends” way too often for C-suites to take SEOs seriously. In the end, you will need to weigh up the merits and risks associated with each approach and act accordingly.


Getting a Wikipedia entry is fraught with dangers. Inlinks has chosen not to list a specific strategy. Instead, we are bringing in tips and ideas from well-known practitioners in online retrieval, including inLinks users.

One of the challenges is that Wikipedia is controlled ultimately by a very small and not necessarily unbiased group of people. According to Ricardo Baeza-Yates (24 minutes into this lecture), 0.04% of the users of Wikipedia create 50% of all posts. That is considerably more extreme than Facebook or Twitter, also cited in the same lecture.

0.04% is less than 1 in 2,000 users.

I have previously discussed the bias that results from this problem over on my personal blog.

What the Experts Say

I approached Wikipedia editor, Search Engine Journal author and Webmasterradio online radio personality, Jim Hedger to get some thoughts.

“The crux to Wikipedia is to go very slowly and build personal authority. It’s a community driven legacy project with a high sense of purpose and mission. It has a hierarchy of authority but most decisions are made by regular editors who subscribe to a common set of guidelines.

Topical areas people want to edit in become little sub-tribes of networked contacts who have worked the subject material for years and newly interested people. Such communities are built on trust in long term dedication to accuracy and skill. Pretty much everything else revolves around some variation on the rules of educated and civil society.

Cite everything you can. Wikipedia is all about providing new paths for users to follow when examining and evaluating information if there’s a credible source. (There are strict criteria for what can be considered “credible”.)

Don’t try to impose your ideas on other people without first considering their backgrounds and experiences. Wikipedia isn’t social media. There is a definable right and a wrong and a great deal of proof is required to prove oneself right, even on things that are obvious to every observer on Earth.

Forgo: political bias or commercial goals; ‘I mean…’, ‘like’, ‘ok’, ‘so like’, ‘of course’, ‘but’, ‘you know…’. Polite, educated, civil society and all that. We have incredibly complex polite, educated, civil societies already made up of people who have known each other since they went to school together. We all know how things are done amongst people who have lived to learn to trust each other eh? It’s the same, ’tis the same in the whole wide world. Keep your political and / or commercial ideas at an arm’s length from your profile until you know it’s OK to introduce them in subtle ways.

Citations are extremely effective ways of being subtle but, of course, they’re among the most examined elements of newer editors’ work. Images are another way of introducing subversive or commercial content without being completely obvious.

Almost all Wikipedia editors, meta-editors, and admins can read (almost) but fewer will be able to visually contextualize an image unless they are extremely familiar with a topic. Know when to pick your battles. Unless you’re behind the scenes or sit on an American Parents Teachers Association, it’s hard to describe the levels of petty bullshit that fly around in discussions about ideas or controversial edits. You have a finite amount of social capital and community respect. Know how to invest it so it grows rather than spend it so it declines.”

Jim Hedger

Arnout Hellemans, a Dutch search specialist, agrees that you should take small steps and not try to dive in. He also recommends focussing on Wikimedia’s prime data repository, WikiData. Paraphrasing his telephone conversation:

I really became interested in Wikipedia after reading a SEMRush article [by Jacques Bouchard] on how to use Wikidata. The trick is to move slowly and connect dots. Let me start with the example of a hotel, such as the Waldorf in New York. Look up other hotels that have entries on Wikidata and look at the “identifiers” section. [This represents other URLs that represent the same physical entity.] Now make sure that you add similar identifiers to your hotel.

Wikidata is the ‘Linking pin’ between all of the trusted topics of your entity.

Take your time and do not make multiple edits on the same entity. Edit and add identifiers to many other areas and add to the collective repository and not just ones you are directly interested in.

For SMBs and people it is much harder to use Wikidata.

Arnout Hellemans
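The identifier comparison Arnout describes could be roughed out in code as below. This is only a sketch: it assumes the `claims` / `mainsnak` layout of Wikidata's entity JSON, both hotel entries are invented, and the property IDs (P856, P2002, P2003) are used for illustration and should be confirmed on Wikidata before you rely on them.

```python
def external_identifiers(entity):
    """Map property ID -> external identifier values for one entity."""
    found = {}
    for prop, statements in entity.get("claims", {}).items():
        for statement in statements:
            snak = statement.get("mainsnak", {})
            # Keep only statements typed as external identifiers with a value.
            if snak.get("datatype") == "external-id" and snak.get("snaktype") == "value":
                found.setdefault(prop, []).append(snak["datavalue"]["value"])
    return found


def missing_identifiers(mine, peer):
    """Identifier properties the peer entry has and mine lacks."""
    return sorted(set(external_identifiers(peer)) - set(external_identifiers(mine)))


def claim(datatype, value):
    """Helper that builds one statement in the assumed layout."""
    return {"mainsnak": {"snaktype": "value", "datatype": datatype,
                         "datavalue": {"value": value}}}


# Hypothetical entries, not real Wikidata data.
peer_hotel = {"claims": {
    "P856": [claim("url", "https://hotel-a.example/")],  # a website, not an identifier
    "P2002": [claim("external-id", "hotel_a")],
    "P2003": [claim("external-id", "hotel_a_official")],
}}
my_hotel = {"claims": {
    "P2002": [claim("external-id", "my_hotel")],
}}

print(missing_identifiers(my_hotel, peer_hotel))
```

The output is just a to-do list of identifier types worth checking for your own entry. Adding them is still a manual, go-slowly job.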

Information Retrieval expert, Dawn Anderson takes a much more direct approach:

Do something notable I would say. Getting into Wikipedia is not a given for anybody.

Dawn Anderson

This is great advice but demonstrates how challenging it is to warrant comment in what is an encyclopedia. There is often a feeling of anguish at the personal level that you or your favourite entity does not warrant inclusion in Wikipedia, but would you have expected such an entry in previous encyclopedias, such as Encarta or the Encyclopaedia Britannica? If not, then perhaps this is a pause for thought.

Jason Barnard concurs, but adds a cautionary note:

When thinking about a place in the Knowledge Graph, I would say ‘find your springboard’. As Dawn says, what makes you notable (and worthy)? Wikipedia’s rules are a great guide, but are no longer the ‘law’. The opportunities have gotten MUCH wider in the last year. And will get wider still in the years to come.

If you create an entry that is not worthy of a place, or overdo editing on pages you are closely associated with, you will get a warning, or possibly be removed. The job to get a page relisted is very very difficult, and the work to remove a warning is very slow and delicate. Be warned!

Jason Barnard

Greg Niland of GoodROI suggests:

Using the Help a Reporter site can help to build up enough media mentions to support a case for inclusion.

Greg Niland

This looks at solving the problem from a side-on perspective and avoids trying to manipulate or edit Wikipedia directly. The theory is that if you can be cited as an authority in a reputable source, such as the Wall Street Journal or the New York Times, then this significantly increases the odds of a third party using your citation as an independent citation to back up a Wikipedia entry. Note that this strategy is not directly aiming at BEING an entity on Wikipedia, but instead develops LINKS from Wikipedia.

Avoiding editing your own entry

I asked “how should I suggest people deal with the thorny point that if you are connected to the entity/article, you are not supposed to edit the entity/article? This, to me, seems a little misguided as it means by definition, the editors are NOT experts in the content they are editing… but how should a would-be-notable address this?” I received this sage response from Doc Sheldon:

I would suggest going to the “Talk” tab, start a thread there and just tell them that you realize you can’t edit it because you’re connected to it, but lay out the inaccuracies/corrections/additions and ask if someone would please make those changes.

Doc Sheldon

Other Resources:

  • A very good article on getting listed on Wikipedia is offered here.
  • Wikipedia also gives a guide itself here.
  • This SEMRush article from 2015 is also cited above.



When researching how the KG was being updated, it initially took me a long time to find entities that were anything except Wikipedia listings. It turns out, though, that Google has a lot of data that it does not initially reveal in the knowledge graph answer box.

Google’s knowledge graph extrapolates insights gleaned from its data set. Here is an example:

Google made two leaps here. The first was in what I searched for. I searched for “brother” and Google returned a sister! Google knows that “brother”, “sister” and “siblings” are semantically so close that it made the substitution for me (and didn’t even tell me that it had). The second leap is that Google has provided details on a person without their own Wikipedia page.

In fact, there is no specific entity for Kashmira Cooke anywhere in the Wikimedia set of sites, if we use “Wikidata.org” as a measure:

How did Google get to this level of confidence? Google uses content to add to existing entries and, in the process, creates new relationships. Each “triple”, as described in an earlier section, creates two entities. So in this case, Google felt it could trust the content on Wikipedia, which gives several triples in just this section:

(From the Wikipedia page for Freddie Mercury)

Now Google knows:

  • Freddie Mercury (is the brother of) Kashmira Bulsara
  • Kashmira Bulsara (is a type of) Person
  • Kashmira Bulsara (is the same as) Kashmira Cooke
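To make the idea concrete, the three triples above can be held as simple (subject, predicate, object) tuples. The sketch below is my own toy illustration, not a description of how Google stores or resolves entities: it lists the nodes the triples introduce and treats “is the same as” as an alias so that both names lead to the same facts.

```python
SAME_AS = "is the same as"

triples = [
    ("Freddie Mercury", "is the brother of", "Kashmira Bulsara"),
    ("Kashmira Bulsara", "is a type of", "Person"),
    ("Kashmira Bulsara", SAME_AS, "Kashmira Cooke"),
]


def entities(triples):
    """Every subject and object becomes a node, in first-seen order."""
    seen = []
    for subject, _, obj in triples:
        for node in (subject, obj):
            if node not in seen:
                seen.append(node)
    return seen


def alias_map(triples):
    """Map each alias to the name it was declared the same as."""
    canonical = {}
    for subject, predicate, obj in triples:
        if predicate == SAME_AS:
            canonical[obj] = canonical.get(subject, subject)
    return canonical


def facts_about(name, triples):
    """Return non-alias triples that involve the entity under any of its names."""
    canonical = alias_map(triples)
    target = canonical.get(name, name)
    return [
        (s, p, o)
        for s, p, o in triples
        if p != SAME_AS and target in (canonical.get(s, s), canonical.get(o, o))
    ]


print(entities(triples))
print(facts_about("Kashmira Cooke", triples))
```

Looking up the married name returns the facts recorded under the maiden name, which is the kind of joining-up that lets a knowledge box appear for a person with no Wikipedia page of their own.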

In fact, Google can then carry on collecting information about the new entity. Put “Kashmira Cooke” into Google and you get a pretty solid looking knowledge box.

What this teaches SEOs

You do not NEED to have a Wikipedia page to get your own entity in Google’s Knowledge Graph. Even so, it very much helps to be related (in this case quite literally) to an entity existing in Wikipedia. Have a good think about the entity you would LIKE to get listed in Google’s knowledge graph. Does it have any close relationships with any listings in Wikipedia? Does the person running that entity have a famous brother/sister/father/mother? If so, that person might get listed in Wikipedia as related to an existing entity. From there, they have their own entity. After this, you can possibly use schema to help Google understand that this entity runs the entity you wish to get listed.
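As a rough sketch of that last step, schema along the following lines could state both relationships. It assumes the schema.org `sibling` property on `Person` and the `founder` property on `Organization`; every name and URL is a placeholder, and nothing here guarantees that Google will act on the markup.

```python
import json

PERSON_ID = "https://www.example.com/#founder"  # hypothetical identifier

graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Person",
            "@id": PERSON_ID,
            "name": "Sam Example",
            # The well-known relative, tied to their existing reference page.
            "sibling": {
                "@type": "Person",
                "name": "Famous Example",
                "sameAs": "https://en.wikipedia.org/wiki/Famous_Example",  # placeholder URL
            },
        },
        {
            "@type": "Organization",
            "name": "Example Venture",
            "url": "https://www.example.com/",
            # Points back at the Person node above by its @id.
            "founder": {"@id": PERSON_ID},
        },
    ],
}

print(json.dumps(graph, indent=2))
```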

Hire a Chair / Patron

Not all of us have the luxury of a famous brother or sister, but Princess Anne has nine pages of charities that she supports. These allow Google to make the connection, although they do not in any way GUARANTEE it. Leuchie Forever Fund is a charity supported by the Princess Royal. As of the writing date, this charity did not have an entity, but the connection offers a potential path for the enterprising SEO to develop.

Who says that the Old School Tie network is dying out in the age of automation?

Start with a Unique Word to Brand your Entity

Google would have had a lot more difficulty in making these relationships if Freddie Mercury was not a unique name and if his surname had not been “Bulsara”. Uniqueness helps the KG reach levels of confidence faster. I am not suggesting a change of name will guarantee success, but it might be a consideration if you are just starting out and have not yet settled on a strategy.

Google is an agnostic White Man

This might be a little contentious, but “brother” and “sister” both have different meanings in black and religious communities. Google has connected these words so closely with the word “siblings” that its algorithm may have become closed to other interpretations of these words. This may emanate from the types of people involved in curating the initial seed set. This bias is a recognized problem in the building of Knowledge graphs.

There are also other databases that Google considers beyond Wikipedia… let’s look at a few approaches to getting into these…
