The following guest article was written by Stan de Jesus Oliveira, who specializes in writing blog articles on advanced and semantic SEO.
You can find his SEO blog (in French) on: https://createur2site.fr/seo/
InLinks is a powerful tool that has become essential for both me and my clients. However, one ongoing challenge is retrieving new URLs, as well as those we haven’t yet introduced to the tool. Although InLinks’ internal tool is useful for detecting links, it sometimes misses pages or fails to recognize them.
Faced with this issue, I looked for a faster and more effective solution to ensure that all our URLs are accounted for. In this article, I’ll share a simple yet powerful script to help overcome this challenge.
Knowing how many pages remain to be added also helps you choose the appropriate InLinks plan. Although InLinks offers many impressive features, it unfortunately does not provide this information directly.
A Python script
I chose to use Python to identify the missing pages. Known to many technical SEOs, Python is both powerful and easy to use. Moreover, thanks to Google Colab, we can run Python code in the browser without having to install or execute anything on our own machine.
Step 1: Extracting HTML from InLinks
To start, retrieve the HTML code from your InLinks interface. Head to the “websites pages” view, where the list of links is displayed. Once on the page, inspect the element with your browser’s developer tools and copy the HTML code contained in the <html> element.
Paste the code into a .html file. To simplify the procedure, name the file “source-code-inlinks.html”.
Why is this step necessary? The goal is to capture every link present on the page: the script parses the entire code and extracts only the URLs of your site. This is a significant time saver, because the URLs appear truncated in the interface. Without this method, you would have to click on each link individually and note the URL displayed on the right side, which is tedious when a site has several hundred pages.
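To make the idea concrete, here is a minimal sketch of this extraction step. It is not the code from the Colab notebook: the names LinkCollector and extract_site_urls are hypothetical, and it assumes that the full URLs are stored in the href attributes of <a> tags in the copied HTML, which may not match the actual InLinks markup.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it encounters."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value.strip())


def extract_site_urls(html_text, site_url):
    """Return the set of absolute links in html_text hosted on site_url's domain."""
    domain = urlparse(site_url).netloc
    collector = LinkCollector()
    collector.feed(html_text)
    # Keep only links whose host matches the site; relative links are ignored.
    return {href for href in collector.hrefs if urlparse(href).netloc == domain}


# Small inline sample standing in for the real InLinks export (assumed markup).
sample = (
    '<a href="https://example.com/blog/post-1/">example.com/blog/po...</a>'
    '<a href="https://other.org/">other</a>'
)
print(extract_site_urls(sample, "https://example.com"))
```

With the real file, you would read “source-code-inlinks.html” with open(..., encoding="utf-8") and pass its contents together with your own site URL.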
Step 2: Run the script
- Go to the Google Colab link: https://colab.research.google.com/drive/1P44tQzZXEyEZGiTgpDehABlmAYIJP1bR?usp=sharing
- Launch the Python script
- Enter your URL
- Upload the InLinks HTML file you just created.
Step 3: Compare with your sitemaps
You don’t have to do anything.
After extracting the links from the InLinks file, the script will compare these links with those in your sitemaps (post-sitemap.xml and page-sitemap.xml).
If your sitemaps use different names or locations, you can modify the sitemap links directly in the Google Colab notebook.
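For readers curious about what collecting sitemap URLs can look like, here is a rough sketch using only the Python standard library. It is an illustration rather than the notebook’s actual code: the function names are hypothetical, and it assumes that post-sitemap.xml and page-sitemap.xml sit at the root of the site and follow the standard sitemaps.org format.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap_locs(xml_data):
    """Return the set of page URLs found in the <loc> elements of a sitemap."""
    root = ET.fromstring(xml_data)
    return {loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text}


def fetch_sitemap_urls(site_url, sitemap_names=("post-sitemap.xml", "page-sitemap.xml")):
    """Download each sitemap (assumed to live at the site root) and merge the URLs."""
    urls = set()
    for name in sitemap_names:
        sitemap_url = site_url.rstrip("/") + "/" + name
        with urllib.request.urlopen(sitemap_url, timeout=30) as response:
            urls |= parse_sitemap_locs(response.read())
    return urls
```

Changing the sitemap_names argument is the equivalent, in this sketch, of editing the sitemap links in the notebook.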
Step 4: Displaying results
The script will then display the list of links present in your sitemaps but missing in InLinks.
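The comparison itself boils down to a set difference. The sketch below shows one possible way to do it, with hypothetical helper names. It assumes that a trailing slash is the only formatting difference between the two sources; the real script may normalize URLs differently.

```python
def normalize(url):
    """Strip surrounding whitespace and any trailing slash before comparing."""
    return url.strip().rstrip("/")


def find_missing_urls(sitemap_urls, inlinks_urls):
    """Return, sorted, the sitemap URLs that have no match among the InLinks URLs."""
    known = {normalize(url) for url in inlinks_urls}
    return sorted(url for url in sitemap_urls if normalize(url) not in known)


# Illustrative data only; in practice both sets come from the earlier steps.
missing = find_missing_urls(
    {"https://example.com/a/", "https://example.com/b/"},
    {"https://example.com/a"},
)
print(f"{len(missing)} page(s) missing from InLinks:")
print("\n".join(missing))
```

The count printed on the first line is also the number to keep in mind when choosing an InLinks plan.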
Once you have the missing links, simply copy and paste them into the InLinks dashboard.
That’s all!
By simplifying the discovery and counting of missing URLs, we can get the most out of our work on InLinks. Put this script to the test and let technology ease your SEO approach. Happy optimizing to everyone!