Before we dive in, I want to cover – at a high level – our general philosophy on link analysis and planning. Although it should become obvious as you read on, it ought to be made clear upfront, as it helps us create a linear process for building backlink campaigns.
We believe, as with anything pertaining to SEO, that reverse engineering competitors will always remain not only a crucial step but the most logical strategy to inform an action plan. In Google’s black-boxed world, the few clues we get come in the form of patents, quality rater guideline docs, resources they publish, etc.
These serve as great resources for SEO testing ideas. But, in my opinion, we shouldn’t take any of them at face value. We have no idea if Google is still using protocols from a 15-year-old patent in its current document ranking algorithms. So we inform ourselves using these tidbits, and we test. The SEO Mad Scientist was founded to act as a detective, using these clues as one of a few information vehicles that inspire tests.
This is a thin abstract knowledge layer that should serve as a sliver of your SEO campaign building strategy.
Then, we have competitive analysis.
I’m going to make a declaration that really does not have a counterargument, at least not one of sound logic. Reverse engineering what’s working in a SERP is the strategy everyone should use to influence what optimizations you perform. There really isn’t a better way.
To simplify this statement into the most palatable example, let’s take a trip back to seventh-grade algebra. Solving for ‘x,’ or any variable, requires you to look at what constants exist and then perform some simple orders of operation to discover the value of said variable. We can look at what our competitors are doing, the topic coverage of their pages, the links they build, keyword densities, etc.
Now, if you are collecting hundreds or thousands of pieces of correlative data, the majority of it is going to be unhelpful. The benefit of tracking larger swaths of data like this is being able to spot when certain factors shift from merely correlating with rankings to looking causative once they are tuned. For most people, a much more concentrated list of best-practice items to reverse engineer will serve just as well.
The final layer of this strategy is to outperform. As macro as it may seem, especially in extremely competitive SERPs where it could take years to match the top competitors, building to be in parity with sites in the top spots is phase one.
Once there, the idea is to go above and beyond to feed Google proper signals that continue to push rankings and allow you to create a stronghold at the top of the SERPs. Proper signals and “best practice” items unfortunately can come down to SEO common sense.
I hate typing that because it thrusts us into a realm of subjectivity. It takes experience, testing, and some notches in the ole belt of SEO success that builds the confidence to identify where your competitors went wrong and how to address that during the planning stages.
5 Steps to Understanding Your SERP Ecosystem
Exploring the ecosystem of websites and links that power a SERP can offer us a smorgasbord of actionable information invaluable to a link plan. In this section we will work on organizing this data into a digestible system that will allow us to identify valuable patterns and insights for our campaign.
I would like to take a quick moment to expand on the thought process that led us to organizing the SERP data in this manner. You’ll find in the next section our protocol for taking a very close look at the top few competitors, but there is a story being told if we take a step much further back. For example:
Start performing some queries in Google and you are quickly met with millions of results, sometimes over 500 million. This means that while we mostly dedicate our attention to analyzing the top few websites, one could argue the links pointing to even the top 100 results are statistically significant, or at least they can be, assuming they pass the litmus test of not being complete spam or junk.
I want insight into the large swath of links powering the top-ranking sites within the sea of documents Google has cached for these queries. With that data, here are just a few of the things we can accomplish.
1. Find prominent links powering your SERP ecosystem
In this case a prominent link is defined as a link that continues to pop up in our competitors’ backlink profiles. As you can see from the image below, where we are looking at a smaller number of websites in the ecosystem for demonstration purposes, there are links that point to almost every site in the top 10.
Analyzing more competitors, if you so wish, will uncover more intersections like the one above. I love this strategy and it’s supported by solid SEO theory from a couple of sources which I will cite below.
- https://patents.google.com/patent/US6799176B1/en?oq=US+6%2c799%2c176+B1 – This patent expands on the original PageRank concept by factoring in topics or context, effectively acknowledging that different clusters (or patterns) of links matter differently depending on the subject area. It’s an early example of Google refining link analysis beyond a single global PageRank score, suggesting the algorithm detects patterns of links among topic-specific “seed” sites/pages and uses that to re-rank or refine scoring.
Relevant Quote Excerpts
Abstract:
“Methods and apparatus consistent with the present invention compute multiple importance scores for a document… The multiple importance scores may be biased by different distributions so that each importance score is particularly suited for documents of a particular topic. … The importance scores may then be combined with a measure of similarity to a query to provide a rank for the document.”
- Implication: Google identifies certain “topic” sets (or clusters of sites) and uses link analysis within those sets to produce “topic-biased” scores.
- While this doesn’t outright say “we look kindly on link patterns,” it confirms Google looks at how and where links appear, segmented by topic—a more nuanced approach than a single global link factor.
Column 2–3 (Summary), paraphrased:
“…A plurality of ‘topic vectors’ are established. Each topic vector is associated with one or more authoritative sources… Documents linked from these authoritative sources (or within these topic vectors) may receive an importance score that reflects that affiliation.”
- Implication: If the link profile of a page aligns with certain authority “seeds” or typical linking patterns for a topic, that page may be deemed more relevant/important for that topic.
- https://ftp.cs.toronto.edu/pub/reports/csrg/405/hilltop.html – While not a Google patent, this paper was written by Krishna Bharat, who later joined Google and led the Google News efforts.
Hilltop is an algorithm that attempts to find “expert documents” for a topic—pages recognized as authorities in a certain field—and sees who they link to. These linking patterns can pass authority to other pages. While not phrased as “Google sees a pattern of links and likes it,” the underlying concept is that if a set of recognized experts commonly link to the same resource (pattern!), it’s a strong signal.
Relevant Quote (from original paper)
“An expert document is one that is about a specific topic and has links to many non-affiliated pages on that topic… The Hilltop algorithm identifies and ranks documents that links from experts point to, boosting documents that receive links from multiple experts…”
Implication: If multiple experts in a niche link to a particular site or page, that is recognized as a strong (pattern-based) endorsement.
While Hilltop itself is older, it’s believed aspects were folded into Google’s broader link analysis algorithms. This concept of “multiple experts linking similarly” is effectively Google looking at a pattern of backlinks.
Although these two are specific to links, there is a lot of language in the patents that alludes to Google “learning” what a quality SERP ought to look like based on quality signals it finds on other documents that are relevant to yours.
I want to always look for positive prominent signals that repeat themselves during competitive analysis and then take advantage of those opportunities whenever possible.
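If you want to automate this check, the counting itself is simple once you have each competitor’s referring domains exported. Below is a minimal Python sketch; the competitor and referring domain names are made up, and in practice you would load these sets from your own backlink exports.

```python
from collections import Counter

# Hypothetical referring-domain sets per top-ranking competitor
# (in practice, load these from your backlink exports).
competitor_ref_domains = {
    "competitor-a.com": {"news-site.com", "niche-blog.com", "directory.org"},
    "competitor-b.com": {"news-site.com", "forum-thread.net", "directory.org"},
    "competitor-c.com": {"news-site.com", "directory.org", "press-release.io"},
}

# Count how many competitors each referring domain points to.
counts = Counter()
for domains in competitor_ref_domains.values():
    counts.update(domains)

# "Prominent" links: referring domains that appear across most of the SERP.
threshold = 2  # tune to the size of your competitor set
for domain, hits in counts.most_common():
    if hits >= threshold:
        print(f"{domain} links to {hits} of {len(competitor_ref_domains)} competitors")
```

Anything that clears the threshold goes straight onto the shortlist for closer manual review.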
2. Discovering Outlying Link Opportunities Using Degree Centrality
The obvious list of links to build toward competitive parity comes from the top several ranking websites. Manually parsing through dozens of backlink downloads from Ahrefs is painfully tedious work, and even handing it off to a VA or staff member creates a contested queue of never-ending work.
Ahrefs does allow 10 competitors to be entered in its link intersect tool, so if you are a subscriber (in my opinion Ahrefs still outperforms all of the other link intelligence tools on the market, never mind the spotty PR issues they have had in the recent past), you can certainly go that route if you are comfortable with that depth and want to avoid the extra work.
For us, as I mentioned earlier, our interest lies in going far enough outside the list of links that every other SEO is building to achieve parity with the top handful of websites. This builds us somewhat of a moat early in the planning phase as we look to stir up the SERPs.
So, we set a few filters in our SERP Ecosystem and look for “opportunities,” defined as links that our competitors have and we do not.
We instantly find the orphaned nodes on the network graph. Sorting the table by DR (I am not in love with third-party metrics, but they do help for quickly finding nuggets in a large list of URLs), we find some absolute powerhouse links that can be added to our outreach workbook.
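For those who prefer to script it, here is a sketch of the same idea using degree centrality on a simple graph of referring domains and competitors. The domains are placeholders, and the edge list is something you would build from your own normalized backlink data.

```python
import networkx as nx

# Hypothetical graph: referring domains connected to the competitors they link to.
edges = [
    ("ref-domain-1.com", "competitor-a.com"),
    ("ref-domain-1.com", "competitor-b.com"),
    ("ref-domain-2.com", "competitor-a.com"),
    ("ref-domain-2.com", "competitor-b.com"),
    ("ref-domain-2.com", "competitor-c.com"),
    ("ref-domain-3.com", "competitor-c.com"),
]

G = nx.Graph()
G.add_edges_from(edges)

competitors = {target for _, target in edges}
our_ref_domains = {"ref-domain-3.com"}  # domains that already link to us

# Degree centrality: the share of all other nodes a node is connected to.
centrality = nx.degree_centrality(G)

# "Opportunities": referring domains connected to multiple competitors but not to us,
# sorted by how central they are in the ecosystem.
opportunities = sorted(
    (
        (node, centrality[node], G.degree(node))
        for node in G.nodes
        if node not in competitors and node not in our_ref_domains
    ),
    key=lambda item: item[1],
    reverse=True,
)

for domain, score, competitor_count in opportunities:
    print(f"{domain}: degree centrality {score:.2f}, links to {competitor_count} competitors")
```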
3. Control and Manage your Data Pipelines
Once you have your SERP ecosystem set up, adding to it is a breeze: pulling new competitors and links into the network graphs, removing spam links you want to ignore, blending competitor data from different related queries into a more macro database of backlinks, and much more.
Organizing and filtering your data inside of your own environment is the first step to being able to create scalable outputs and operate at a level of detail that your competitors simply cannot.
Moving data in and out and creating internal automations while introducing additional layers of data analysis can inspire the innovation of novel concepts and strategies.
Make it your own and you will find there are many more use cases for a setup like this, far too many to cover in this blog.
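As one small example of what owning the pipeline looks like, here is a hedged sketch of blending two backlink exports into a single schema with pandas. The file names and column mappings are assumptions for illustration; swap in whatever headers your exports actually use.

```python
import pandas as pd

# Assumed column mappings for two different export formats; adjust to your tools.
COLUMN_MAPS = {
    "ahrefs": {"Referring page URL": "source_url", "Target URL": "target_url",
               "Domain rating": "authority", "Anchor": "anchor_text"},
    "other_tool": {"Source": "source_url", "Destination": "target_url",
                   "Authority Score": "authority", "Anchor Text": "anchor_text"},
}

def load_export(path: str, tool: str) -> pd.DataFrame:
    # Read one export and normalize it to a shared schema.
    df = pd.read_csv(path).rename(columns=COLUMN_MAPS[tool])
    df["tool"] = tool
    return df[["source_url", "target_url", "authority", "anchor_text", "tool"]]

# Blend exports and drop duplicate links reported by more than one tool.
frames = [load_export("ahrefs_export.csv", "ahrefs"),
          load_export("other_export.csv", "other_tool")]
backlinks = (pd.concat(frames, ignore_index=True)
               .drop_duplicates(subset=["source_url", "target_url"]))
print(backlinks.head())
```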
4. Identify Mini Authority Websites using Eigenvector Centrality
In graph theory eigenvector centrality implies that nodes (websites) become more important as they are attached to other nodes deemed important. The more important the neighbors of the node, the more important we deem that node to be.
This outer ring of nodes shows six of the sites that link to a significant number of our well-ranking competitors. The kicker is that the site they link to (the node in the center) links to a competitor WAY down the SERPs. With a DR of 34, there is a good chance it gets lost in the filtering as we try to locate the “best” links we want to attempt to acquire.
The only caveat with this method is that manually clicking through your table is NOT the best way to identify these opportunities. You will want a script that crawls your data with a configuration that sets a rule for how many “important” sites ought to link to a website before it is considered significant enough to make your outreach list.
Not beginner-friendly, but as stated above, once the data is in your ecosystem, writing the script to find these little gems takes an insignificant amount of time.
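Here is a minimal sketch of that script using NetworkX, assuming your ecosystem has already been exported as an edge list. The sites, the undirected treatment of the graph, and the cutoff are all illustrative.

```python
import networkx as nx

# Made-up stand-in for a SERP ecosystem export, treated as undirected for simplicity.
G = nx.Graph()
G.add_edges_from([
    ("expert-1.com", "competitor-a.com"),
    ("expert-2.com", "competitor-b.com"),
    ("expert-3.com", "competitor-a.com"),
    ("expert-1.com", "mini-authority.com"),
    ("expert-2.com", "mini-authority.com"),
    ("expert-3.com", "mini-authority.com"),
    ("mini-authority.com", "competitor-far-down-serp.com"),
])

# Eigenvector centrality rewards nodes connected to other well-connected nodes,
# so a modest-DR site linked by several strong linkers still floats to the top.
scores = nx.eigenvector_centrality(G, max_iter=1000)

MIN_SCORE = 0.3  # arbitrary cutoff for this toy graph; tune against your own data
for site, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    flag = "  <-- candidate" if score >= MIN_SCORE else ""
    print(f"{site}: {score:.3f}{flag}")
```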
5. Identifying and Taking Advantage of Disproportionate Competitor Link Distribution
While the concept isn’t novel, being able to look at 50-100 websites in the SERP and identify the pages on their sites that attract the most links is a great way to mine some super valuable information.
Mind you, we could look only at the “top linked pages” report for a site, but you are not going to find nearly as much useful information there, especially on well-SEO’d websites. You’ll find some regular link building to the homepage and the main service or location pages. Ideally, we are looking for the pages that have a disproportionate amount of links. To do this programmatically you’ll need to filter these opportunities out using some applied math, with the specific model really being up to you. This can be a bit difficult because the number of backlinks you want to treat as an outlier can change quickly based on the scale of the numbers. For example, a 20% concentration of links on a website with only 100 links built is a radically different scenario from one with 10 million links built.
A single page getting 2 million links while hundreds or even thousands of other pages make up the remaining 8 million means that page should be reverse engineered. Did it go viral? Is it a tool or free resource? Something is attracting that volume of links.
A page with 20 links sitting on a site where 10-20 other pages account for the remaining 80 could very well be a typical local website where an SEO built links more heavily to a target service or location URL.
Just because a score isn’t definitionally an outlier doesn’t mean it isn’t potentially a URL of interest, and vice versa.
Having said that, I lean heavily toward z-scores. Standard scoring is done by subtracting the mean (the sum of backlinks across all pages on the website divided by the number of pages on the site) from the individual data point (the number of backlinks of the page you’re scoring), then dividing that by the standard deviation of the dataset (the dataset being the backlink counts for every page on the website). In short: z = (page backlinks - mean) / standard deviation.
Don’t get bogged down by these terms if you skipped a few stats classes. It’s not difficult. The z-score formula is simple enough. You can use this standard deviation calculator for your manual testing. Just plug your numbers in to gather some results and get a feel for your outputs. If you’re sold you can build z-score segmentation into your workflow and display the results in your data visualization tool.
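If you would rather script it than use a calculator, here is a small sketch of z-scoring a site’s per-page backlink counts. The page paths and counts are made up, and the cutoff is something you would tune to your own data (remember the scale caveat above).

```python
from statistics import mean, stdev

# Illustrative per-page backlink counts for a single competitor site.
page_backlinks = {
    "/": 120,
    "/services/": 45,
    "/blog/free-tool/": 900,   # suspiciously heavy; worth investigating
    "/contact/": 12,
    "/blog/guide/": 60,
}

counts = list(page_backlinks.values())
mu = mean(counts)
sigma = stdev(counts)

# z = (x - mean) / standard deviation
z_scores = {page: (count - mu) / sigma for page, count in page_backlinks.items()}

# Flag pages well above the site's average; with small page counts an extreme
# page inflates the standard deviation, so the cutoff needs tuning.
CUTOFF = 1.5
for page, z in sorted(z_scores.items(), key=lambda item: item[1], reverse=True):
    flag = "  <-- disproportionate" if z > CUTOFF else ""
    print(f"{page}: z = {z:.2f}{flag}")
```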
With this data you can start investigating why certain competitors are acquiring atypical amounts of links to specific pages on their site and use it as inspiration to create content, resources, tools, etc. that people provably love to link to.
There is a lot more utility here, which definitely justifies taking the time to put together a process for looking at larger amounts of link data. The opportunities you can take advantage of are nearly endless.
Getting Started
First, you will need a source (or sources) for backlink data. We are big fans of Ahrefs, as their data seems to consistently outperform their competitors’. With that in mind, blending data from other tools is a great idea if you have the capability to do so.
Unearthing links in one platform that you will not find in any of the others is pretty commonplace, but you will need to consider your budget and your willingness to ingest the data and normalize it into one format.
Next you will need a data visualization tool. There is no shortage of data visualization tools that you can employ to accomplish our goal. Here are a few resources to help you choose one:
https://www.toptal.com/designers/data-visualization/data-visualization-tools
https://www.simplilearn.com/data-visualization-tools-article
https://www.reddit.com/r/dataanalysis/comments/171h8g6/freeopensource_data_visualization_tools/
If you decide to choose one from the lists above, jump onto YouTube for some quick tutorials or prompt your favorite AI tool to build you a curriculum you can follow. There are also some great Python libraries with handy visualization tools. If there is enough interest, we could also look at open sourcing our setup.
I digress.
Hiring a freelancer to build one of these for you is also viable, as it is not a complex development endeavor.
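If you do go the Python route, a bare-bones network graph takes only a few lines with NetworkX and matplotlib. The edge list below is illustrative; you would feed in your own referring domain and competitor pairs.

```python
import networkx as nx
import matplotlib.pyplot as plt

# Placeholder edges: referring domains connected to the competitors they link to.
edges = [
    ("ref-domain-1.com", "competitor-a.com"),
    ("ref-domain-1.com", "competitor-b.com"),
    ("ref-domain-2.com", "competitor-b.com"),
    ("ref-domain-3.com", "competitor-c.com"),
]

G = nx.Graph()
G.add_edges_from(edges)

pos = nx.spring_layout(G, seed=42)  # fixed seed for a repeatable layout
nx.draw_networkx(G, pos, node_size=600, font_size=8, node_color="#9ecae1")
plt.axis("off")
plt.tight_layout()
plt.savefig("serp_ecosystem.png", dpi=200)
```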
Narrowing our Focus for Competitive Parity
The first phase of our link planning exercise saw us gathering larger sets of data and applying filters to produce a comprehensive list of backlinks we can attempt to build for our target URL(s). The next phase narrows our focus to anchor text.
Determining Anchor Text Plan
When it comes to looking at anchor texts, I like to work with the top 3-5 ranking URLs for the query I am after. You can definitely circle back to your SERP ecosystem above and create a map of all anchor texts that point to the corpus of websites you are tracking, but for this exercise we are going to focus on what Google put at the top of their SERP.
Keep in mind we are looking at two different views below. One is a raw look at the link gap between pages and the anchor texts that point to them and the other is a link building schedule.
Once the sites are added into the system, you can choose to remove outliers entirely by simply not selecting them as part of your math. I make this judgment through a manual review, as a backlink-heavy competitor can appear authoritative but, on closer inspection, have a heavily spammed backlink profile.
Building that level of volume, especially when you are building quality links, to achieve parity with a spammed profile makes little sense.
We built a system that classifies anchor text so that when we start to build links we are not disrupting the pattern of anchors with which the top websites achieved their rankings. We still take anchor text seriously, not just because of internal testing that validates its importance, but because Google also holds patents that assign weight to anchors:
https://patents.google.com/patent/US7260573B1/en – Personalizing Anchor Text Scores in a Search Engine: This patent outlines a method where search engines calculate personalized page importance scores for documents based on user-specific parameters. These scores are combined with information retrieval metrics to generate personalized rankings, with anchor text playing a significant role in this process.
https://patents.google.com/patent/US8577893B1/en – This patent describes analyzing the context surrounding a hyperlink, including the anchor text, to assess a document’s relevance and combat manipulative practices like anchor text spamming. By evaluating the text adjacent to links, the system assigns context identifiers that influence document ranking.
https://patents.google.com/patent/US6285999B1/en – This patent introduces a technique that uses anchor text of links to a document to characterize its relevance, rather than relying solely on the document’s content. By analyzing anchor text descriptions pointing to a page, the system assigns a rank based on how well search query terms match these descriptions.
While link classification is done programmatically, we built a moderation dashboard so we can double-check the outputs and modify them if needed. We originally built this circa 2018, have been refining the logic ever since, and have run immense amounts of data through it; we usually find the results to be on point, but we always want the flexibility to match the top performers.
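To make the idea concrete, here is a rough, rule-based sketch of anchor classification. The categories, brand terms, and target keywords are assumptions for illustration; our production system uses more granular logic plus the moderation pass described above.

```python
import re

# Hypothetical brand terms, target keywords, and generic anchors.
BRAND_TERMS = {"acme", "acme plumbing"}
TARGET_KEYWORDS = {"emergency plumber", "plumber dallas"}
GENERIC_ANCHORS = {"click here", "read more", "this site", "here", "website"}

def classify_anchor(anchor: str, target_url: str) -> str:
    text = anchor.strip().lower()
    if not text:
        return "empty/image"
    if re.match(r"^https?://", text) or text in target_url.lower():
        return "naked URL"
    if text in GENERIC_ANCHORS:
        return "generic"
    if any(brand in text for brand in BRAND_TERMS):
        return "branded"
    if text in TARGET_KEYWORDS:
        return "exact match"
    if any(word in text for kw in TARGET_KEYWORDS for word in kw.split()):
        return "partial match"
    return "other"

print(classify_anchor("emergency plumber", "https://acmeplumbing.com/"))       # exact match
print(classify_anchor("check out Acme Plumbing", "https://acmeplumbing.com/")) # branded
```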
Tier two links are an often overlooked metric in competitive analysis. They can be pushing power for your competitors, and you want to know when that is the case so you are not just matching efforts on tier one and missing out because you overlooked this during the link planning phase.
Now that we have the backlink gap and anchor text analysis completed, let’s break out a calendar of link building. This really gives us an idea of budget and allows us to set both internal and external expectations for the length of the link building component of the campaign.
I like to look at the duration and volume perspectives of the link building based on our gathered data and make final decisions before compiling this all into a workbook so the link building team can get started.
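A simple way to turn those decisions into numbers is to spread the link gap over the campaign duration and apportion each month by your target anchor mix. The gap, duration, and mix below are placeholder inputs you would pull from your own analysis.

```python
import math

# Placeholder campaign inputs.
link_gap = 48                 # links needed to reach parity with top competitors
months = 6                    # campaign duration
anchor_mix = {"branded": 0.5, "naked URL": 0.2, "partial match": 0.2, "exact match": 0.1}

per_month = math.ceil(link_gap / months)
print(f"~{per_month} links per month for {months} months")

for anchor_type, share in anchor_mix.items():
    monthly = round(per_month * share)
    print(f"  {anchor_type}: ~{monthly}/month")
```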
Sorry for the format, but if you zoom in you can see the layout of the workbook. In the next piece of link building content I will show you our outreach and link building processes so you can go from research to ranking. I will also be looking at more link content to produce, to supplement some of the areas of this process I didn’t have the opportunity to cover, lest this turn into a proper book instead of an article.
Don’t Guess at Your Link Building
Link building is both an art and a science. The complexities of building the right links—at the right time and in the right context—can make or break your SEO efforts. Each specific situation requires a carefully tailored approach to ensure your links not only drive authority, but also maximize their impact on your rankings. That’s why your link building philosophy must be backed by data.
Are your competitors dominating the search rankings while you’re left wondering why? It’s time to stop guessing and start strategizing. Our link-building philosophy is rooted in tested, reliable methods designed to bridge the gap between you and your competition. But don’t just take our word for it—see the data for yourself.
Get a free link gap report today and discover exactly where your website stands. This is your opportunity to uncover missed opportunities, outpace competitors, and drive measurable results. Don’t let them have all the top spots; claim your edge now with actionable insights tailored to your goals.
Click below to start your free analysis and take the first step toward SEO success!