Lesson 1.5: How Search Algorithms Work

Article by: Matt Polsky

In this lesson, we start diving deeper into search algorithms and how Google ranks a piece of content. We focus on Google because its worldwide search market share is over 91%, and other search engines have similar algorithmic inputs.

What is a Search Algorithm?

An algorithm is a set of mathematical instructions used to reach a specific outcome. In terms of SEO, search engines use algorithms to order results and decide what features a search engine results page (SERP) should include.

It may be easiest to think of an algorithm as a recipe. You combine the ingredients you believe determine quality and adjust the recipe over time to perfect the dish.

Google has hundreds, if not thousands, of ingredients that go into deciding which pages should rank highest, and the recipe continues to evolve as search behavior changes.

[Image: SEO recipe]
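
To make the recipe metaphor concrete, here's a minimal sketch of how a ranking function might combine weighted signals into an ordered results page. Every signal, weight, and score below is invented for illustration - these are not Google's actual factors or weights.

```python
# A toy illustration of the "recipe" idea: each page gets a score from a
# weighted mix of signals, and results are ordered by that score.
# All signals, weights, and values here are made up for illustration.

# Hypothetical signal scores (0-1) for three pages on a given query
pages = {
    "site-a.com/article": {"relevance": 0.9, "links": 0.4, "page_speed": 0.7},
    "site-b.com/guide":   {"relevance": 0.7, "links": 0.9, "page_speed": 0.5},
    "site-c.com/post":    {"relevance": 0.5, "links": 0.2, "page_speed": 0.9},
}

# Hypothetical "recipe": how much each ingredient counts toward the final dish
weights = {"relevance": 0.6, "links": 0.3, "page_speed": 0.1}

def score(signals: dict) -> float:
    """Combine signal values into a single ranking score."""
    return sum(weights[name] * value for name, value in signals.items())

# Order the results page: highest combined score first
ranked = sorted(pages, key=lambda url: score(pages[url]), reverse=True)
for position, url in enumerate(ranked, start=1):
    print(position, url, round(score(pages[url]), 3))
```

Real systems learn weights like these from data rather than hard-coding them, which is part of why the recipe keeps changing.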

The Original Google Search Algorithm

Other search engines came before Google: Aliweb in 1993, WebCrawler and Lycos in 1994, and Yahoo! Search and AltaVista in 1995. Google didn't come on the scene until 1997. So what made Google different?


Google set itself apart with the quality of the results it surfaced. Its algorithm focused on external links (e.g., mysite.com linking to yoursite.com) and treated those links as votes. In the first iteration of the algorithm, sites with the most links tended to rank the highest.

This differentiator, known as PageRank, allowed the quality of Google's results to surpass that of the other search engines of the time.
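
To give a feel for how "links as votes" can be turned into a score, here's a minimal sketch of the PageRank idea on a made-up link graph. It's a simplification of the published PageRank formula, not Google's actual implementation.

```python
# A simplified sketch of the original PageRank idea: links act as votes,
# and a vote from a page that itself has many votes counts for more.
# The tiny link graph below is invented for illustration.

links = {
    "a.com": ["b.com", "c.com"],  # a.com links out to b.com and c.com
    "b.com": ["c.com"],
    "c.com": ["a.com"],
    "d.com": ["c.com"],
}

DAMPING = 0.85          # probability of following a link vs. jumping anywhere
pages = list(links)
rank = {p: 1 / len(pages) for p in pages}  # start with equal scores

for _ in range(50):     # iterate until the scores settle
    new_rank = {}
    for page in pages:
        # Sum the share of rank passed along by every page linking to this one
        incoming = sum(
            rank[src] / len(outgoing)
            for src, outgoing in links.items()
            if page in outgoing
        )
        new_rank[page] = (1 - DAMPING) / len(pages) + DAMPING * incoming
    rank = new_rank

for page, value in sorted(rank.items(), key=lambda kv: kv[1], reverse=True):
    print(page, round(value, 3))
```

The intuition: c.com earns the highest score because the most pages link to it, and a link from a well-linked page passes along more weight than a link from an obscure one.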

While links produced better search results and were harder to manipulate, they could still be gamed. As we'll see in later lessons, link quality became more important than quantity, and Google began manually penalizing sites that engaged in blatant manipulation.

The Need for Continual Updates

SEO as an industry began in 1997. The first known mention of "Search Engine Optimization" came from John Audette of Multimedia Marketing Group (MMG). Later that year, Danny Sullivan (now a Googler) created Search Engine Watch - a site dedicated to helping website owners rank their sites better.

With the birth of a new industry came both good and bad. On one hand, you had groups doing great marketing and generally following the rules. On the other, you had groups spamming the internet to no end to outrank competitors. The two camps became known as "white hat" and "black hat" SEO.

[Image: white hat vs. black hat SEO]

Google has never been a fan of anyone manipulating its search results. With an entire industry now dedicated to doing exactly that, it had to respond and update its algorithms more frequently.

Updating the Algorithm

To keep quality high and mitigate mass manipulation, Google and other search engines have to keep their algorithms up to date. They do this by adding new signals or refining old signals.

Updating the algorithm started as a very manual process: engineers tested specific changes and put them in front of a test group to determine whether the result was positive or negative.


The downside was that the test group was rarely representative of the broader population of searchers.

Additionally, in this era, which lasted roughly until 2015, engineers had to manually rerun each individual algorithm to re-sort results. With manual refreshes, sites affected by these algorithms had no chance to recover until the algorithm ran again - in some cases, more than a year later.

Imagine being a business that relies on organic traffic and losing it all for 13 or more months. Some businesses in this era closed their doors or started over on different domains.

[Image: Google Panda]

Today, the search quality system hinges more on machine learning (ML) and artificial intelligence (AI) than on subjective opinion - though Google still employs Quality Raters to review results and gauge the effectiveness of its algorithms.

The Current Algorithm

In reality, Google's algorithm isn't a single script. It's a collection of algorithms that interpret signals from web pages (e.g., keywords, links, page speed) to rank the content that best answers a search query. Google says it releases anywhere from three to five updates per day, though the majority are too small to notice.

However, Google typically releases a handful of highly prominent updates each year, which it refers to as "core updates." Highly prominent in this case means site owners are likely to see impacts and changes to search results.

Before core updates, Google released similarly sized updates, but they typically targeted only a narrow type of spam. The smaller targets made it easier for SEOs to determine which factors played into the algorithm and adjust accordingly.

With core updates, the changes typically cover a breadth of topics and are more challenging to decipher. Google provides advice for sites hit by core updates; it's relatively generic but worth reading.

You can find Google's advice for core updates here.

What Makes Up Google's Algorithm?

Google's algorithm changes constantly. In the mid-2000s, Google claimed there were over 200 factors. However, in the current ecosystem, where ML and AI help determine rankings, there are likely thousands of factors and subfactors.

In fact, Google has said a handful of times that it doesn't even know all the factors…

[Image: RankBrain]

Here's What We Do Know

Google wants to show quality content, so the factors change over time. Many traditional factors still exist, but they carry different weights - and those weights can vary.

To complicate things further, factors can change by page or search intent - meaning the same page may be weighted differently depending on the query.

Traditional Ranking Factors:

  • Crawlability (if Google can't crawl the site or see the content, it won't rank)
  • Keyword usage
  • External links

Newer Ranking Factors:

  • User satisfaction (RankBrain)
  • Query interpretation (BERT/MUM)
  • Freshness
  • Mobile-friendliness
  • Page speed
  • High-quality content (unique/not spun, concise, highly edited and insightful)
  • High-trust content (from or reviewed by an expert, well-sourced and verified)

*Pages involving health, finance, government, people groups, etc. are held to higher standards. Google refers to these queries as "Your Money or Your Life" (YMYL).


BERT and MUM

BERT stands for Bidirectional Encoder Representations from Transformers. It's a machine learning model for natural language processing (NLP) that helps search systems understand language more like humans do.

MUM is short for Multitask Unified Model and is built on transformer technology, similar to BERT. MUM is said to be 1,000 times more powerful than BERT and expands beyond text into image, audio and video content.

BERT and MUM add a contextual layer to Google's information retrieval (IR) systems that simply wasn't possible before. Take the following example:

[Image: example search results for a query about where Albert Pujols was born, comparing Site A and Site B]

Before BERT/MUM, a query like this would likely surface Site A - a Wikipedia entry or biography of Albert Pujols. Today, Google understands more of the context behind the words and would likely show an answer box with the city of his birth, as with Site B.
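
To show the kind of contextual matching a BERT-style model enables, here's a minimal sketch using the open-source bert-base-uncased model via the Hugging Face transformers library. This illustrates only the mechanics - raw BERT embeddings aren't tuned for retrieval, Google's production systems are far more involved, and the query and passages below are invented for the example.

```python
# Represent a query and candidate passages as context-aware vectors,
# then compare them - the core idea behind BERT-style language understanding.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(text: str) -> torch.Tensor:
    """Return one vector for the text by mean-pooling BERT's token outputs."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)

query = "what city was albert pujols born in"
passages = [
    "Albert Pujols is a professional baseball player with over 700 home runs.",
    "Albert Pujols was born in Santo Domingo, Dominican Republic.",
]

query_vec = embed(query)
for passage in passages:
    similarity = torch.cosine_similarity(query_vec, embed(passage), dim=0)
    print(round(similarity.item(), 3), passage)
```

The point isn't the exact numbers: it's that the query and each passage become context-aware vectors that can be compared directly, rather than being matched keyword by keyword.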

In our next lesson, we look back at some historical algorithm updates, how they impacted search as we know it, and how they shaped the tactics we use today.

Copyright 2025 MattPolsky.com