What Can We Learn About Google From The Yandex Leak

Tomek Rudzki

Published: February, 2023

Updated: January, 2026

Yandex’s source code was leaked, so we now know the most important ranking signals for this search engine. I won’t link these documents here, but a quick search on Google or social media should turn them up.

Even if you don’t care about Yandex (primarily used in Russia), this news is significant. It’s direct insight into the inner workings of a fully-fledged Google competitor.

Let’s see what we can learn from this leak about how to do better SEO. I’ll discuss some of the most exciting variables I found and how they can inform our thinking about search.

Yandex collects user information

The job of a search engine like Google, Yandex, or Bing is to answer a user’s query.

But to answer that query, it has to be understood. And the user’s specific intent must be inferred from everything the search engine knows about the user.

That’s why search engines collect as much user information as possible, such as previous searches, location, or device.

Yandex is no different, and we find evidence for it in the leaked data. For instance, Yandex collects the FI_REQUEST_IS_FROM_IOS variable, which checks if a given user is on an iOS device.

Yandex collects tons of website data

Yandex, just like Google and Bing, has an index of pages that can potentially answer their users’ needs. But to find pages best suited to help their users, they must analyze them thoroughly.

The leak surfaces tons of page- and domain-related variables used by Yandex as ranking signals.

Below are some examples which I found the most exciting or surprising.

Yandex checks if a page has any map service implemented (FI_PAGE_HAS_MAPS_API),
Yandex judges the quality of a given page using the overall quality of the host, meaning the whole website (FI_PAGE_QUALITY_HOST),
Yandex checks if there is no NSFW content, including text, images, and videos,
Yandex checks if a document contains user feedback/comments,
Yandex judges the page by the last modification date and the number of known duplicates,
Yandex pays attention to social posts from verified accounts that link to a given page.

There are over 18k various factors in the Yandex leak. I think they are worth studying, but it’s beyond the scope of this post. I only want to bring your attention to the vast range of analyses Yandex makes to categorize all pages in its index.

Yandex is using user behavior metrics

Like Bing, Yandex is using behavioral metrics to signal page quality.

Time spent on site matters:

FI_BROWSER_HOST_CNT_DWELL_TIME_LOG checks the average time spent by a user on a specific website – this data is segmented per localization and country,
FI_MORE_90_SEC_VISITS_SHARE checks the percentage of visits longer than 90 seconds,
FI_MORE_160_SEC_VISITS_SHARE checks the percentage of visits longer than 160 seconds.
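
To make these three factors concrete, here is a minimal Python sketch of how such metrics could be computed from raw visit durations. The leak gives us the factor names and descriptions only. The function name `dwell_metrics`, the input format, the sample numbers, and the choice of `log1p` for the log-scaled average are my assumptions, not Yandex's code.

```python
import math

def dwell_metrics(visit_seconds):
    """Summarize visit durations (in seconds) for one host.

    Hypothetical illustration: returns a log-scaled average dwell time
    and the share of visits longer than 90 and 160 seconds.
    """
    if not visit_seconds:
        return {
            "dwell_time_log": 0.0,
            "more_90_sec_share": 0.0,
            "more_160_sec_share": 0.0,
        }
    total = len(visit_seconds)
    average = sum(visit_seconds) / total
    return {
        # Assumption: "LOG" in the factor name means a log transform of the average.
        "dwell_time_log": math.log1p(average),
        "more_90_sec_share": sum(1 for s in visit_seconds if s > 90) / total,
        "more_160_sec_share": sum(1 for s in visit_seconds if s > 160) / total,
    }

# Made-up sample of eight visits to a single host.
visits = [12, 95, 200, 45, 170, 30, 91, 8]
metrics = dwell_metrics(visits)
print(metrics)
```

In this sample, half of the visits last longer than 90 seconds and a quarter last longer than 160 seconds. Two hosts with the same average dwell time can differ a lot on those shares, which may be why the thresholds exist as separate factors.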

Yandex also uses immediate popularity as a ranking factor. It measures the average number of visits within three hours.

They also consider how deeply the average user interacts with the website (average session depth).

This points to similarities between Yandex and Bing.

Let me quote Bing’s documentation:

“Bing also considers how users interact with search results. To determine user engagement, Bing asks questions like: Did users click through to search results for a given query, and if so, which results? Did users spend time on these search results they clicked through or quickly return to Bing? Did the user adjust or reformulate their query?”

Yandex is using algorithms similar to Google’s

The leak also shows several factors that directly or indirectly correspond to some of the mechanisms we know Google is using.

Both Google and Yandex use BERT.
Both Google and Yandex use sitewide quality signals instead of only page-level signals (such as FI_PAGE_QUALITY_HOST).
Both Google and Yandex use PageRank.
Yandex also has rules for specific websites. For instance, Yandex treats Wikipedia links differently. There is also a factor named FI_DSSM_SUNHOME_POPULARITY, which checks the probability that sunhome.ru is a popular host for a given query.
Both Google and Yandex have a notion of YMYL pages. Yandex has a specific algorithm to detect host quality for medical websites (FI_MEDICAL_HOST_QUALITY_METRIC). It also has neural models to detect content quality for financial and legal topics (FI_FIN_LAW_URL_QUALITY).
Both search engines can annotate different parts of the content (so they understand page layout). We know Google uses a centerpiece annotation mechanism to differentiate between main content, supplementary content, and ads.
Both Google and Yandex share some common basic ranking factors (such as mobile friendliness, which Yandex measures with the FI_IS_MOBILE_BEAUTY_HOST variable).

How to use the Yandex leak to be a better SEO

When you know the quality and relevance signals a search engine uses to surface the best content, it’s pretty easy to improve your rankings.

First, you check which ranking factors have the highest impact (they don’t all have equal weight in the ranking algorithm). Then you select the factors that are actionable for you and easy to improve on your end. Focus on improving these factors on your website and measure the impact.
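
The triage described above can be sketched in a few lines of Python. This is only an illustration of the process: the `Factor` class, the `prioritize` function, and every impact and effort number are placeholders I made up, not values from the leak. You would replace them with your own estimates.

```python
from dataclasses import dataclass

@dataclass
class Factor:
    name: str
    impact: float  # your estimate of ranking impact, 0..1 (placeholder values below)
    effort: float  # your estimate of cost to improve, 0..1, higher means harder

def prioritize(factors, max_effort=0.5):
    """Keep factors that are easy enough to act on, highest impact first."""
    actionable = [f for f in factors if f.effort <= max_effort]
    return sorted(actionable, key=lambda f: f.impact, reverse=True)

# Factor names come from the leak; the numbers are invented for the example.
candidates = [
    Factor("FI_PAGE_QUALITY_HOST", impact=0.8, effort=0.9),
    Factor("FI_IS_MOBILE_BEAUTY_HOST", impact=0.5, effort=0.3),
    Factor("FI_PAGE_HAS_MAPS_API", impact=0.2, effort=0.1),
]
shortlist = prioritize(candidates)
for f in shortlist:
    print(f.name, f.impact, f.effort)
```

With these placeholder numbers, the host-quality factor drops out because it is too hard to move quickly, and mobile friendliness lands at the top of the shortlist.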

I don’t expect Yandex will rewrite its codebase to prevent people from gaming its algorithm. So if you want to improve your Yandex rankings, it’s now easier than ever – technically speaking.

But when it comes to Google, things aren’t this easy.

If you compare the search results for the same queries between Google, Yandex, and Bing, you’ll quickly notice significant differences. This points to the fact that even if they use similar ranking signals, they weigh them differently or use them for different query types.
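
If you want to put a number on those differences, one simple option is the Jaccard overlap between the top results of two engines for the same query. The sketch below assumes you have already collected the ranked URLs yourself. The function name `top_n_overlap` and the example URLs are hypothetical.

```python
def top_n_overlap(results_a, results_b, n=10):
    """Jaccard overlap between the top-n URLs of two ranked result lists."""
    a, b = set(results_a[:n]), set(results_b[:n])
    if not a and not b:
        return 1.0  # two empty result sets are trivially identical
    return len(a & b) / len(a | b)

# Placeholder URLs standing in for real search results.
google = ["example.com/a", "example.com/b", "example.com/c", "example.com/d"]
yandex = ["example.com/b", "example.com/e", "example.com/a", "example.com/f"]
overlap = top_n_overlap(google, yandex, n=4)
print(overlap)
```

A value of 1.0 means both engines return the same set of URLs, and 0.0 means they share none. Note that this measure ignores ordering, so it understates differences when the same pages appear in different positions.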

But the Yandex leak is a tremendous opportunity to reverse-engineer how people running one of the most successful search engines in the world think. Study these documents to understand how a search engine sees your business and what you can do to improve your search visibility.

Lesson 1: Ranking signals, not ranking factors

There is a discussion among SEOs about what is a ranking factor and what is not.

We must change our thinking to reflect that we’re in the machine-learning era.

Let me discuss two examples: grammar errors and word count. Google officially denies that either is a ranking factor.

And that’s accurate: they aren’t ranking factors. But they possibly contribute to your SEO success.

Google published a research paper about detecting high-quality content. The sample was extraordinary – 500M documents. The algorithm described took into account features like word count and grammar correctness. Surprised?

Word count is not a ranking factor in the sense that documents with a higher word count will get a better position.

But it can obviously be used as a ranking signal. Depending on the query and user, the ranking algorithm may or may not use it as a factor in sorting the search results.
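
Here is a toy Python sketch of that distinction. It is not how Google or Yandex score pages: real systems use learned models, not hand-written weights. The intents, the weights, and the page values below are all made up to show one idea, namely that the same signal can matter for one query type and be ignored for another.

```python
# Hypothetical per-intent weights. All numbers are invented for illustration.
SIGNAL_WEIGHTS = {
    "informational": {"relevance": 0.7, "word_count": 0.3},
    "navigational": {"relevance": 1.0, "word_count": 0.0},
}

def score(page_signals, intent):
    """Weighted sum of normalized signals (each in 0..1) for a query intent."""
    weights = SIGNAL_WEIGHTS[intent]
    return sum(weights[name] * page_signals.get(name, 0.0) for name in weights)

# Two made-up pages: a long guide and a short, highly relevant page.
long_guide = {"relevance": 0.6, "word_count": 1.0}
short_page = {"relevance": 0.8, "word_count": 0.1}

for intent in SIGNAL_WEIGHTS:
    print(intent, score(long_guide, intent), score(short_page, intent))
```

In this toy setup the long guide wins for the informational query, while the short page wins for the navigational one, even though word count never acts as a standalone rule that longer pages rank higher.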

Lesson 2: Search is more complex than we think

We chase after specific, measurable ranking factors. And we keep looking for straightforward answers to simple questions like “Is word count a ranking factor?”

According to the leak, Yandex is using over 18,000 different ranking signals. Like Bing and Google, it’s a state-of-the-art search engine.

Do you expect Google or Bing to use just 200 ranking factors? And do you expect any single Google employee even to remember them all?

Chasing a handful of measurable metrics probably won’t make you successful.

Instead, think about how to be a good SEO. On the road to self-improvement, you cannot focus on just a single thing. Adopt the search engine’s perspective to understand the place your pages can and should occupy on the SERPs. Then, make those pages unforgettable so that you don’t just acquire traffic but also make it work towards your ultimate business goals.

Tomek Rudzki

Author

Tomek is a co-founder of ZipTie.dev and specializes in AI search optimization and SEO. He regularly shares his insights about AI search on our blog and wrote the ebook "AI Survival for SEO."
