The Index Coverage report in Google Search Console allows you to check the indexing statuses of your pages and shows you any indexing problems Google encountered on your website.
Regular monitoring of these statuses is extremely important. It allows you to quickly spot any issues that can keep your pages away from Google’s index and take action to resolve them.
But to take action, you need to understand what each status means.
The documentation on the Index Coverage report provided by Google is not always clear, and it doesn’t cover common scenarios or all the potential causes of a given status. That’s why I created this article to summarize the Index Coverage report statuses and explain what each status indicates.
Different statuses indicating the same issue
Before I jump into explaining the Index Coverage report statuses, I’d like to take a moment to discuss the names of the statuses.
Some statuses indicate the same problem and differ only in the way Google discovered the URL. More specifically, the word “Submitted” specifies that Google found your page inside your sitemap (a file, typically XML, listing all of the pages you want search engines to index).
For example, both statuses “Blocked by robots.txt” and “Submitted URL blocked by robots.txt” indicate that Google can’t access the page because it’s blocked by robots.txt (I’ll explain this status later). However, you explicitly asked Google to index your page by putting it inside your sitemap in the latter case.
If your page reports a status with the word “Submitted”, you have two options: resolve the issue or remove the URL from your sitemap. Otherwise, you’re asking Google to visit and index a page that is impossible to index. This situation can lead to wasting your crawl budget.
Crawl budget indicates how many pages on your website Google can and wants to crawl. If you own a large website, you need to ensure Google spends your crawl budget on valuable content. By monitoring the statuses containing the word “Submitted”, you can make sure you don’t waste your crawl budget on pages that shouldn’t be indexed.
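To make the “Submitted URL blocked by robots.txt” case more concrete, here is a minimal sketch of how you could compare a sitemap against robots.txt rules yourself. The function names, the example.com URLs, and the sample rules are my own assumptions for illustration. Python’s built-in robots.txt parser also doesn’t replicate every detail of how Googlebot interprets rules, so treat the result as a hint rather than a verdict.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> value found in an XML sitemap string."""
    root = ET.fromstring(sitemap_xml.strip())
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def blocked_sitemap_urls(
    sitemap_xml: str, robots_txt: str, user_agent: str = "Googlebot"
) -> list[str]:
    """List sitemap URLs that the given robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        url
        for url in sitemap_urls(sitemap_xml)
        if not parser.can_fetch(user_agent, url)
    ]


# Hypothetical sample data, not taken from a real website.
SAMPLE_SITEMAP = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/cart/checkout</loc></url>
</urlset>
"""
SAMPLE_ROBOTS = "User-agent: *\nDisallow: /cart/\n"

if __name__ == "__main__":
    for url in blocked_sitemap_urls(SAMPLE_SITEMAP, SAMPLE_ROBOTS):
        print("Submitted but blocked:", url)
```

Every URL this check returns is a candidate for one of the two options described above: unblock it or remove it from the sitemap.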
Now, let’s talk about the Index Coverage report statuses in detail.
The two most common indexing issues
In 2021, I researched the most commonly occurring indexing problems. It turned out that these two statuses were particularly common:
- Crawled – currently not indexed,
- Discovered – currently not indexed.
Both of the statuses indicate that your page is not indexed. The difference lies in whether Google visited your page already or not. The table below sums up the differences between these statuses:
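| Status | Did Google discover the page? | Did Google crawl the page? | Is the page indexed? |
|---|---|---|---|
| Crawled – currently not indexed | Yes | Yes | No |
| Discovered – currently not indexed | Yes | No | No |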
Now, let’s take a closer look at each of these statuses.
Crawled – currently not indexed
The Crawled – currently not indexed status indicates that Google found and crawled your page, but it decided not to index it.
There might be many reasons for this status to appear. For example, it might be just an indexing delay, and Google will index the page soon, but it also might indicate a problem with your page or the whole website.
In my guide, How to Fix Crawled – Currently Not Indexed, I listed potential causes of this status and ways of fixing them. In short, the primary reasons for the Crawled – currently not indexed status include:
- A page doesn’t meet the quality standards – if your page is of low quality, Google will most likely ignore it. To fix this issue, improve the quality of your content. You can look inside Google’s Search Quality Rater Guidelines to gain insight into what the search engine looks at when assessing the quality of pages.
- A page got deindexed – if the page was indexed in the past but got deindexed, it will report the Crawled – currently not indexed status. There might be many reasons why it dropped out of the index. It most likely was replaced by a higher-quality page, and to resolve the issue, you need to improve its quality.
- A poor website structure overall – Google uses the internal linking structure to assess the importance of your pages. If there aren’t enough internal links pointing to a given page, Google might decide the page is not important enough to index it.
Discovered – currently not indexed
The Discovered – currently not indexed status means that Google found your page, but it hasn’t crawled or indexed it.
There are a lot of reasons that can cause this status. Gosia Poddębniak explained all of them in her article How To Fix “Discovered – Currently Not Indexed” in Google Search Console, along with the solutions to each problem. The primary factors causing the status include:
- Crawl budget issue – if you own a large website, but your crawl budget doesn’t allow for crawling all of the pages, Google might discover your page, but it won’t crawl it. If you want to learn more about optimizing your crawl budget, visit the Ultimate Guide to Crawl Budget Optimization.
- Poor internal linking – like I already mentioned, the internal linking structure helps Google determine the importance of a page. If no internal links point to your page, Google will most likely ignore it to save its resources and focus on more important ones.
- Google discovered patterns in your URLs – Google analyzes URL patterns while discovering pages. If a pattern suggests, e.g., duplicate content, Google might decide not to crawl the page at all.
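Since internal linking appears as a cause of both statuses, it can help to count how many internal links point to each page. Below is a minimal sketch, assuming you already have a crawl export in the form of a dictionary mapping each page to the pages it links to. The function names, the sample URLs, and the threshold of one inlink are assumptions made for illustration, not anything Google publishes.

```python
from collections import Counter


def inlink_counts(link_graph: dict[str, list[str]]) -> Counter:
    """Count how many distinct pages link to each page."""
    # Start every known page at zero so unlinked pages still show up.
    counts = Counter({page: 0 for page in link_graph})
    for source, targets in link_graph.items():
        for target in set(targets):
            if target != source:  # ignore self-links
                counts[target] += 1
    return counts


def weakly_linked(link_graph: dict[str, list[str]], min_inlinks: int = 1) -> list[str]:
    """Return pages that receive fewer than `min_inlinks` internal links."""
    counts = inlink_counts(link_graph)
    return sorted(page for page, n in counts.items() if n < min_inlinks)


# Hypothetical crawl export: page -> pages it links to.
SAMPLE_GRAPH = {
    "/": ["/blog", "/products"],
    "/blog": ["/", "/blog/post-1"],
    "/products": ["/"],
    "/blog/post-1": ["/blog"],
    "/old-landing-page": ["/"],
}

if __name__ == "__main__":
    print(weakly_linked(SAMPLE_GRAPH))
```

In this sample, only `/old-landing-page` has no internal links pointing to it, so it is the one page worth reviewing first.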
Statuses related to duplicate content
Google wants to index pages with distinct information to provide its users with the best possible results and not waste its resources on duplicate content. That’s why when it detects that two pages are identical, it chooses only one of them to index.
In the Index Coverage report, there are two main statuses related to duplicate content:
- Duplicate without user-selected canonical, and
- Duplicate, Google chose different canonical than user.
The above statuses indicate that a page is not indexed because Google chose a different version to index. The only difference is whether you tried to tell Google which is your preferred version using a canonical tag (an HTML tag indicating which is your preferred version if more than one exists), or you left no hints, and Google detected duplicate content and chose the canonical version on its own.
| Status | Did you declare a canonical tag? | Why isn’t the inspected page indexed? |
|---|---|---|
| Duplicate without user-selected canonical | No | Google decided on its own that Page A and B are duplicated and chose Page B as the canonical one. |
| Duplicate, Google chose different canonical than user | Yes – Page B has a canonical tag pointing to Page A. | Your canonical tag wasn’t obeyed. Google decided that Page B is the canonical one. |
Duplicate, Google chose different canonical than user
A canonical tag is only a hint, and Google is not obligated to respect it.
The Duplicate, Google chose different canonical than user status indicates that Google disagreed with your canonical tag and chose a different version of the page to index instead.
If the page’s versions are identical, Google choosing a different version than you might not bring any consequences to your business. However, if there was a meaningful difference between the versions and Google picked the wrong one, it might decrease your organic traffic by displaying the wrong version to the users.
One of the main reasons for the status is inconsistent signaling, e.g., you added the canonical tag to one version of a page, but internal links and sitemap indicate that a different version is the canonical one. In that case, Google needs to guess which is the real canonical page and might ignore your canonical tag. That’s why it’s crucial to ensure you are consistent and all of the signals point to one version.
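A rough way to spot inconsistent signaling is to compare the canonical tag on a page with the URLs you list in your sitemap and use in internal links. The sketch below does this under a few assumptions: canonical URLs are absolute, you already have the sitemap URLs and internal link targets as sets, and the function names and messages are hypothetical. It doesn’t reflect how Google weighs these signals internally.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalParser(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def declared_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, if any."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def conflicting_signals(
    page_url: str, html: str, sitemap_urls: set[str], linked_urls: set[str]
) -> list[str]:
    """Describe signals that disagree with the page's canonical tag."""
    canonical = declared_canonical(html)
    if canonical is None:
        return ["no canonical tag found"]
    problems = []
    if page_url != canonical:
        if page_url in sitemap_urls:
            problems.append("sitemap lists this URL, but its canonical tag points elsewhere")
        if page_url in linked_urls and canonical not in linked_urls:
            problems.append("internal links point to this URL, not to its declared canonical")
    return problems


# Hypothetical page: a filtered URL canonicalized to the clean one.
SAMPLE_URL = "https://example.com/shoes?color=red"
SAMPLE_HTML = '<html><head><link rel="canonical" href="https://example.com/shoes"></head></html>'

if __name__ == "__main__":
    for problem in conflicting_signals(SAMPLE_URL, SAMPLE_HTML, {SAMPLE_URL}, {SAMPLE_URL}):
        print(problem)
```

An empty result means the three signals checked here agree. It doesn’t guarantee that Google will respect the tag, since the canonical tag remains only a hint.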
You can find more possible causes and solutions for Duplicate, Google chose different canonical than user in my colleague’s article How To Fix “Duplicate, Google chose different canonical than user” in Google Search Console.
Duplicate without user-selected canonical
Duplicate without user-selected canonical indicates that your page is not indexed because Google thinks it’s duplicate content, and you didn’t include a canonical tag.
I treat this report as an opportunity to see what type of pages Google detects as duplicate content. It allows you to take action and consolidate the content by, e.g., redirecting all versions to one page or using a canonical tag.
In the Index Coverage report, there’s also a status called Duplicate, submitted URL not selected as canonical. It indicates the same issue as Duplicate without user-selected canonical, but as I mentioned at the beginning, the page reporting Duplicate, submitted URL not selected as canonical, was found inside a sitemap.
Statuses indicating that you don’t want your page to be indexed
The statuses in this chapter indicate that a page is not indexed because you explicitly told Google not to index it, and the search engine respected your wish.
Excluded by ‘noindex’ tag
The noindex tag is a very powerful tool in the hands of a website owner. It’s an HTML snippet used to tell search engines that the URL shouldn’t be indexed.
Noindex is a directive, so Google has to obey it. If Google discovers the tag, it won’t index the page, and it will mark it as Excluded by ‘noindex’ tag (or Submitted URL marked ‘noindex’ if you included this page in your sitemap).
In a situation when the page reporting Excluded by ‘noindex’ tag really shouldn’t be indexed, you don’t need to take any action. However, sometimes a developer or an SEO wrongly adds the noindex tag to important URLs, resulting in those pages dropping out from Google’s index. I recommend you carefully review a list of all the pages reporting Excluded by ‘noindex’ tag to ensure no valuable page was mistakenly marked as noindex.
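If you want to review many pages at once, a small script can flag the ones carrying a noindex directive. Here is a minimal sketch, assuming you already have each page’s HTML and, optionally, its response headers. Besides the HTML tag described above, it also looks at the X-Robots-Tag HTTP header, which can carry the same directive. The names are hypothetical and the parsing is simplified: it ignores, for example, user-agent prefixes in the header.

```python
from html.parser import HTMLParser
from typing import Mapping, Optional


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = (attr.get("content") or "").lower()
            self.directives.update(d.strip() for d in content.split(",") if d.strip())


def has_noindex(html: str, headers: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if the HTML or the X-Robots-Tag header contains noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    header_value = ""
    for key, value in (headers or {}).items():
        if key.lower() == "x-robots-tag":
            header_value = value.lower()
    header_directives = {d.strip() for d in header_value.split(",")}
    return "noindex" in parser.directives or "noindex" in header_directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(page))
```

Running a check like this against the URLs you consider valuable gives you a short list to compare with the pages reporting Excluded by ‘noindex’ tag.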
Alternate page with proper canonical tag
The Alternate page with proper canonical tag status indicates that Google didn’t index this page because it respected your canonical tag.
What does this mean for you?
In most cases, you don’t need to take any action. It’s mostly information that everything works correctly regarding this URL. However, there are two cases when you shouldn’t skip this report:
- Sometimes you implement canonical tags by mistake, e.g., all pages are canonicalized into one. It’s worth monitoring this report and reviewing the tags to ensure that there is no mistake.
- The canonical tag is a hint and may not be obeyed by Google. If few or no pages appear in this report, that might indicate that your canonical tags are not being obeyed.
Statuses indicating that Google can’t crawl your page
Before Google can index your page, it needs to be allowed to crawl it to see its content.
Crawling can be blocked for many reasons. You might do it on purpose to keep search engines away from specific content and save your crawl budget, but it also might be an effect of a mistake or malfunction.
In the table below, you can find the Index Coverage report statuses indicating an issue that results in crawling being blocked.
Google wants to ensure that the pages it shows to its users are of high quality. To fulfill this mission, it uses advanced algorithms such as a soft 404 detector.
Google marks pages as soft 404s when it detects the page doesn’t exist, even if the HTTP status code doesn’t indicate it.
If a page is detected as soft 404, it won’t get into Google’s index. In such cases, Google will mark it either as:
- Soft 404, or
- Submitted URL seems to be a Soft 404.
However, like every mechanism, a soft 404 detector is prone to false positives, meaning your pages may be wrongly classified as soft 404 and eventually deindexed. Common reasons for this situation include:
- Irrelevant redirects – if Google detects that a page is redirected to an irrelevant one, it won’t follow the redirect and will treat the page as a soft 404. That’s why you should always make sure to redirect to a relevant page.
- Many 404-like words on a page – an example can be an eCommerce page including phrases like “product unavailable.” In this case, Google might wrongly assume that the page doesn’t exist at all. To resolve this issue, you can delete the 404-like words or replace them with different ones.
- No content or rendering issue – if there’s little or no content on a page (e.g., empty product listings), or Google can’t render the content, it might assume that the page doesn’t exist. That’s why you should always ensure your pages have unique content (or add a noindex tag to the empty pages). Additionally, you can use the URL Inspection tool to see how your page renders and notify your developers in case of any problems.
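Two of the reasons above, 404-like wording and thin content, can be roughly pre-screened before Google ever sees the page. The sketch below is only a heuristic of my own: Google’s soft 404 detector isn’t public, so the phrase list, the 50-word threshold, and the function names are all assumptions. It also works on raw HTML, so it won’t catch problems that only appear after rendering.

```python
import re
from html.parser import HTMLParser

# Assumed examples of 404-like wording; adjust to your own templates.
NOT_FOUND_PHRASES = (
    "page not found",
    "product unavailable",
    "no longer available",
    "no results found",
)
MIN_WORDS = 50  # assumed threshold for "little or no content"


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> content."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html: str) -> str:
    """Return the page text with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()


def soft_404_risks(status_code: int, html: str) -> list[str]:
    """List reasons why a 200 page might be mistaken for a soft 404."""
    if status_code != 200:
        return []  # a real error code is not a *soft* 404
    text = visible_text(html).lower()
    risks = []
    found = [phrase for phrase in NOT_FOUND_PHRASES if phrase in text]
    if found:
        risks.append("404-like wording: " + ", ".join(found))
    if len(text.split()) < MIN_WORDS:
        risks.append("very little visible content")
    return risks


if __name__ == "__main__":
    sample = "<html><body><h1>Blue sneakers</h1><p>Product unavailable</p></body></html>"
    print(soft_404_risks(200, sample))
```

Pages flagged this way still need a manual look, ideally together with the rendered view in the URL Inspection tool.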
If you’re interested in learning more about soft 404s, I recommend you check out Karolina Broniszewska’s article on soft 404s in SEO, where she covered the topic in detail.
Understanding the Index Coverage report is essential to give your pages the best chances of being indexed.
Unfortunately, Google’s documentation doesn’t provide you with all of the pieces of information necessary to diagnose and resolve problems on your website. It’s not always clear whether a status requires your immediate attention or is just information that everything is going well.
I hope my article helped you understand how to look at the Index Coverage report and made analyzing the statuses a little easier. Remember that regular monitoring of the report is the key to spotting technical SEO issues quickly and preventing your business from losing organic traffic.