One of the steps of technical SEO is to find and fix soft 404 errors. Soft 404 errors can be very confusing because, in many cases, it’s not clear what the issue is, making troubleshooting and fixing a cumbersome process.
In this guide, you’ll learn everything you need to know about soft 404 errors, including what they are, how to find them, and how to fix them.
- What Is a Soft 404 Error?
- Not found (404) Vs. Soft 404 Errors
- Do 404 Errors Affect Your SEO?
- How To Find Soft 404 Errors?
- How To Fix Soft 404 Errors
What Is a Soft 404 Error?
A soft 404 error occurs when a page requested by the user cannot be found or is invalid, and the server, instead of returning the correct HTTP error code (404 Not Found or 410 Gone), returns an HTTP status code 200 OK (success).
In simple words, this means that while a page is invalid, instead of giving search engines the correct error code so that they ignore it, your server returns a 200 OK code, which tells them that the page is valid.
As a result, search engines keep crawling these pages and may list them in the search results.
The most common causes of soft 404 errors are:
- You have pages with no or little content. This makes Google think the page should return a 404/410 code, not a 200 OK code. An example of this is empty tag pages that display no content.
- There is a temporary issue with crawling. When Google tries to crawl the page, some of the page resources (CSS, JS) cannot be loaded, and as a result, the page comes with no content, making Google think it should be a 404.
- Google falsely marks a page as ‘seems to be a 404’ while nothing is wrong with the page.
It is important to fix soft 404 errors because:
It’s a bad practice - A page should return the correct HTTP status code. In the case of missing, invalid or non-existent pages, it should return either a 404/410 (not found) or a 301 (moved), not a 200 (success) code.
It’s a bad user experience - You don’t want users to click on a link from search engine results and land on a page on your website with little or no content and value to the user.
Your crawl budget gets wasted - Search engines, instead of spending time crawling your important pages, spend time crawling and indexing soft 404 pages.
What is the role of HTTP Status Codes?
If you are unsure what HTTP status codes are and what role they play, all you have to know is that they help crawlers understand whether their request to fetch a page was a success, a failure or something else.
Every time a webpage is accessed by a search engine crawler, the first thing that they check is the HTTP status code. The HTTP response code is a 3-digit number that tells search engines if the page is valid (code 200), not found (404/410) or moved (301).
The status code is sent in the HTTP response headers, so it is visible to crawlers and browsers but not shown to users on the page itself.
For a complete list of all HTTP status codes, read this guide.
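To make the role of these codes more concrete, here is a minimal Python sketch that turns the status codes discussed above into a short description, the way a crawler might read them. The function name `describe_status` and the wording of the descriptions are my own assumptions for illustration; only the codes and their standard phrases come from the HTTP specification.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short, crawler-oriented description of an HTTP status code."""
    status = HTTPStatus(code)  # raises ValueError for codes Python does not know
    if 200 <= code < 300:
        meaning = "page is valid"
    elif 300 <= code < 400:
        meaning = "page has moved"
    elif code in (404, 410):
        meaning = "page does not exist"
    else:
        meaning = "request failed"
    return f"{code} {status.phrase}: {meaning}"


for code in (200, 301, 404, 410):
    print(describe_status(code))
```

Note that 410 carries the standard phrase "Gone", which is a stronger signal than 404 that the page was removed on purpose.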
Not found (404) Vs. Soft 404 Errors
The difference between Not Found (404) and soft 404 errors is that in the case of Not Found (404) the page is not found and the returned HTTP status code is a 404 or 410 (which correctly corresponds to not found).
In the case of a soft 404 error, the page is not found, but instead of returning the HTTP status code 404, the page returns the 200-success code, and this is misleading.
In simple words, for both cases, the page response code should have been 404 but that’s not the case with soft 404 errors.
Another major difference is that 404 pages are not indexed by search engines and won’t appear in the search results, whereas soft 404 pages return a success code, so search engines may keep crawling them and they may appear in the search results.
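The distinction can be sketched in a few lines of Python. This is only a rough heuristic, not how Google actually detects soft 404s (that logic is not public). The phrase list, the 50-word threshold and the function name `classify_page` are all assumptions chosen for illustration.

```python
# Assumed phrases that often appear on "not found" pages.
NOT_FOUND_PHRASES = ("page not found", "no longer available", "nothing found")
# Assumed threshold below which a page counts as "little or no content".
MIN_WORDS = 50


def classify_page(status_code: int, body_text: str) -> str:
    """Classify a response as a real 404, a possible soft 404, or a valid page."""
    if status_code in (404, 410):
        # The server reports the missing page correctly.
        return "not found (404)"
    if status_code == 200:
        text = body_text.lower()
        looks_missing = any(phrase in text for phrase in NOT_FOUND_PHRASES)
        is_thin = len(text.split()) < MIN_WORDS
        if looks_missing or is_thin:
            # Success code, but the content suggests the page is missing or empty.
            return "possible soft 404"
        return "ok"
    return "other"
```

For example, `classify_page(200, "Sorry, page not found")` returns "possible soft 404", while the same text served with a 404 status is classified as a regular not found page.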
Do 404 Errors Affect Your SEO?
It depends on the case.
There are valid cases where 404 pages are normal and expected. For example, when a product is no longer available you can display a 404 page to the user to let them know that the particular product is permanently gone.
Another example is when you want to completely remove a page from the search results. By returning a 404 status code, you tell search engines that the page no longer exists.
When it comes to soft 404 errors, it’s trickier because there are cases where the page is valid but Google thinks it’s not, and cases where the page is not valid but your server returns 200 OK.
In those cases, it’s better to investigate why the errors occur and fix them.
As a general rule of thumb, you should avoid having 404 errors on your site to optimize your crawl budget, avoid confusing search engines and offer users a good experience.
How To Find Soft 404 Errors?
The most reliable way to find 404 errors is through the Google Search Console and, in particular, the Page Indexing Report and the URL Inspection tool.
If you haven’t done so already, the first step is to register your website with Google. This will give you access to several features to improve your SEO.
Page Indexing Report
- Log in to Google Search Console
- Next, click on Pages under Indexing to view the PAGE INDEXING REPORT.
- Scroll down to "Why pages aren't indexed" and look for Soft 404 and Not Found (404) errors.
Click on the error description to get more details about the affected pages.
URL Inspection Tool
Another way to find the HTTP status response code is to use the URL Inspection tool.
Enter a URL into the URL Inspection tool and press Enter.
Click on VIEW CRAWLED PAGE and then MORE INFO.
The HTTP response code for the page is listed under MORE INFO.
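If you want to double-check a status code outside of Google Search Console, a short script can do it. The sketch below uses Python's standard `urllib`. The function name `fetch_status` and its `opener` parameter are my own additions for illustration (the `opener` hook only exists so the function can be tried without network access), and the example URL is a placeholder.

```python
import urllib.error
import urllib.request


def fetch_status(url, opener=urllib.request.urlopen, timeout=10.0):
    """Return the HTTP status code a server sends for `url`.

    `opener` is a hypothetical hook (not a urllib feature) so the function
    can be exercised without network access; by default it is urlopen.
    """
    try:
        with opener(url, timeout=timeout) as response:
            # urlopen follows redirects, so this is the status of the final URL.
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises HTTPError for 4xx and 5xx responses;
        # the status code is available on the exception.
        return error.code


# Example (requires network access):
# print(fetch_status("https://example.com/some-deleted-page"))
```

A deleted page that comes back as 200 here is a candidate soft 404. Keep in mind that this shows what your server returns to the script, which may differ from what Googlebot saw when it last crawled the page.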
How To Fix Soft 404 Errors
To get rid of soft 404 errors, you can use one of the five solutions below:
- Check that the page is indeed a soft 404 or a false alarm
- Configure your server to return the proper not found error code (404/410)
- Improve the page and request indexing
- Redirect the page using a 301 redirection
- Keep the page on your site but de-index it from search engines
1. Check that the page is indeed a soft 404 or a false alarm
Sometimes, Google Search Console may incorrectly mark a page as a soft 404, so the first step is to check if that’s the case. Move the mouse over a URL and click the “Open in a new tab” button.
If it’s a valid page of your site and you want this to appear in the search results, click the VALIDATE FIX button.
This action will ask Google to recrawl the page and update its status in the report. This process may take a few days, and you’ll be notified of the result by email.
Inspect the page and Test the Live URL - Another method is to move the mouse over a URL and select INSPECT URL. This will give you more information about the page and the option to REQUEST INDEXING.
Before you do that, you can click the TEST LIVE URL button to force Google to refresh the report.
In many cases, you may find that the page is ok and no further action is required.
2. Configure your server to return the proper not found error code (404/410)
In the case that a page is indeed not available or not valid, you should configure your website to return the correct HTTP response code (either a 404 or 410) and resubmit the page to Google using the REQUEST INDEXING button of the URL Inspection tool.
The easiest way to configure your site to return a 404 code for invalid pages is to delete the pages. Once a page is deleted, your HTTP server will return the 404 page whenever the requested URL is not found.
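To illustrate what "returning the proper code" means at the server level, here is a minimal sketch using Python's built-in `http.server`. It is not how a production site or CMS is configured. The `PAGES` dictionary, the HTML snippets and the `resolve` helper are hypothetical stand-ins for your real content.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: path -> HTML.
PAGES = {"/": "<h1>Home</h1>", "/about": "<h1>About</h1>"}
# Hypothetical custom "not found" page with a link back to the homepage.
NOT_FOUND_HTML = '<h1>Page not found</h1><p><a href="/">Back to the homepage</a></p>'


def resolve(path: str) -> tuple[int, str]:
    """Return (status code, HTML) for a path: 200 if it exists, 404 otherwise."""
    if path in PAGES:
        return 200, PAGES[path]
    # Send the friendly page together with a real 404 status, never with 200.
    return 404, NOT_FOUND_HTML


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, html = resolve(self.path)
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To try it locally:
# HTTPServer(("127.0.0.1", 8000), Handler).serve_forever()
```

The key point is in `resolve`: the visitor still sees a helpful page, but the status code that accompanies it is 404, which is what keeps it from becoming a soft 404.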
Pro Tip: It is important to have a custom 404 page that gives users options on what they can do next. The 404 page can contain links to your homepage and to your most popular pages and can even contain a search box to help users find what they are looking for.
This is what my custom 404 page looks like:
3. Improve the page and request indexing
In the case that a page is available but Google insists on considering the page as a soft 404, you can improve the page’s content and resubmit the page to Google either through the REQUEST INDEXING button or through VALIDATE FIX.
Usually the above happens when a page has very little content and Google wants to remove the page from its index.
By adding more content, you show search engines that the page has value and the soft 404 error will go away.
How to force Google to re-index a page – How to use the 'Request Indexing' feature to force Google to recrawl and re-index a page.
4. Redirect the page using a 301 redirection
Another way to fix a soft 404 error is to redirect the affected page to a valid page. This is done by adding a 301 redirect, for example in your .htaccess file if your site runs on an Apache server, which tells search engines that the page has moved to a new location.
This is how it looks:
Redirect 301 /soft-404-page https://example.com/new-page-URL
Another way to add a 301 redirection is to use the Yoast SEO plugin.
When adding a 301 redirection you need to make sure that both pages have similar content.
Best SEO practices entail using 301 redirections only between pages that have similar content. If no page with similar content exists, it is better to delete the page and let it return a 404.
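If your site is not served by Apache, the same idea can be expressed in application code. The sketch below is a simplified illustration in Python: a lookup table maps old paths to new URLs, and anything without a suitable target falls through to a 404. The `REDIRECTS` table reuses the example URLs from above, and the helper name `redirect_for` is an assumption.

```python
# Old path -> new URL, reusing the example from the .htaccess rule above.
REDIRECTS = {"/soft-404-page": "https://example.com/new-page-URL"}


def redirect_for(path: str) -> tuple[int, dict]:
    """Return the status code and headers a server should send for an old path."""
    target = REDIRECTS.get(path)
    if target is None:
        # No page with similar content exists: answer with a real 404.
        return 404, {}
    # 301 tells search engines the page has moved permanently to `target`.
    return 301, {"Location": target}
```

Calling `redirect_for("/soft-404-page")` gives a 301 with a `Location` header pointing to the new page, while any unlisted path gets a 404 instead of being redirected somewhere unrelated.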
5. Keep the page on your site but de-index it from search engines
Another option is to keep the page on your site but add a noindex directive in the header to instruct search engines NOT to index the particular page.
You can do this using Yoast SEO or by manually adding <meta name="robots" content="noindex,follow"/> to the <head> section of the page.
When you do this, Google will no longer show the page under the ERROR report, but you can still see it in the EXCLUDED report under the SOFT 404 section.
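After adding the tag, it is worth verifying that it actually appears in the page's HTML. Here is a small sketch using Python's standard `html.parser` that checks a page's markup for a robots meta tag containing noindex. The class and function names are my own, and the check only covers the meta tag, not a noindex sent through an HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma separated, e.g. "noindex,follow".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

For instance, passing the tag shown above to `has_noindex` returns True, while a page with `content="index,follow"` returns False.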
How to fix crawl errors in Google Search Console – A detailed guide covering all possible errors you can see in Google search console and how to fix them.
Key Learnings
Dealing with 404 errors and especially soft 404 errors is an advanced SEO technique. I know from experience that not all users can understand HTTP status codes and what they mean.
To simplify things, all you have to know is how to use the Google search console to check for 404 errors and learn how to fix them.
The two most common methods are by using a 301 redirection or by adding the noindex directive to a page header to remove a page from the search engine index.
For best SEO performance it is always recommended not to have soft 404 errors so you should make every effort to find them and fix them.