One of the steps of technical SEO is to find and fix soft 404 errors. Soft 404 errors can be very confusing because, in many cases, the issue is unclear, making troubleshooting and fixing them cumbersome.
In this guide, you’ll learn how to find and fix soft 404 errors.
What Is a Soft 404 Error?
A soft 404 error occurs when a requested page cannot be found or is invalid, but the server returns HTTP status code 200 OK (success) instead of the correct error code (404 Not Found or 410 Gone).
In simple words, this means that while a page is invalid, instead of giving search engines the correct error code so that they ignore it, your server returns a 200 OK code, which tells them that the page is valid.
As a result, search engines keep crawling these pages and list them in the search results.
What Causes Soft 404 Errors?
- You have pages with little or no content. This makes Google think the page should return a 404/410 code rather than 200 OK. An example is an empty tag page that displays no content.
- There is a temporary issue with crawling. When Google tries to crawl the page, some of the page resources (CSS, JS) cannot be loaded, so the page renders with no content, making Google think it should be a 404.
- Google falsely marks a page as ‘seems to be a 404’ while nothing is wrong with the page.
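The causes above come down to a mismatch between the 200 OK status and what the page actually shows. Here is a rough Python sketch of that idea. The function name, the phrase list and the 50-word threshold are assumptions made for illustration; they are not Google's actual criteria.

```python
import re

# Hypothetical phrases that often appear on "not found" templates.
NOT_FOUND_PHRASES = ("page not found", "no longer available", "nothing found")
MIN_WORDS = 50  # arbitrary cut-off for "thin" content, chosen for illustration


def looks_like_soft_404(status_code: int, html: str) -> bool:
    """Return True when a 200 response looks like an error page or an empty page."""
    if status_code != 200:
        return False  # a real 404/410 or a redirect is not a *soft* 404
    # Drop script/style blocks, then all tags, to approximate the visible text.
    text = re.sub(r"(?is)<(script|style).*?</\1>", " ", html)
    text = re.sub(r"(?s)<[^>]+>", " ", text).lower()
    words = text.split()
    visible_text = " ".join(words)
    has_error_phrase = any(phrase in visible_text for phrase in NOT_FOUND_PHRASES)
    return has_error_phrase or len(words) < MIN_WORDS
```

A check like this only flags candidates for review. A short page can be perfectly valid, so treat the result as a hint, not a verdict.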
Why Should You Fix Soft 404 Errors?
It’s bad practice - A page should return the correct HTTP status code. A missing, invalid or non-existent page should return either a 404/410 (not found or gone) or a 301 (moved permanently), not a 200 (success).
It’s a bad user experience - You don’t want users to click on a link from search engine results and land on a page on your website with little or no content and value to the user.
Your crawl budget gets wasted - Search engines spend time crawling and indexing soft 404 pages instead of crawling your important pages.
What Is the Role of HTTP Status Codes?
If you are confused about HTTP status codes and their role, all you have to know is that they help crawlers understand whether their request to fetch a page was a success, failure, or something else.
Whenever a search engine crawler accesses a webpage, it first checks the HTTP status code. The HTTP response code is a 3-digit number that tells search engines if the page is valid (code 200), not found (404/410) or moved (301).
The status code is sent at the start of the HTTP response, ahead of the page content. Browsers don’t display it on the page, so users normally never see it, but crawlers read it on every request.
For a complete list of all HTTP status codes, read this guide.
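To make the codes mentioned above concrete, here is a small Python sketch that uses the standard library's HTTPStatus enum. The function name and the wording of each interpretation are assumptions for illustration, and only the four codes discussed in this guide are covered.

```python
from http import HTTPStatus


def crawler_reading(status_code: int) -> str:
    """Summarize, in simplified form, how a crawler would read a status code."""
    status = HTTPStatus(status_code)  # raises ValueError for unknown codes
    if status == HTTPStatus.OK:
        meaning = "valid page, eligible for indexing"
    elif status in (HTTPStatus.NOT_FOUND, HTTPStatus.GONE):
        meaning = "page does not exist, drop it"
    elif status == HTTPStatus.MOVED_PERMANENTLY:
        meaning = "page moved, follow the Location header"
    else:
        meaning = "not covered in this sketch"
    return f"{status.value} {status.phrase}: {meaning}"
```

For example, crawler_reading(410) starts with "410 Gone", a reminder that 410 is the stricter "gone for good" variant of a 404.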
Not Found (404) vs. Soft 404 Errors
The difference between Not Found (404) and soft 404 errors is that in the case of Not Found (404) the page is not found and the returned HTTP status code is a 404 or 410 (which correctly corresponds to not found).
In the case of a soft 404 error, the page is not found, but instead of returning a 404 status code, the server returns a 200 success code, which is misleading.
In both cases, the page response code should have been 404, but that’s not the case with soft 404 errors.
Another major difference is that search engines do not index 404 pages, so they will not appear in the search results, while soft 404 pages return a success code and so may be indexed and appear in the search results.
How To Find Soft 404 Errors?
The most reliable way to find soft 404 errors is through Google Search Console, particularly the Page Indexing Report and the URL Inspection tool.
If you haven’t done so already, the first step is to register your website with Google. This will give you access to several features to improve your SEO.
Page Indexing Report
- Log in to Google Search Console.
- Next, click on Pages under Indexing to view the PAGE INDEXING REPORT.
- Scroll down to "Why pages aren't indexed" and look for Soft 404 and Not Found (404) errors.
Click on the error description to get more details about the affected pages.
URL Inspection Tool
Another way to find the HTTP status response code is to use the URL Inspection tool.
Enter a URL into the URL Inspection tool and press Enter.
Click on VIEW CRAWLED PAGE and then MORE INFO.
The HTTP response, including the status code, is shown in that panel.
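Outside Search Console, you can also check the status code of any URL yourself. Below is a minimal Python sketch using only the standard library. The function name and the User-Agent string are made up for this example, and the result shows what your server returns to this script, which may differ from what it returns to Googlebot.

```python
import http.client
from typing import Optional, Tuple
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Return (status code, Location header or None) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        # "status-check-sketch" is a made-up User-Agent for this example.
        connection.request("GET", path, headers={"User-Agent": "status-check-sketch"})
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()
```

Calling fetch_status("https://example.com/soft-404-page") returns the raw status code, so a missing page that answers with 200 stands out immediately. Redirects are deliberately not followed, so a 301 is reported as 301 together with its target.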
How To Fix Soft 404 Errors
To get rid of soft 404 errors, you can use one of the five solutions below:
- Check that the page is indeed a soft 404 or a false alarm
- Configure your server to return the proper not found error code (404/410)
- Improve the page and request indexing
- Redirect the page using a 301 redirection
- Keep the page on your site but de-index it from search engines
1. Check that the page is indeed a soft 404 or a false alarm
Sometimes, Google Search Console may incorrectly mark a page as a soft 404, so the first step is to check if that’s the case. Move the mouse over a URL and click the “Open in a new tab” button.
If it’s a valid page of your site and you want this to appear in the search results, click the VALIDATE FIX button.
This action asks Google to recrawl the page and update its status. The process may take a few days, and you’ll be notified of the result by email.
Inspect the page and Test the Live URL - Another method is to move the mouse over a URL and select INSPECT URL. This will give you more information about the page and the option to REQUEST INDEXING.
Before you do that, you can click the TEST LIVE URL button to have Google fetch the current version of the page and refresh the report.
In many cases, you may find that the page is ok and no further action is required.
2. Configure your server to return the proper not found error code (404/410)
If a page is indeed not available or valid, you should configure your website to return the correct HTTP response code (either a 404 or 410) and resubmit the page to Google using the REQUEST INDEXING button of the URL Inspection tool.
The easiest way to configure your site to return a 404 code for invalid pages is to delete them. Once a page is deleted, your web server returns a 404 status code and shows your 404 page whenever that URL is requested.
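If your pages are served by custom code instead of a CMS, the same rule applies: the error page can be as friendly as you like, as long as the status code is 404. Here is a minimal Python sketch of that behavior. PAGES, NOT_FOUND_HTML, respond and SiteHandler are hypothetical names invented for this example, not part of any particular framework.

```python
from http.server import BaseHTTPRequestHandler
from typing import Tuple

# Hypothetical site content: URL path -> HTML body.
PAGES = {"/": "<h1>Home</h1>"}
NOT_FOUND_HTML = "<h1>Page not found</h1><p><a href='/'>Back to the homepage</a></p>"


def respond(path: str) -> Tuple[int, str]:
    """Pick the status code and body for a requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    # The key point: a missing page gets status 404, never 200.
    return 404, NOT_FOUND_HTML


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = respond(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it locally, you could pass SiteHandler to http.server.HTTPServer and request a path that is not in PAGES. A soft 404 is what you would get if respond returned 200 together with NOT_FOUND_HTML.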
Pro Tip: It is essential to have a custom 404 page that gives users options on what they can do next. The 404 page can contain links to your homepage and to your most popular pages and can even contain a search box to help users find what they are looking for.
This is what my custom 404 page looks like:
3. Improve the page and request indexing
If a page is available but Google insists on considering it a soft 404, you can improve its content and resubmit it to Google through the REQUEST INDEXING button or through VALIDATE FIX.
The above usually happens when a page has little content and Google wants to remove it from its index.
Adding more content shows search engines that the page has value, and the soft 404 error should then clear.
4. Redirect the page using a 301 redirection
Another way to fix a soft 404 error is to redirect the affected page to a valid page. On Apache servers, this is done by adding a 301 redirect to your .htaccess file, which tells search engines that the page has permanently moved to a new location.
This is how it looks:
Redirect 301 /soft-404-page https://example.com/new-page-URL
Another way to add a 301 redirection is to use the redirect manager of the Yoast SEO plugin (a Premium feature at the time of writing).
When adding a 301 redirection, make sure that both pages have similar content, as SEO best practice calls for. Do not redirect a page to one with unrelated content; it’s better to delete it and let it return a 404.
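When several pages need redirecting, the .htaccess lines can be generated from a simple mapping instead of being typed by hand. The Python sketch below does that. The function name and the validation rules are assumptions for illustration, and the output follows the same Redirect 301 syntax shown earlier in this section.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def build_redirect_rules(redirects: Dict[str, str]) -> List[str]:
    """Turn {old path: new absolute URL} into 'Redirect 301' lines for .htaccess."""
    rules = []
    for old_path, new_url in redirects.items():
        if not old_path.startswith("/") or " " in old_path:
            raise ValueError(f"old path must start with '/' and contain no spaces: {old_path!r}")
        target = urlsplit(new_url)
        if target.scheme not in ("http", "https") or not target.netloc or " " in new_url:
            raise ValueError(f"target must be an absolute URL without spaces: {new_url!r}")
        rules.append(f"Redirect 301 {old_path} {new_url}")
    return rules
```

For instance, build_redirect_rules({"/soft-404-page": "https://example.com/new-page-URL"}) produces the single line used in the example above. The sketch only checks the syntax; deciding whether the two pages really have similar content is still up to you.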
5. Keep the page on your site but de-index it from search engines
Another option is to keep the page on your site but add a noindex directive to the page’s head section to instruct search engines NOT to index that particular page.
You can do this using Yoast SEO or by manually adding <meta name="robots" content="noindex,follow"/> to the page’s <head> section.
When you do this, Google will no longer report the page as an error; in the Page Indexing Report it is listed among the pages that aren’t indexed.
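After adding the directive, it is worth confirming that it actually made it into the page's HTML. Here is a minimal Python sketch that checks for a robots meta tag containing noindex. The class and function names are made up for this example, and it only looks at the meta tag, not at any HTTP headers your server may send.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for directive in attributes.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True when the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives
```

Passing the page source to has_noindex tells you whether the tag is present. Keep in mind that Google has to recrawl the page before the change takes effect.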
Do 404 Errors Affect Your SEO?
It depends on the case.
There are valid cases where 404 pages are normal and expected. For example, when a product is no longer available, you can display a 404 page to let the user know that the particular product is permanently gone.
Another example is when you want to remove a page from the search results completely. Returning a 404 status code tells search engines that the page no longer exists.
When it comes to soft 404 errors, it’s trickier because there are cases where the page is valid but Google thinks it’s not and cases where the page is not valid and your server returns 200 OK.
In those cases, it’s better to investigate why the errors occur and fix them.
As a general rule of thumb, you should avoid having 404 errors on your site to optimize your crawl budget, avoid confusing search engines and offer users a good experience.
RESOURCES TO LEARN MORE
How to fix crawl errors in Google Search Console – A detailed guide covering all possible errors you can see in Google Search Console and how to fix them.
Key Learnings
Dealing with 404 errors, especially soft 404 errors, is an advanced technical SEO technique. I know from experience that not all users can understand HTTP status codes and their meaning.
To simplify things, you only have to know how to use Google Search Console to check for 404 errors and how to fix them.
The two most common methods are to use a 301 redirection or to add the noindex directive to a page’s head section to remove the page from the search engine index.
It is always recommended that soft 404 errors be avoided for best SEO performance. Therefore, you should make every effort to find and fix them.