Thousands of 404s
-
Hi there,
I'm working on a site that has a ridiculous number of 404s being reported by Webmaster Tools. We believe this was because an on-page error was amending the URLs and adding in folders that shouldn't have been there, in a big spiral, i.e. /salons/uk/teeth became something like /salons/uk/teeth/salons/edinburgh/hair/teeth...
Anyway, we think the issue is now sorted, but it seems these pages were indexed, so Google is still looking for them when it crawls the site. What's my best move? It's the sheer volume (over 13,000) that has me concerned, so I thought it best to seek some expert advice before continuing.
Thanks in advance!
-
As it's all sorted now, I really wouldn't worry about them too much. You can use the URL removal feature in WMT, but this is a manual process, so I wouldn't do this. If I were in your position, I'd probably just let the pages keep 404ing. After a while, Google will usually stop trying to recrawl the 404 pages. Right now they are probably recrawling in case the 404 was an accident.
If it's causing a bandwidth problem, you can solve that with a robots.txt rule, as suggested elsewhere in this thread.
-
Hi Philip!
If these URLs are already indexed, you should 301 redirect them to the right URL (in case they happen to have some inbound links). You could also try the URL removal tool from Google (see https://support.google.com/webmasters/answer/1663416) if all you want is to get rid of them.
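As a rough sketch of how the redirect targets could be worked out, assuming every bad URL is the correct path with extra folders appended and that the junk always starts at a second `salons` folder (both are guesses based on the example in the question, and `canonical_path` is a hypothetical helper, not part of any tool):

```python
from typing import Optional


def canonical_path(path: str, root: str = "salons") -> Optional[str]:
    """Return the path up to the second occurrence of `root`, or None.

    Assumption: a spiralled URL is the correct path with extra folders
    appended, and the junk starts where `root` shows up a second time.
    """
    segments = [s for s in path.split("/") if s]
    # Positions of every segment equal to the root folder name.
    hits = [i for i, s in enumerate(segments) if s == root]
    if len(hits) < 2:
        return None  # not a spiralled URL, so nothing to redirect
    return "/" + "/".join(segments[: hits[1]])


if __name__ == "__main__":
    bad = "/salons/uk/teeth/salons/edinburgh/hair/teeth"
    print(bad, "->", canonical_path(bad))
```

The output of something like this could feed whatever redirect mechanism the site already uses. It is worth spot-checking a sample of the 13,000 URLs first, since the real spiral may not follow this pattern exactly.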
Good luck, hope this helps.
//Anders
-
Hi Philip,
If all the URLs share the same pattern, I would try adding that pattern to robots.txt so you prevent Google from crawling the pages. Even better would be adding noindex tags to the pages, though Google can only see a noindex tag on a page it is allowed to crawl, so use one approach or the other rather than both.
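Before relying on a pattern, it may help to check which paths it would actually catch. Here is a minimal sketch, assuming a rule such as `Disallow: /salons/*/salons/` (a hypothetical pattern guessed from the example URL in the question). It is a simplified imitation of the `*` and `$` wildcards Google supports in robots.txt, not a full parser, and `rule_to_regex` and `is_blocked` are made-up names:

```python
import re


def rule_to_regex(rule: str):
    """Translate a robots.txt path rule into a compiled regex (simplified)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape everything, then turn each escaped "*" back into ".*".
    pattern = re.escape(rule).replace(r"\*", ".*")
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_blocked(rule: str, path: str) -> bool:
    """True if `path` matches the Disallow `rule` from its first character."""
    return rule_to_regex(rule).match(path) is not None


if __name__ == "__main__":
    rule = "/salons/*/salons/"  # hypothetical rule, adjust to the real pattern
    for path in (
        "/salons/uk/teeth/salons/edinburgh/hair/teeth",
        "/salons/uk/teeth",
    ):
        print(path, "blocked" if is_blocked(rule, path) else "allowed")
```

The point of the check is to make sure the rule catches the spiralled URLs without also blocking the real pages such as /salons/uk/teeth.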
Hope this helps!