Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How to de-index old URLs after redesigning the website?
-
Thank you for reading.
After redesigning my website (5 months ago), I still get tons of 404 pages in my crawl reports (Moz, Search Console), which all seem to be URLs from my previous website (same root domain).
It would be nonsense to 301 redirect them as there are too many URLs. (Or would it be nonsense?)
What is the best way to deal with this issue?
-
Thank you Clever PhD, really valuable insights!
-
I completely agree with all of the above. I took her point to be much like my own situation: receiving thousands of 404 errors from pages that haven't existed for many months just gets annoying!

-
I respectfully disagree with all of the above. Please repeat after me, 404s are not bad, they are diagnostic, 404s are not bad, they are diagnostic, 404s are not bad, they are diagnostic.
After redesigning my website (5 months ago), I still get tons of 404 pages in my crawl reports (Moz, Search Console), which all seem to be URLs from my previous website (same root domain).
**Part 1: Internal links that 404 in the Moz crawl.** The 404s that show up in the Moz crawl are only going to be from internal links on your website. The Moz crawl only looks at internal links, not links from other websites. In other words, if you see 404s in your Moz crawl, that means that somewhere you are linking to those pages, and that is why the 404s are showing up. Download the CSV from your Moz crawl and you will find them there. Other tools such as Screaming Frog, Botify, and Deep Crawl will show you a similar analysis.
Simple solution: go through your code and remove the internal links on your site that direct the Moz crawler to those pages, and the 404s will go away. (FYI, this same approach will work for any internal 301s.) These 404 errors in the Moz report are great diagnostic signals on where to fix your site. It is bad for users to click on a link within your website and get sent to a page that does not exist.
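As a rough illustration of that cleanup step, here is a minimal Python sketch that scans one page's HTML for internal links pointing at URLs you already know return a 404. The names `LinkCollector` and `find_dead_links`, the example.com domain, and the sample HTML are all hypothetical; in practice the set of dead URLs would come from your crawl export.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_dead_links(page_url, html, dead_urls):
    """Return the links on one page that resolve to a known-404 URL.

    dead_urls is a hypothetical input: a set of absolute URLs taken
    from a crawl report.
    """
    collector = LinkCollector()
    collector.feed(html)
    # Resolve relative hrefs against the page URL before comparing.
    resolved = [urljoin(page_url, href) for href in collector.hrefs]
    return [url for url in resolved if url in dead_urls]


# Hypothetical sample data, for illustration only.
sample_html = '<a href="/en-GB/old-page">Old</a> <a href="/new-page">New</a>'
dead = {"https://www.example.com/en-GB/old-page"}
dead_links = find_dead_links("https://www.example.com/", sample_html, dead)
print(dead_links)
```

Every URL this reports is an internal link to remove or repoint in your templates or content.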
**Part 2: External links from Search Console.** The 404s that show up in Search Console can come from internal links on your site AND external links from other sites. Google will keep trying to crawl these URLs because of other sites linking to pages on your site, as well as your own internal links. For internal link fixing, see the suggestion above. For external links you need a different approach.
Look at the external links. Where are they coming from? Are they from quality websites? Do they go to formerly important pages on your website (i.e. pages that were good converters)? If so, use a 301 redirect to send them to the correct replacement page (and this is not always the home page). You get users to the correct page, and any link equity is passed along as well, which can help with your site rankings. If the link goes to a former page on your site that was not any good to start with, and the links that come into it are poor quality, then you just let the page 404. Tools such as Moz Open Site Explorer, Ahrefs, or Majestic can help with this assessment, but usually you can just look at a site linking to you and tell if it is crap or not.
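The triage above could be written down roughly like this. It is only a sketch: the `DeadUrl` fields and the `triage` function are made-up names, the quality judgments still have to come from you, and treating "quality backlinks OR formerly important page" as enough for a 301 is my assumption, as is falling back to a 404 when there is no sensible replacement page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeadUrl:
    path: str
    has_quality_backlinks: bool        # judged by hand or with a link tool
    was_important_page: bool           # e.g. a former good converter
    replacement: Optional[str] = None  # closest equivalent page, if any


def triage(url: DeadUrl) -> str:
    """Decide between a 301 and a plain 404 for one dead URL."""
    worth_keeping = url.has_quality_backlinks or url.was_important_page
    # Assumption: without a real replacement page, let the URL 404
    # rather than redirecting it somewhere unrelated.
    if worth_keeping and url.replacement:
        return f"301 -> {url.replacement}"
    return "404"


# Hypothetical examples.
print(triage(DeadUrl("/en-GB/best-seller", True, True, "/products/best-seller")))
print(triage(DeadUrl("/en-GB/old-junk", False, False)))
```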
You need to consider the above regardless of whether you want to get the 404ing pages out of the Google index. Even if you get Google to remove a page from the index, it will then see the internal link on your site and find the 404 again. Once you have removed the links to the 404 pages on your site, Google will eventually stop crawling them and they will drop out of the index.
Important note regarding the use of robots.txt: blocking Google from crawling the 404s will not remove the pages from the index; Google will just stop crawling them. Google has to be able to crawl the URL to see the 404, recognize that it is a bad page, and then remove the page from the index. Blocking with robots.txt stops Google from doing that. As soon as you take the page out of robots.txt, Google will recrawl it and the 404 shows up again. Robots.txt treats a symptom that is a red herring; allowing the 404 to occur takes care of the issue permanently.
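To make the mechanics concrete, here is a small sketch using Python's standard `urllib.robotparser`. It is not Google's own parser, and the rule and the example.com URLs are hypothetical, but it shows the basic effect: a well-behaved crawler that obeys a `Disallow` rule never requests the URL at all, so it never gets the chance to see the 404 status.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content blocking an old URL prefix.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /en-GB/",
])

blocked = "https://www.example.com/en-GB/old-page"
allowed = "https://www.example.com/new-page"

# A compliant crawler checks this before requesting the page.
can_crawl_blocked = rules.can_fetch("Googlebot", blocked)
can_crawl_allowed = rules.can_fetch("Googlebot", allowed)
print(can_crawl_blocked, can_crawl_allowed)
```

The blocked URL is never fetched, so whatever status it would return stays invisible to the crawler.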
Dead pages are a natural part of the web. Let Google see the 404 (if it truly is a page that should 404 and has no link equity that should be passed along with a 301). Google will crawl the 404 several times, and you will see it in Search Console several times. That is OK. You are not penalized for X number of 404s. You may lose ranking if you 404 a page that Google used to rank well, but that is just because Google will not keep a page highly ranked that does not exist :-). Help Google out by cleaning up your internal link structure: when it sees that you no longer link to the page, that is a signal that the page should 404. Google knows that, due to the nature of the web, pages will time out on occasion and show an error, so it will continue to recrawl a page just to make sure; it wants to give you the benefit of the doubt. Therefore, you have to give clear directives by not linking to dead pages, so that after Google double- and triple-checks the page, it will finally drop it. You will see the 404 in your Search Console for several months, then it will eventually go away.
Hope that makes sense. Good luck!
-
Hey Lana, if you really think that a 301 does not make sense in this case, you can always add the URLs to the robots.txt file, and once Google recrawls your website, it will de-index the pages.
Another thing you can do is use the URL removal feature in Google Webmaster Tools. Log in to GWT, go to Optimization > Remove URLs, and submit the URLs there.
Hope this helps!
-
I see the point. Thanks, Liam. As most of our 404 pages start with /en-GB/, I will do it like this:
Disallow: /en-GB/
-
Hi Lana,
I've been having the same problem on one of our websites. I've 301 redirected over 5,000 URLs but still receive a lot of 404 errors. One of the main reasons these 404 errors still appear is that other bots, such as Bingbot, are still crawling the old URLs.
To resolve this, I would just block them in your robots.txt file. We blocked our old product URLs that were under a "product" directory like this:
User-agent: *
Disallow: /product/