How Best to Handle Inherited 404s on Purchased Domain
-
We purchased a domain from another company and migrated our site over to it very successfully. However, we have one artifact of the original domain: a page that was being exploited by other sites on the web. The page was an open redirect, accepting any URL as a parameter and sending the visitor to it (e.g. http://example.com/go/to/offsite_link.asp?GoURL=http://badactor.com/explicit_content).
This page does not exist on our site, so those requests always return a 404. However, we find that crawlers are still attempting to access these invalid URLs.
We have disavowed as many of the explicit sites as we can, but some crawlers still come looking for those links. We are considering blocking the redirect page in our robots.txt, but we are concerned that the links will remain indexed but uncrawlable.
What's the best way to pull these pages from search engines and never have them crawled again?
UPDATE: Clarifying that what we're trying to do is get search engines to simply never request these pages again. The fact that they're wasting time just to receive a 404 is exactly what we want to avoid. Is there any reason we shouldn't just block these in our robots.txt?
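For reference, the robots.txt rule we're considering would look something like the sketch below, reusing the script path from the example URL above (not our real path):

```
# robots.txt — ask compliant crawlers to stop requesting the legacy redirect script
User-agent: *
Disallow: /go/to/offsite_link.asp
```

Since Disallow matches by URL prefix, this would also cover every ?GoURL= variation of that URL.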
-
@gastonriera calm down mate. We have actually tested this and seen no negative effect on any site we have done it on. It is the "easiest" option, but it won't cause the death and destruction your comment implies. Good day sir.
-
Hi there,
I'm assuming you have over 500k URLs, because that's roughly the scale at which crawl efficiency becomes worth worrying about. If you have fewer than that, please don't worry.
Having 404s is completely fine, and Google will eventually lower its crawl frequency for those pages.
Blocking them in robots.txt will cause Google to stop crawling them, but it will never remove them from the index.
My advice here: don't block them in robots.txt.
As Rajesh pointed out, you could force those 404s to return a 410 to tell Google that they are gone forever. That said, Google has said that it treats 404s and 410s essentially the same.
John Mueller said over a year ago that 4xx status codes don't incur crawl wastage. You can check it out in Deepcrawl's Webmaster Hangout notes.
Hope it helps,
Best of luck.
Gaston -
FOR THE LOVE OF GOD, DON'T REDIRECT 404s TO THE HOMEPAGE!
This is terrible advice. Doing that, you'll turn those 404s into soft 404s, making them more problematic than ever.
-
I would actually recommend redirecting it to the homepage. If you have a WordPress website and a bunch of 404 pages, you can install a free plugin called "All 404 to Homepage" and this will solve the problem. If you have replacement pages or pages covering similar content, however, I'd recommend redirecting those to the corresponding replacement page instead, as sketched below.
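For that one-to-one case, on an Apache server the redirect might look like this (a sketch only; both paths are hypothetical placeholders):

```
# Apache .htaccess — permanently redirect a retired URL to its closest replacement
Redirect 301 /old-guide /new-guide
```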
-
You need to do one thing with those 404 pages: serve them with a 410 status code instead. Redirecting them is not good practice.
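If the site runs on Apache, a minimal sketch in .htaccess could look like this, assuming the /go/to/offsite_link.asp path from the question's example (other servers have equivalents):

```
# Apache .htaccess — answer the retired open-redirect script with 410 Gone
<IfModule mod_rewrite.c>
  RewriteEngine On
  # [G] returns 410 Gone for the matched path; [NC] makes the match case-insensitive
  RewriteRule ^go/to/offsite_link\.asp$ - [G,NC]
</IfModule>
```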