Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Robots.txt File Redirects to Home Page
-
I've been doing some site analysis for a new SEO client and it has been brought to my attention that their robots.txt file redirects to their homepage. I was wondering:
Is there a benfit to setup your robots.txt file to do this?
Will this effect how their site will get indexed?
Thanks for your response!
- Kyle
Site URL:
-
Yep, if you add a robots.txt it won't redirect. But I would look to remove the 404 redirect as well. It also looks to me like a meta refresh as well which has potential SEO problems. I would much prefer a 301 if they are really keen to redirect 404s.
The main reason for not redirecting 404s is that it stops you from seeing broken links on your website. Imagine you have a discreet link to a services page that is broken - you wouldn't be able to pick it up with link checkers like Xenu and it could go unnoticed for months if not years. Might be worth suggesting to them that they remove it.
-
This is not a normal behavior, you should respond to robots.txt, put the sitemap link in there or simply :
User-agent: *
Disallow:The actual robots.txt gives :
GET robots.txt 302 Found, which redirects to :
GET 404error.html 200 Ok, which redirect to the home with browser behavior :
<meta http-equiv="refresh" content="0;url=/">
You better change this to a normal response
-
Thanks for the input! I haven't had a chance to view their .htaccess file. I am still in the early stages of reviewing their site. I just wasn't sure if their would be a technical reason for them to do this or if it just happened by accident. It sounds like adding a basic robots.txt file would be the appropriate solution.
-
1. I wouldnt advise redirecting the robots.txt to redirect to home page. It seems that they hve a dynamic 404 redirect system - which when a URL doesnt exist the site redirects it to home. There are god and bad points about this strategy, hoever I would prefer NOT to do it.
2. Re getting site indexed - no it wouldnt hurt them, but would give you much less control over the robots directive, in case you want to add custom instructions. If Google crawlers cant get to it (as in its not user agent cloaked to allow the google bot) you will not be able to do so (eg excluding pages from being indexed via robots wont be ossible).
-
I would be surprised if they purposefully redirected it. Have you been able to take a look at what's in the .htaccess file? If you copy and paste what's in there I might be able to see what's going on with it.
Also, if it is being redirected then it won't get crawled and so it won't have any effect. That could be good or bad depending on what you had written in the .txt file.
EDIT:
Just had a quick look at the site. It seems to 404 straight away and then redirect. Therefore I imagine the robots.txt file doesn't exist and they have it set up to redirect 404ing pages to the homepage. Something that I would advise against (it's useful to know what's 404ing).
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Robots.txt Syntax for Dynamic URLs
I want to Disallow certain dynamic pages in robots.txt and am unsure of the proper syntax. The pages I want to disallow all include the string ?Page= Which is the proper syntax?
Technical SEO | | btreloar
Disallow: ?Page=
Disallow: ?Page=*
Disallow: ?Page=
Or something else?0 -
301 Redirect non existant pages
Hi I have 100's of URL's appearing in Search Console for example: ?p=1_1 These go to on to 5_200 etc.. I have tried to do htaccess and the mod rewrite is on as I can redirect directories to the root i.e RewriteRule ^web_example(.*)$ /$1 [R=301,N,L] However I have tried all kinds of variations to redirect ?p= and either it doesn't work at all or it crashes the website. Can anyone point me in the right direction to fix this.
Technical SEO | | Cocoonfxmedia0 -
Getting high priority issue for our xxx.com and xxx.com/home as duplicate pages and duplicate page titles can't seem to find anything that needs to be corrected, what might I be missing?
I am getting high priority issue for our xxx.com and xxx.com/home as reporting both duplicate pages and duplicate page titles on crawl results, I can't seem to find anything that needs to be corrected, what am I be missing? Has anyone else had a similar issue, how was it corrected?
Technical SEO | | tgwebmaster0 -
Is there any value in having a blank robots.txt file?
I've read an audit where the writer recommended creating and uploading a blank robots.txt file, there was no current file in place. Is there any merit in having a blank robots.txt file? What is the minimum you would include in a basic robots.txt file?
Technical SEO | | NicDale0 -
Google insists robots.txt is blocking... but it isn't.
I recently launched a new website. During development, I'd enabled the option in WordPress to prevent search engines from indexing the site. When the site went public (over 24 hours ago), I cleared that option. At that point, I added a specific robots.txt file that only disallowed a couple directories of files. You can view the robots.txt at http://photogeardeals.com/robots.txt Google (via Webmaster tools) is insisting that my robots.txt file contains a "Disallow: /" on line 2 and that it's preventing Google from indexing the site and preventing me from submitting a sitemap. These errors are showing both in the sitemap section of Webmaster tools as well as the Blocked URLs section. Bing's webmaster tools are able to read the site and sitemap just fine. Any idea why Google insists I'm disallowing everything even after telling it to re-fetch?
Technical SEO | | ahockley0 -
I accidentally blocked Google with Robots.txt. What next?
Last week I uploaded my site and forgot to remove the robots.txt file with this text: User-agent: * Disallow: / I dropped from page 11 on my main keywords to past page 50. I caught it 2-3 days later and have now fixed it. I re-imported my site map with Webmaster Tools and I also did a Fetch as Google through Webmaster Tools. I tweeted out my URL to hopefully get Google to crawl it faster too. Webmaster Tools no longer says that the site is experiencing outages, but when I look at my blocked URLs it still says 249 are blocked. That's actually gone up since I made the fix. In the Google search results, it still no longer has my page title and the description still says "A description for this result is not available because of this site's robots.txt – learn more." How will this affect me long-term? When will I recover my rankings? Is there anything else I can do? Thanks for your input! www.decalsforthewall.com
Technical SEO | | Webmaster1230 -
Blog Ranking NOT home page main website?!
Hi, Our Blog (http://blog.thailand-investigation.com) is ranking for some of our major keywords but not our home page (http://www.thailand-investigation.com)!? Our blog is WordPress and our main website is HTML. It seems like the search engines consider that they are 2 separate websites!? When I check the incoming links to our website, I get also the blog links!!!??? Is it normal? Do I have to build a relation of some kind or write some code saying that it is our Blog... I don't know! I'm not a SEO specialist or even a webmaster. I'm a small business owner and take care on my website. I created by myself but never learned! So, please help! Thanks
Technical SEO | | MichelMauquoi0 -
No indexing url including query string with Robots txt
Dear all, how can I block url/pages with query strings like page.html?dir=asc&order=name with robots txt? Thanks!
Technical SEO | | HMK-NL0