Robots.txt on a site with a 301 redirect
-
We currently have a series of help pages that we would like to disallow in our robots.txt.
The thing is that these help pages are located on our old website, which now has a 301 redirect to the current site.
What is the proper way to go about this?
1- Add the pages we want to disallow to the robots.txt of the new website?
2- Break the redirect momentarily and add the pages to the robots.txt of the old one?
Thanks
-
In that case, you'd need to add the robots meta tag at the page level, before the closing </head> tag.
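For example, a minimal sketch (assuming the help pages are ordinary HTML files you can edit) would be to place this inside the <head> of each page you want kept out of the index:
<meta name="robots" content="noindex">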
-
Hey, for some time we will keep the files on the old domain. Should we break the redirect and add the disallows to the robots.txt of the old site?
-
So, the problem is that the robots.txt file can't be accessed because of the 301 redirect to the new domain?
Do you plan to keep the help files on the old domain, or will they be removed completely?
-
Hi Laura,
Thanks for your reply. I don't want to disallow the URLs these pages are being redirected to. Actually, these URLs are on the old site but can still be accessed. So, to put it simply, this is my case:
1- This is our old website: www.kilgray.com (with a 301 redirect)
2- This is our new website: www.memoq.com
3- I would like to disallow the following links on the old website that are still visible (haven't been redirected):
http://kilgray.com/memoq/2015-100/help-en/index.html
http://kilgray.com/memoq/2014/help-en/
-
Do you want to disallow the URLs that these pages are being redirected to? If not, there's no need to add anything to the robots.txt file.
If you do want to disallow the URLs that these pages are being redirected to, use relative URLs in your robots.txt file. For example, let's say olddomain.com/old-help-page/ is being redirected to newdomain.com/new-help-page/. In that case, add the following to the new domain's robots.txt file:
Disallow: /new-help-page/
There's no need to disallow the specific URLs that are being redirected to something else. Are you trying to get them removed from Google's index or something? If so, Google will update their index eventually based on your 301 redirects.
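For the specific case above, a rough sketch of what the old domain's robots.txt could look like if you chose to serve it directly from kilgray.com rather than redirect it (the paths are taken from the question; treat this as an illustration, not a recommendation):

User-agent: *
Disallow: /memoq/2015-100/help-en/
Disallow: /memoq/2014/help-en/

And if the site-wide 301 lives in an Apache .htaccess file (an assumption, the thread doesn't say), one hypothetical way to keep robots.txt reachable on the old domain while redirecting everything else would be:

RewriteEngine On
# Redirect everything to the new domain except robots.txt
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteRule ^(.*)$ https://www.memoq.com/$1 [R=301,L]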
Related Questions
-
Blocking pages in robots.txt that are under a redirected subdomain
Hi Everyone, I have a lot of Marketo landing pages that I don't want to show in the SERPs. Adding the noindex meta tag to each page would be too much; I have thousands of pages. Blocking them in robots.txt could have been an option, BUT the subdomain homepage is redirected to my main domain (with a 302), so I may confuse search engines (should they follow the redirect or should they block?). marketo.mydomain.com is redirected to www.mydomain.com. disallow: / (I think this will be confusing with the redirect.) I don't have folders; all pages are under the subdomain, so I can't block folders in robots.txt either. Has anyone had this scenario, or any suggestions? I appreciate your thoughts here. Thank you Rachel
Technical SEO | | RaquelSaiz0 -
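For the subdomain scenario above, purely as an illustration (whether the 302 makes this advisable is exactly what the question is asking), a robots.txt served at marketo.mydomain.com/robots.txt that blocks the entire subdomain would be:

User-agent: *
Disallow: /

Note that robots.txt is read per host, so a file on www.mydomain.com would not apply to the subdomain.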
Noindex tag in robots.txt
Hi Mozzers, A client's website has a lot of internal directories defined as /node/*. I already added the rule 'Disallow: /node/*' to the robots.txt file to prevent bots from crawling these pages. However, the pages are already indexed and appear in the search results. In an article by Deepcrawl, they say you can simply add the rule 'Noindex: /node/*' to the robots.txt file, but other sources claim the only way is to add a noindex directive to the meta robots tag of every page. Can someone tell me which is the best way to prevent these pages from getting indexed? Small note: there are more than 100 pages. Thanks!
Technical SEO | | WeAreDigital_BE
Jens0 -
301 redirecting a previously abused URL
A client previously had their most important landing page at domain.com/example.htm. They carried out the sort of link building that was commonplace a few years back (exact match anchors, paid blog links etc.) targeting this URL, but they also got a bunch of legitimate, decent quality links there. I believe they may have had a number of issues when link quality algo updates were rolled out, so rather than try to get links removed and go through the disavow process they instead decided to abandon this URL, let it 404 and start afresh at domain.com/example.html - updating all internal navigation, XML sitemaps etc. So fast forward to today. What do we think is the best practice for this URL these days? Is it now possible to 301 domain.com/example.htm > domain.com/example.html and recover whatever value may be left here? The argument against doing so may be that you could pass over the negative metrics associated with the old URL, but would this not be handled by the real-time Penguin update, with the poor links just devalued rather than actually harming? And could this just be tested - i.e. add in the 301, monitor the impact, and if things don't go the way we'd want then just remove the 301 again? Would be keen to get a few opinions on this. TIA
Technical SEO | | Salience_Search_Marketing0 -
How do I find where a 301 redirect is located
My report says I have http://www.30minuteseder.com/Passover.blog redirected to http://30minuteseder.com/Passover.blog. It is correct, but I can't find where the 301 redirect is located. I looked in my .htaccess file in the root and it's not there. How do I find it so I can change it?
Technical SEO | | Sederman0 -
Google Webmaster redirect vs 301 redirect
OK, assuming a client's website has the right tracking script (hopefully analytics isn't affected by this issue), ... what happens if the .htaccess file has a 301 redirect to the www address, but within Google Webmaster Tools the address chosen to crawl by Google is the non-www address? How will Google handle this, and which address takes precedence in this situation? _Cindy
Technical SEO | | CeCeBar0 -
301 Redirection of entire section to the homepage
Hi Guys, So here's the deal. Let's say I have a site at mysite.com/ which talks about tomatoes, and I also have a subsection that talks about potatoes at mysite.com/potatoes. I want to stop providing information about potatoes altogether, so I'm thinking about doing a 301 redirection from all of the pages at mysite.com/potatoes(.*) to the home page. The thing is, mysite.com/potatoes actually has great page authority (3475 links from 145 domains), so I really don't want to lose all that juice... Here are my questions: Will the links be added to the ones I have for the homepage already? Since my home page and my /potatoes section ranked for 2 different subjects, how is this transfer going to affect my rankings for the homepage? Will it now also rank for both tomatoes AND potatoes? How much time does it usually take for Google to recognize the 301 and pass the link juice? Any other tips on optimizing this process? Thank you for your time! -francois
Technical SEO | | nyakim0 -
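As a purely illustrative sketch of the redirect mechanics asked about above (assuming an Apache setup, which the question doesn't state), a single rule like this would send the whole /potatoes section to the homepage with a 301:

RedirectMatch 301 ^/potatoes(/.*)?$ http://mysite.com/

This only covers the redirect itself; the questions about link equity and rankings are a separate matter.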
301 redirect .htaccess problem
Can anyone explain to me why this doesn't work? Redirect 301 /category/diamond-pendants/nstart/1/start/(.*) http://www.povada.com/category/pendants/nstart/1/start/$1 I'm trying to replace everything after /start/ and insert it into the new URL. Thanks in advance.
Technical SEO | | 13375auc30 -
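One note on the .htaccess question above: Apache's Redirect directive matches plain URL-path prefixes and does not interpret regular expressions, so the (.*) pattern and $1 backreference are ignored. A hedged sketch of the same rule using RedirectMatch (which does take a regex) would be:

RedirectMatch 301 ^/category/diamond-pendants/nstart/1/start/(.*)$ http://www.povada.com/category/pendants/nstart/1/start/$1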
Subdomain Robots.txt
I have a subdomain (a blog) whose tags and categories are being indexed when they should not be, because they create duplicate content. Can I block them using a robots.txt file? Can I, or do I need to, have a separate robots.txt file for my subdomain? If so, how would I format it? Do I need to specify that it is a subdomain robots.txt file, or will the search engines automatically pick this up? Thanks!
Technical SEO | | JohnECF0