Will a Robots.txt 'disallow' of a directory, keep Google from seeing 301 redirects for pages/files within the directory?
-
Hi- I have a client that had thousands of dynamic php pages indexed by Google that shouldn't have been. He has since blocked these php pages via robots.txt disallow. Unfortunately, many of those php pages were linked to by high quality sites mulitiple times (instead of the static urls) before he put up the php 'disallow'.
If we create 301 redirects for some of these php URLs that area still showing high value backlinks and send them to the correct static URLs, will Google even see these 301 redirects and pass link value to the proper static URLs? Or will the robots.txt keep Google away and we lose all these high quality backlinks? I guess the same question applies if we use the canonical tag instead of the 301. Will the robots.txt keep Google from seeing the canonical tags on the php pages?
Thanks very much,
V
-
No problem
-
Hello Dmitrii,
Yes, that clarifies things perfectly. Thanks very much for your explanation. And I missed this particular WBF, so I will give it a close look as well.
Thanks again for your quick help.
-
Hello, my friend.
You should realize how exactly htaccess' 301 redirects work. They are server side commands/operations. So, when bots request a page, they wait until server response. In case of 301s - they get response "Don't go here, go there". Now, they also may get response from robots.txt saying "you're not allowed to look at the contents of this file/directory", however this will not prevent the server response. That's why sometimes you can see indexed pages, which are saying "blocked by robots". They are indexed though.
Now, in case of canonical links you are correct, since canonical is IN the content of the page, then robots won't be able to read it, therefore won't be able to be told that there is a canonical page.
There is a recent WBF on this subject - https://moz.com/blog/controlling-search-engine-crawlers-for-better-indexation-and-rankings-whiteboard-friday
Hope this clarifies some things.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google Search console says 'sitemap is blocked by robots?
Google Search console is telling me "Sitemap contains URLs which are blocked by robots.txt." I don't understand why my sitemap is being blocked? My robots.txt look like this: User-Agent: *
Technical SEO | | Extima-Christian
Disallow: Sitemap: http://www.website.com/sitemap_index.xml It's a WordPress site, with Yoast SEO installed. Is anyone else having this issue with Google Search console? Does anyone know how I can fix this issue?1 -
I need help with redirecting chain to another page and 301, I don't understand on how to fix
Redirect Chain <label>What it is:</label> Your page is redirecting to a page that is redirecting to a page that is redirecting to a page... and so on. Learn more about redirection best practices. <label>Why it's an issue:</label> Every redirect hop loses link equity and offers a poor user experience, which will negatively impact your rankings. <label>How to fix it:</label> Chiaryn says: “Redirect chains are often caused when multiple redirect rules pile up, such as redirecting a 'www' to non-www URL or a non-secure page to a secure/https: page. Look for any recurring chains that could be rewritten as a single rule. Be particularly careful with 301/302 chains in any combination, as the 302 in the mix could disrupt the ability of the 301 to pass link equity.” This is not helping me I don't understand about the 301 do I use the www.jasperartisanjewelry.com or the /jasperartisanjewelry.com I'm confused
Technical SEO | | geanmitch0 -
Best way to handle 301 redirects on a business directory
We work with quite a few sites that promote retail traders and feature a traders' directory with pages for each of the shops (around 500 listings in most cases). As retail strips, shops come and go all the time, so I get a lot of pages that are removed as the business is no longer present. Currently I've been doing 301 redirects to the home page of the directory if you try to access a deleted trader page, but this means a ever growing htaccess file with thousands of 301 redirects. Are we handling this the best way or is there a better way to tackle this situation?
Technical SEO | | Assemblo0 -
301 Redirects
Hi, I have switched my site from a http .co.uk site to a https .com site. I have set a 301 redirect in the .htaccess file pointing all traffic going to the original .co.uk site to go to the new https: RewriteEngine on
Technical SEO | | imoprojects
RewriteCond %{HTTP_HOST} ^up-bus.co.uk$ [OR]
RewriteCond %{HTTP_HOST} ^www.up-bus.co.uk$
RewriteRule ^(.*)$ "https://www.up-bus.com/$1" [R=301,L] however when i search in google for keywords the original .co.uk site is still registering in search, is there something else I am required to do to tell google to use the new https site instead? Do i need to do redirects for every page, or is what I have done above sufficient? Hope you can help, I am struggling with getting our site to register on google search, any advice greatly welcome Thanks in advance, Ian0 -
Google Seeing Way More Pages Than My Site Actually Has
For one of my sites, A-1 Scuba Diving And Snorkeling Adventures, Google is seeing way more pages than I actually have. It sees almost 550 pages but I only have about 50 pages in my XML. I am sure this is an error on my part. Here is the search results that show all my pages. Can anyone give me some guidance on what I did wrong. Is it a canonical url problem, a redirect problem or something else. Built on Wordpress. Thanks in advance for any help you can give. I just want to make sure I am delivering everything I can for the client.
Technical SEO | | InfinityTechnologySolutions0 -
Duplicate page errors from pages don't even exist
Hi, I am having this issue within SEOmoz's Crawl Diagnosis report. There are a lot of crawl errors happening with pages don't even exist. My website has around 40-50 pages but SEO report shows that 375 pages have been crawled. My guess is that the errors have something to do with my recent htaccess configuration. I recently configured my htaccess to add trailing slash at the end of URLs. There is no internal linking issue such as infinite loop when navigating the website but the looping is reported in the SEOmoz's report. Here is an example of a reported link: http://www.mywebsite.com/Door/Doors/GlassNow-Services/GlassNow-Services/Glass-Compliance-Audit/GlassNow-Services/GlassNow-Services/Glass-Compliance-Audit/ btw there is no issue such as crawl error in my Google webmaster tool. Any help appreciated
Technical SEO | | mmoezzi0 -
Block or remove pages using a robots.txt
I want to use robots.txt to prevent googlebot access the specific folder on the server, Please tell me if the syntax below is correct User-Agent: Googlebot Disallow: /folder/ I want to use robots.txt to prevent google image index the images of my website , Please tell me if the syntax below is correct User-agent: Googlebot-Image Disallow: /
Technical SEO | | semer0 -
Domain Transfer Process / Bulk 301's Using IIS
Hi guys - I am getting ready to do a complete domain transfer from one domain to another completely different domain for a client due to a branding/name change. 2 things - first, I wanted to lay out a summary of my process and see if everyone agrees that its a good approach, and second, my client is using IIS, so I wanted to see if anyone out there knows a bulk tool that can be used to implement 301's on the hundreds of pages that the site contains? I have found the process to redirect each individual page, but over hundreds its a daunting task to look at. The nice thing about the domain transfer is that it is going to be a literal 1:1 transfer, with the only things changing being the logo and the name mentions. Everything else is going to stay exactly the same, for the most part. I will use dummy domain names in the explanation to keep things easy to follow: www.old-domain.com and www.new-domain.com. The client's existing home page has a 5/10 GPR, so of course, transferring Mojo is very important. The process: Clean up existing site 404's, duplicate tags and titles, etc. (good time to clean house). Create identical domain structure tree, changing all URL's (for instance) from www.old-domain.com/freestuff to www.newdomain.com/freestuff. Push several pages to a dev environment to test (dev.new-domain.com). Also, replace all instances of old brand name (images and text) with new brand name. Set up 301 redirects (here is where my IIS question comes in below). Each page will be set up to redirect to the new permanent destination with a 301. TEST a few. Choose lowest traffic time of week (from analytics data) to make the transfer ALL AT ONCE, including pushing new content live to the server for www.new-domain.com and implementing the 301's. As opposed to moving over parts of the site in chunks, moving the site over in one swoop avoids potential duplicate content issues, since the content on the new domain is essentially exactly the same as the old domain. Of course, all of the steps so far would apply to the existing sub-domains as well, IE video.new-domain.com. Check for errors and problems with resolution issues. Check again. Check again. Write to (as many as possible) link partners and inform them of new domain and ask links to be switched (for existing links) and updated (for future links) to the new domain. Even though 301's will redirect link juice, the actual link to the new domain page without the redirect is preferred. Track rank of targeted keywords, overall domain importance and GPR over time to ensure that you re-establish your Mojo quickly. That's it! Ok, so everyone, please give me your feedback on that process!! Secondly, as you can see in the middle of that process, the "implement 301's" section seems easier said than done, especially when you are redirecting each page individually (would take days). So, the question here is, does anyone know of a way to implement bulk 301's for each individual page using IIS? From what I understand, in an Apache environment .htaccess can be used, but I really have not been able to find any info regarding how to do this in bulk using IIS. Any help here would be GREATLY APPRECIATED!!
Technical SEO | | Bandicoot0