Will a robots.txt 'disallow' of a directory keep Google from seeing 301 redirects for pages/files within the directory?
-
Hi, I have a client that had thousands of dynamic PHP pages indexed by Google that shouldn't have been. He has since blocked these PHP pages via a robots.txt disallow. Unfortunately, many of those PHP pages were linked to by high-quality sites multiple times (instead of the static URLs) before he put up the PHP 'disallow'.
If we create 301 redirects for some of these PHP URLs that are still showing high-value backlinks and send them to the correct static URLs, will Google even see these 301 redirects and pass link value to the proper static URLs? Or will the robots.txt keep Google away, so that we lose all these high-quality backlinks? I guess the same question applies if we use the canonical tag instead of the 301: will the robots.txt keep Google from seeing the canonical tags on the PHP pages?
Thanks very much,
V
-
No problem
-
Hello Dmitrii,
Yes, that clarifies things perfectly. Thanks very much for your explanation. And I missed this particular WBF, so I will give it a close look as well.
Thanks again for your quick help.
-
Hello, my friend.
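It helps to understand exactly how .htaccess 301 redirects work. They are server-side operations: when a bot requests a page, it waits for the server's response, and in the case of a 301 that response is "Don't go here, go there." The catch is that a bot that obeys robots.txt reads that file first, and if it says "you're not allowed to look at the contents of this file/directory," the bot never requests the blocked URL, so it never receives the 301. Robots.txt does not change what the server would answer; it stops the request from being made. It also blocks crawling only, not indexing, which is why you can sometimes see indexed pages that say "blocked by robots": Google knows the URL from links pointing at it but has not fetched it. So for the 301s to be seen and to pass link value, the disallow has to be lifted for those PHP URLs.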
Now, in the case of canonical links you are correct: the canonical tag is IN the content of the page, so a blocked robot can't read it and is never told that there is a canonical page.
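To make the robots.txt check concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It only illustrates how a crawler that honors robots.txt decides whether to request a URL before any page content (such as a canonical tag) could be read. The `/php/` directory, the example.com URLs, and the `may_request` helper are assumptions made up for illustration, not the client's actual setup, and this is not Googlebot's real implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the /php/ directory name is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /php/
"""


def may_request(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if a robots.txt-compliant crawler may request the URL."""
    parser = RobotFileParser()
    # Parse the rules from the string above instead of fetching them.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A dynamic URL inside the disallowed directory (hypothetical).
blocked = may_request("https://www.example.com/php/item.php?id=42")
# A static URL outside the disallowed directory (hypothetical).
allowed = may_request("https://www.example.com/widgets/blue-widget.html")

print(blocked, allowed)
```

In this sketch the check runs against the URL alone, before any request is sent, so nothing in the page's HTML can influence the outcome.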
There is a recent WBF on this subject - https://moz.com/blog/controlling-search-engine-crawlers-for-better-indexation-and-rankings-whiteboard-friday
Hope this clarifies some things.