How can I make it so that robots.txt is not ignored due to a URL re-direct?
-
Recently a site moved from blog.site.com to site.com/blog with an instruction like this one:
/etc/httpd/conf.d/site_com.conf:94: ProxyPass /blog http://blog.site.com
/etc/httpd/conf.d/site_com.conf:95: ProxyPassReverse /blog http://blog.site.comIt's a Wordpress.org blog that was set as a subdomain, and now is being redirected to look like a directory. That said, the robots.txt file seems to be ignored by Google bot. There is a Disallow: /tag/ on that file to avoid "duplicate content" on the site. I have tried this before with other Wordpress subdomains and works like a charm, except for this time, in which the blog is rendered as a subdirectory. Any ideas why? Thanks!
-
Hi there,
No, haven't tried it yet, but we'll give it a shot. Thanks!
-
Have you thought about adding rel canonicals by chance? Also, how do you know the robots.txt is being ignored are the page showing up in search results? If so maybe the syntax is incorrect in your robots.txt file. Check out robotstxt.org
-
Hi Rocio,
Have you tried YOAST SEO plugin? It has an option to ad to the tags.
That's the easiest way I'd go for.Best Luck.
GR.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Disallow wildcard match in Robots.txt
This is in my robots.txt file, does anyone know what this is supposed to accomplish, it doesn't appear to be blocking URLs with question marks Disallow: /?crawler=1
Technical SEO | | AmandaBridge
Disallow: /?mobile=1 Thank you0 -
Weird Blog tags and re-directs
Hello fellow Digital Marketeers! As an in-house kinda guy, I rarely get to audit sites other than my own. But, I was tasked with auditing another. So I ran it through Screaming Frog and the usual tools. I got a couple of URLs come back with timeout messages, so I checked them manually- they're apparently part of a blog's archive: http://www.bestpracticegroup.com/tag/training-2/ I click 'read more' and it takes you to: http://www.bestpracticegroup.com/pfi-contracts-3-myth-busters-to-help-achieve-savings/ The first URL seems entirely redundant. Has anyone else seen something like this? Just an explanation as to why something like that would exist, and how you'd handle that would be grand! Much appreciated, John.
Technical SEO | | Muhammad-Isap0 -
Wordpress 4xx errors from comment re-direct
About a month ago, we had a massive jump in 4XX errors. It seems the majority are being caused by the comment tool on wordpress, which is generating a link that looks like this "http://www.turnerpr.com/blog/wp-login.php?redirect_to=http%3A%2F%2Fwww.turnerpr.com%2Fblog%2F2013%2F09%2Fturners-crew-royal-treatment-well-sort-of%2Fphoto-2-2%2F" On every single post. We're using Akismet and haven't had issues in the past....and I can't figure out the fix. I've tried turning it off and back on; I'm reluctant to completely switch commenting systems because we'd lose so much history. Anyone seen this particular re-direct love happen before? Angela
Technical SEO | | TurnerPR0 -
Robots.txt
I have a client who after designer added a robots.txt file has experience continual growth of urls blocked by robots,tx but now urls blocked (1700 aprox urls) has surpassed those indexed (1000). Surely that would mean all current urls are blocked (plus some extra mysterious ones). However pages still listing in Google and traffic being generated from organic search so doesnt look like this is the case apart from the rather alarming webmaster tools report any ideas whats going on here ? cheers dan
Technical SEO | | Dan-Lawrence0 -
Robots.txt checker
Google seems to have discontinued their robots.txt checker. Is there another tool that I can use to check my text instead? Thanks!
Technical SEO | | theLotter0 -
BEST Wordpress Robots.txt Sitemap Practice??
Alright, my question comes directly from this article by SEOmoz http://www.seomoz.org/learn-seo/robotstxt Yes, I have submitted the sitemap to google, bing's webmaster tools and and I want to add the location of our site's sitemaps and does it mean that I erase everything in the robots.txt right now and replace it with? <code>User-agent: * Disallow: Sitemap: http://www.example.com/none-standard-location/sitemap.xml</code> <code>???</code> because Wordpress comes with some default disallows like wp-admin, trackback, plugins. I have also read other questions. but was wondering if this is the correct way to add sitemap on Wordpress Robots.txt http://www.seomoz.org/q/robots-txt-question-2 http://www.seomoz.org/q/quick-robots-txt-check. http://www.seomoz.org/q/xml-sitemap-instruction-in-robots-txt-worth-doing I am using Multisite with Yoast plugin so I have more than one sitemap.xml to submit Do I erase everything in Robots.txt and replace it with how SEOmoz recommended? hmm that sounds not right. User-agent: *
Technical SEO | | joony2008
Disallow:
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-login.php
Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes
Disallow: /trackback
Disallow: /comments **ERASE EVERYTHING??? and changed it to** <code> <code>
<code>User-agent: *
Disallow: </code> Sitemap: http://www.example.com/sitemap_index.xml</code> <code>``` Sitemap: http://www.example.com/sub/sitemap_index.xml ```</code> <code>?????????</code> ```</code>0 -
Shorter URLs
Hi Is there a real value in having the keywords in the URL structure? we could use the URL: Mybrand.com/software/tablets/ipad/supertrader.html Or instead have the CMS create the shorter version mybrand.com/supertrader.html and just optimize this page for the keyword 'supertrader ipad software'
Technical SEO | | FXDD1 -
URL Rewrite
Using the .htaccess file how do I rewrite a url from www.exampleurl.com/index.php?page=example to www.exampleurl.com/example removing index.php?page= Any help is muchly appreciated
Technical SEO | | CraigAddyman0