How can I make it so that robots.txt is not ignored due to a URL re-direct?

rodelmo4

Recently a site moved from blog.site.com to site.com/blog with an instruction like this one:

/etc/httpd/conf.d/site_com.conf:94: ProxyPass /blog http://blog.site.com
/etc/httpd/conf.d/site_com.conf:95: ProxyPassReverse /blog http://blog.site.com

It's a Wordpress.org blog that was set as a subdomain, and now is being redirected to look like a directory. That said, the robots.txt file seems to be ignored by Google bot. There is a Disallow: /tag/ on that file to avoid "duplicate content" on the site. I have tried this before with other Wordpress subdomains and works like a charm, except for this time, in which the blog is rendered as a subdirectory. Any ideas why? Thanks!

rodelmo4

Hi there,

No, haven't tried it yet, but we'll give it a shot. Thanks!

JordanLowry

Have you thought about adding rel canonicals by chance? Also, how do you know the robots.txt is being ignored are the page showing up in search results? If so maybe the syntax is incorrect in your robots.txt file. Check out robotstxt.org

Gaston Riera

Hi Rocio,

Have you tried YOAST SEO plugin? It has an option to ad to the tags.
That's the easiest way I'd go for.

Best Luck.
GR.

Welcome to the Q&A Forum

Browse the forum for helpful insights and fresh discussions about all things SEO.

How can I make it so that robots.txt is not ignored due to a URL re-direct?

Got a burning SEO question?

Browse Questions

Explore more categories

Related Questions

Removed URLs

Magento URL change

Robots.txt

Is having no robots.txt file the same as having one and allowing all agents?

Same URL in "Duplicate Content" and "Blocked by robots.txt"?

URL with tracking code

Our Development team is planning to make our website nearly 100% AJAX and JavaScript. My concern is crawlability or lack thereof. Their contention is that Google can read the pages using the new #! URL string. What do you recommend?

Restricted by robots.txt and soft bounce issues (related).