How can I make it so that robots.txt is not ignored due to a URL re-direct?
-
Recently a site moved from blog.site.com to site.com/blog with an instruction like this one:
/etc/httpd/conf.d/site_com.conf:94: ProxyPass /blog http://blog.site.com
/etc/httpd/conf.d/site_com.conf:95: ProxyPassReverse /blog http://blog.site.com
It's a WordPress.org blog that was set up as a subdomain and is now being proxied so that it looks like a directory. Since the move, the robots.txt file seems to be ignored by Googlebot. That file contains a Disallow: /tag/ rule to avoid "duplicate content" on the site. I have used the same rule on other WordPress subdomains and it works like a charm; it fails only this time, when the blog is served as a subdirectory. Any ideas why? Thanks!
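One thing that may be worth checking, offered as a sketch rather than a diagnosis: crawlers request robots.txt from the root of the host they are crawling, and rules are matched against the URL path from that root. The snippet below uses Python's standard urllib.robotparser to show how a Disallow: /tag/ rule treats a path that now sits under /blog. The example URLs, the helper name is_allowed, and the /blog/tag/ rule are assumptions for illustration, not taken from the actual site.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory lines, no network access
    return parser.can_fetch(agent, url)


# Rule as described in the question (written for the old subdomain).
subdomain_rules = ["User-agent: *", "Disallow: /tag/"]
# Hypothetical rule with the /blog prefix, for comparison only.
subdirectory_rules = ["User-agent: *", "Disallow: /blog/tag/"]

# Hypothetical tag URLs before and after the move.
old_url = "http://blog.site.com/tag/example/"
new_url = "http://site.com/blog/tag/example/"

old_blocked = not is_allowed(subdomain_rules, old_url)
new_blocked_by_old_rule = not is_allowed(subdomain_rules, new_url)
new_blocked_by_prefixed_rule = not is_allowed(subdirectory_rules, new_url)

print(old_blocked, new_blocked_by_old_rule, new_blocked_by_prefixed_rule)
```

If the live site behaves like this sketch, the rule written for the subdomain no longer matches the new paths, and the file that matters would be the one served at the root of site.com rather than the one under /blog.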
-
Hi there,
No, haven't tried it yet, but we'll give it a shot. Thanks!
-
Have you thought about adding rel canonicals, by chance? Also, how do you know the robots.txt is being ignored? Are the pages showing up in search results? If so, maybe the syntax in your robots.txt file is incorrect. Check out robotstxt.org.
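For anyone who wants to verify whether a page already declares a canonical URL, here is a minimal sketch using only Python's standard html.parser. The class and function names and the sample markup are hypothetical; in practice the HTML would come from the live tag or post page.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for illustration.
sample = (
    '<html><head>'
    '<link rel="canonical" href="http://site.com/blog/example-post/">'
    '</head><body></body></html>'
)
print(find_canonical(sample))
```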
-
Hi Rocio,
Have you tried the Yoast SEO plugin? It has an option to add to the tags.
That's the easiest way I'd go for. Best of luck.
GR.