How can I make it so that robots.txt is not ignored due to a URL re-direct?
-
Recently a site moved from blog.site.com to site.com/blog using directives like these:
/etc/httpd/conf.d/site_com.conf:94: ProxyPass /blog http://blog.site.com
/etc/httpd/conf.d/site_com.conf:95: ProxyPassReverse /blog http://blog.site.com
It's a WordPress.org blog that was set up as a subdomain and is now proxied so that it looks like a directory. Since the move, the robots.txt file seems to be ignored by Googlebot. The file contains a Disallow: /tag/ rule to avoid "duplicate content" on the site. I have used this before on other WordPress subdomains and it works like a charm, except this time, when the blog is served as a subdirectory. Any ideas why? Thanks!
-
Hi there,
No, haven't tried it yet, but we'll give it a shot. Thanks!
-
Have you thought about adding rel canonicals, by chance? Also, how do you know the robots.txt is being ignored? Are the pages showing up in search results? If so, maybe the syntax in your robots.txt file is incorrect. Check out robotstxt.org.
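One thing worth ruling out, offered as a rough sketch rather than a diagnosis: crawlers request robots.txt from the root of each host, and rule paths are matched from that root. After the move, the file that applies to the blog is site.com/robots.txt, and a rule written for the old subdomain (Disallow: /tag/) would not match URLs that now start with /blog/tag/. The snippet below uses Python's standard urllib.robotparser to compare the two rule sets offline. The example URLs, the "Googlebot" agent string, and the is_allowed helper are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, url, agent="Googlebot"):
    """Return True if `agent` may fetch `url` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse in-memory rules, no network access
    return parser.can_fetch(agent, url)


# Rule as it was written for the subdomain blog.site.com
old_rules = ["User-agent: *", "Disallow: /tag/"]

# Hypothetical rule rewritten for the subdirectory on site.com
new_rules = ["User-agent: *", "Disallow: /blog/tag/"]

# Hypothetical tag archive URL after the move
moved_url = "http://site.com/blog/tag/example/"

print(is_allowed(old_rules, moved_url))  # True: /tag/ does not match /blog/tag/
print(is_allowed(new_rules, moved_url))  # False: the rewritten rule blocks it
```

If the first check comes back True for your own rules, the rule is probably still written for the old subdomain paths.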
-
Hi Rocio,
Have you tried the Yoast SEO plugin? It has an option to add noindex to the tags.
That's the easiest way I'd go for. Best of luck.
GR.
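To confirm what a tag page actually sends to crawlers after changing plugin settings or adding canonicals, a small check like the one below may help. It is only a sketch using Python's standard html.parser. The sample HTML, the HeadSignals class, and the head_signals helper are made up for illustration and are not taken from the site in question.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the canonical link and the robots meta tag from an HTML page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots = attrs.get("content")


def head_signals(html):
    """Return (canonical_href, robots_content) found in `html`, or None for each."""
    parser = HeadSignals()
    parser.feed(html)
    return parser.canonical, parser.robots


# Hypothetical tag archive page as it might look with noindex enabled
sample = """<html><head>
<link rel="canonical" href="http://site.com/blog/tag/example/">
<meta name="robots" content="noindex,follow">
</head><body></body></html>"""

print(head_signals(sample))
```

Keep in mind that a crawler never fetches a page that robots.txt blocks, so it cannot see a noindex tag on that page. Blocking the tag pages and noindexing them are alternatives, not things to combine.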