Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Robots.txt in subfolders and hreflang issues
-
A client recently rolled out their UK business to the US. They decided to deploy with 2 WordPress installations:
UK site - https://www.clientname.com/uk/ - robots.txt location: UK site - https://www.clientname.com/uk/robots.txt
US site - https://www.clientname.com/us/ - robots.txt location: UK site - https://www.clientname.com/us/robots.txtWe've had various issues with /us/ pages being indexed in Google UK, and /uk/ pages being indexed in Google US.
They have the following hreflang tags across all pages:
We changed the x-default page to .com 2 weeks ago (we've tried both /uk/ and /us/ previously).
Search Console says there are no hreflang tags at all.
Additionally, we have a robots.txt file on each site which has a link to the corresponding sitemap files, but when viewing the robots.txt tester on Search Console, each property shows the robots.txt file for https://www.clientname.com only, even though when you actually navigate to this URL (https://www.clientname.com/robots.txt) you’ll get redirected to either https://www.clientname.com/uk/robots.txt or https://www.clientname.com/us/robots.txt depending on your location.
Any suggestions how we can remove UK listings from Google US and vice versa?
-
Hi there!
Ok, it is difficult to know all the ins and outs without looking at the site, but the immediate issue is that your robots.txt setup is incorrect. robots.txt files should be one per subdomain, and cannot exist inside sub-folders:
A **
robots.txt**file is a file at the root of your site that indicates those parts of your site you don’t want accessed by search engine crawlersFrom Google's page here: https://support.google.com/webmasters/answer/6062608?hl=en
You shouldn't be blocking Google from either site, and attempting to do so may be the problem with why your hreflang directives are not being detected. You should move to having a single robots.txt file located at https://www.clientname.com/robots.txt, with a link to a single sitemap index file. That sitemap index file should then link to each of your two UK & US sitemap files.
You should ensure you have hreflang directives for every page. Hopefully after these changes you will see things start to get better. Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate titles from hreflang variations
Hi, I am working on a large global site which has around 9 different language variations. We have setup the hreflang tags and referenced the corresponding content as follows: (We have not implemented a version X-default reference, as we felt it was not necessary) Using DeepCrawl and Search Console, we can see that these language variations are causing duplicate title issues. Many of them. My assumption was that the hreflang would have alleviated this issue and informed Google what is going on, however i wanted to see if anyone has any experience with this kind of thing before. It would be good to understand what the best practice approach is to deal with the problem. Is it even an issue at all, or just the tools being over-sensitive? Thank you in advance.
Technical SEO | | NickG-1230 -
Robots.txt Tester - syntax not understood
I've looked in the robots.txt Tester and I can see 3 warnings: There is a 'syntax not understood' warning for each of these. XML Sitemaps:
Technical SEO | | JamesHancocks1
https://www.pkeducation.co.uk/post-sitemap.xml
https://www.pkeducation.co.uk/sitemap_index.xml How do I fix or reformat these to remove the warnings? Many thanks in advance.
Jim0 -
Indexing Issue of Dynamic Pages
Hi All, I have a query for which i am struggling to find out the answer. I unable to retrieve URL using "site:" query on Google SERP. However, when i enter the direct URL or with "info:" query then a snippet appears. I am not able to understand why google is not showing URL with "site:" query. Whether the page is indexed or not? Or it's soon going to be deindexed. Secondly, I would like to mention that this is a dynamic URL. The index file which we are using to generate this URL is not available to Google Bot. For instance, There are two different URL's. http://www.abc.com/browse/ --- It's a parent page.
Technical SEO | | SameerBhatia
http://www.abc.com/browse/?q=123 --- This is the URL, generated at run time using browse index file. Google unable to crawl index file of browse page as it is unable to run independently until some value will get passed in the parameter and is not indexed by Google. Earlier the dynamic URL's were indexed and was showing up in Google for "site:" query but now it is not showing up. Can anyone help me what is happening here? Please advise. Thanks0 -
Issues with Magento layered navigation
Hi, We use Magento v.1.7 for our store. We have recently had an SEO audit and we have uncovered 2 major issues which can be pinpointed to our layered navigation. We use the MANAdev layered navigation module. There are numerous options available to help with SEO. All our filtered urls seem to be fine ie. https://www.tidy-books.co.uk/childrens-bookcases-shelves/colour/natural-finish-with-letters/letters/lowercase have canonical url correctly setup and the meta tags as noindex, follow but Magento is churning out tons of 404 error pages like this https://www.tidy-books.co.uk/childrens-bookcases-shelves/show/12/l/colour:24-4-9/letters:6-7 which google is indexing I'm at lost at how to solve this any help would be great. Thank you **This is from our SEO audit report ** The faceted navigation isn’t handled correctly and causes two major issues:● One of the faceted navigation filters causes 404 error. This means that the error isappended each sequence of the navigation options, multiplying the faulty URLs.● The pages created by the faceted nav are all accessible to the search engines. Thismeans that there are hundreds of duplicated category pages created by one of theparameters. The duplication issues can seriously hinder the organic visibility.The amount of 404 errors and the duplicated pages created by faceted navigation makes italmost impossible for a search engine crawler to finish the crawl. This means that the sitemight not be fully indexed and the newly introduced product pages or content won’t bediscovered for a very long time.
Technical SEO | | tidybooks0 -
Hreflang tags with link to redirect loop
Hi guys, I'm having a bit of an issue on a client site that I'm hoping someone can help me with. Basically, the client has two domains, one serving users in the Republic of Ireland (http://www.americanholidays.com), showing Euro prices, and the other serving users in Northern Ireland (http://www.americanholidays.com/gb_en/) showing £ prices. The issue I'm having is that the URL for the Northern Ireland page has a 302 on it and goes through another 2/3 301 redirects until it resolves as http://www.americanholidays.com, however it does then show the £ prices. You can see the redirect chain here: http://tools.seobook.com/server-header-checker/?page=single&url=http%3A%2F%2Fwww.americanholidays.com%2Fgb_en%2F&useragent=1&typeProtocol=11 The homepage is using the Hreflang tag, and pointing search engines to serve the http://www.americanholidays.com/gb_en/ page to users using EN-GB as their language. The page is also using a self-referencing canonical, which I believe may negate the whole Hreflang tag anyway? My main question is - is the fact that the Hreflang for the gb_en page is pointing to a chain of redirects negatively affecting it? (I understand too many redirects are never good). Also, is the canonical negating the Hreflang? Any help/info would be great as I just can't get my head around it! Thanks guys Daniel
Technical SEO | | DanielKiely60 -
Image Indexing Issue by Google
Hello All,My URL is: www.thesalebox.comI have Submitted my image Sitemap in google webmaster tool on 10th Oct 2013,Still google could not indexing any of my web images,Please refer my sitemap - www.thesalebox.com/AppliancesHomeEntertainment.xml and www.thesalebox.com/Hardware.xmland my webmaster status and image indexing status are below,
Technical SEO | | CommercePundit
Can you please help me, why my images are not indexing in google yet? is there any issue? please give me suggestions?Thanks!
0 -
Robots.txt file getting a 500 error - is this a problem?
Hello all! While doing some routine health checks on a few of our client sites, I spotted that a new client of ours - who's website was not designed built by us - is returning a 500 internal server error when I try to look at the robots.txt file. As we don't host / maintain their site, I would have to go through their head office to get this changed, which isn't a problem but I just wanted to check whether this error will actually be having a negative effect on their site / whether there's a benefit to getting this changed? Thanks in advance!
Technical SEO | | themegroup0 -
Issue with .uk.com domain
hi i have rockshore.uk.com which is not indexing properly. the internal pages do not show up for the text they have on them, or the title tags. the site is on aekmps shops platform. I understand that a .uk.com is not a proper TLD but i think i have a subdomain of .uk.com Can anyone help? thanks
Technical SEO | | Turkey0