Google robots.txt test - not picking up syntax errors?
-
I just ran a robots.txt file through "Google robots.txt Tester" as there was some unusual syntax in the file that didn't make any sense to me...
e.g.
/url/?*
/url/?
/url/*
and so on. I would use ? and not ? for example and what is ? for! - etc.
Yet "Google robots.txt Tester" did not highlight the issues...
I then fed the file through http://www.searchenginepromotionhelp.com/m/robots-text-tester/robots-checker.php and that tool actually picked up my concerns.
Can anybody explain why Google didn't - or perhaps it isn't supposed to pick up such errors?
Thanks, Luke
-
Many thanks Beau - much appreciated.
-
Hey Luke,
It appears there is a plausible case for each of the three examples. Let's cover each:
- For /url/?* , a URL can have a trailing slash followed by a query string, see examples here.
- With /url/? , this covers the cases above and, in addition, would plausibly block product pages that generate query strings, similar to this example from H&M. In essence, it only allows the product page itself to be seen.
- /url/* , well, that's just anything and everything after the trailing slash.
I guess the question you should ask yourself is "Is this the best approach for the issue?"
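To see what each of the three patterns actually matches, here is a minimal Python sketch. It assumes Google-style matching rules: `*` matches any run of characters, a trailing `$` anchors the end of the URL, every other character (including `?`) is treated literally, and a rule matches as a prefix of the path. The function names and the sample paths are made up for illustration; this is not Google's parser.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed rules (Google-style): '*' is a wildcard, a trailing '$'
    anchors the end of the URL, and everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    # re.match anchors at the start of the string, which gives the prefix match.
    return robots_pattern_to_regex(pattern).match(path) is not None


# Hypothetical paths, chosen only to exercise the three patterns.
sample_paths = [
    "/url/",
    "/url/?colour=red",
    "/url/page.html",
    "/url/page.html?p=2",
]

for pattern in ["/url/?*", "/url/?", "/url/*"]:
    blocked = [p for p in sample_paths if is_blocked(pattern, p)]
    print(pattern, "->", blocked)
```

Under these assumptions, /url/?* and /url/? select the same paths (only those where a query string directly follows the trailing slash), because a trailing `*` adds nothing to a prefix match. /url/* matches everything under /url/. All three are valid patterns, which may be why the Google tester did not flag them.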
Related Questions
-
Brand name not ranking in Google
Hi Moz'ers, Could you help me with something I cannot seem to figure out by myself. In June 2017 my company started a rebranding campaign. We've changed our brand name and launched a new website: https://spotler.com. Everything is going fine, but if you Google our brand name "Spotler" our website doesn't show up. How can it be? Our domain authority is 38. It would be wonderful if you could help me. Let me know if you need more information. Best, Simone
Intermediate & Advanced SEO | | Spotler0 -
Is Google ignoring my canonicals?
Hi, We have rel=canonical set up on our ecommerce site but Google is still indexing pages that have rel=canonical. For example, http://www.britishbraces.co.uk/braces/novelty.html?colour=7883&p=3&size=599 http://www.britishbraces.co.uk/braces/novelty.html?p=4&size=599 http://www.britishbraces.co.uk/braces/children.html?colour=7886&mode=list These are all indexed but all have rel=canonical implemented. Can anyone explain why this has happened?
Intermediate & Advanced SEO | | HappyJackJr0 -
Problem with Google finding our website
We have an issue with Google finding our website: (URL removed) When we google "(keyword removed)" in google.com.au, our website doesn't come up anywhere. This is despite inserting the suitable title tag and onsite copy for SEO. We found this strange, and thought we'd investigate further. We decided to just google the website URL in google.com.au, to see if it was being properly found. Our site appeared at the top but with this description: A description for this result is not available because of this site's robots.txt – learn more. We also can see that the incorrect title tag is appearing. From this, we assumed that there must be an issue with the robot.txt file. We decided to put a new robot.txt file up: (URL removed) This hasn't solved the problem though and we still have the same issue. If someone could get to the bottom of this for us, we would be most appreciative. We are thinking that there may possibly be another robot.txt file that we can't find that is causing issues, or something else we're not sure of! We want to get to the bottom of it so that the site can be appropriately found. Any help here would be most appreciated!
Intermediate & Advanced SEO | | Gavo0 -
Block subdomain directory in robots.txt
Instead of blocking an entire sub-domain (fr.sitegeek.com) with robots.txt, we would like to block one directory (fr.sitegeek.com/blog). 'fr.sitegeek.com/blog' and 'wwww.sitegeek.com/blog' contain the same articles in one language; only the labels are changed for the 'fr' version, and we suppose that the duplicate content causes problems for SEO. We would like the 'www.sitegee.com/blog' articles to be crawled and indexed, not 'fr.sitegeek.com/blog'. So, please suggest how to block a single sub-domain directory (fr.sitegeek.com/blog) with robots.txt. This is only for the blog directory of the 'fr' version; all other directories and pages of the 'fr' version should still be crawled and indexed. Thanks, Rajiv
Intermediate & Advanced SEO | | gamesecure0 -
Low on Google ranking despite error-free!?
Hi all, I'm following up on a recent post I've made about our indexing and especially ranking problems in Google: http://moz.com/community/q/seo-impact-classifieds-website Thanks to all the good comments we managed to get rid of most of our crawl errors, and as a result our high priority / duplicated content decreased from +22k to 270. In short, we created canonical URLs, ran an XML sitemap, used URL parameters in GWT, created an h1 and meta description for each ad posted by users, etc. I then used Google fetch a few times (3 weeks ago and last week), both for the desktop and mobile version, for re-approval. Nothing has really improved in Google rankings for months now (all our core keywords are ranked +50), yet Yahoo and Bing organic traffic went up and is 3x higher than Google's. In the meantime we have been running paid campaigns on Facebook and AdWords for months already to keep traffic consistent, yet this is eating up our budget, even though our CTR and conversion rates are good. I realize we might have to create more content on-site and through social media, but right now our social media traffic is already around 50% and we have recently started using more of Twitter and Google+ as well. Our organic traffic is only 14%, with Google only a third of that. In the end, I believe this breakdown should look more like organic 50%-70% and (paid) social, referral and direct traffic 50%-30%... I can't believe we have been hit by a penalty, although it looks like that is the case, especially while Yahoo and Bing traffic goes up and Google's does not. Should I wait for a signal once our site is "approved" again through GWT fetch? Or am I missing something that I need to check as well to improve these rankings? Thanks for your help! Ivor PS: ask me for additional stats or info in a PM if needed!
Intermediate & Advanced SEO | | ivordg0 -
Should I use meta noindex and robots.txt disallow?
Hi, we have an alternate "list view" version of every one of our search results pages. The list view has its own URL, indicated by a URL parameter. I'm concerned about wasting our crawl budget on all these list view pages, which effectively doubles the number of pages that need crawling. When they were first launched, I had the noindex meta tag placed on all list view pages, but I'm concerned that they are still being crawled. Should I therefore go ahead and also apply a robots.txt disallow on that parameter to ensure that no crawling occurs? Or will Googlebot/Bingbot also stop crawling that page over time? I assume that noindex still means "crawl"... Thanks 🙂
Intermediate & Advanced SEO | | ntcma0 -
Does Google check Whois
Hello everyone, I own quite a lot of websites active in the same niche and sometimes targeting the same keywords; these sites are hosted at different IPs. But they all have the same Whois details. I was wondering if Google checks the Whois data, and if it affects the SERPs? Regards, Yannick
Intermediate & Advanced SEO | | iwebdevnl0 -
Google does not target my website properly!
Hello everyone, My website, www.pentrucadouri.ro, despite being a .ro domain with Romanian content and hosted in Romania, appears to google.ro as an English-targeted website. Google sees the internal pages as Romanian ones but the main page as English. In order to change this, I added : Also, a few days ago I uploaded a geositemap and submitted it to Google. Do you have suggestions? The website ranks 2nd for "cosuri cadou" on google.com and 3rd on Bing, but on google.ro it ranks 11th. Thanks!
Intermediate & Advanced SEO | | VertiStudio0