Robots.txt error
-
I currently have this in my robots.txt file:
User-agent: *
Disallow: /authenticated/
Disallow: /css/
Disallow: /images/
Disallow: /js/
Disallow: /PayPal/
Disallow: /Reporting/
Disallow: /RegistrationComplete.aspx
WebMatrix 2.0
In Google Webmaster Tools > Health > Blocked URLs, I copy and paste the above code and click Test. Everything looks OK, but when I log out and log back in, I see the code below under Blocked URLs:
User-agent: *
Disallow: /
WebMatrix 2.0
Currently, Google isn't indexing my domain, and I don't understand why this is happening. Any ideas?
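For what it's worth, rules like these can be sanity-checked locally before touching Webmaster Tools. Here is a sketch using Python's standard urllib.robotparser; the page paths tested below are just examples, not necessarily real URLs on the site:

```python
# Sanity-check the robots.txt rules from the question locally.
from urllib.robotparser import RobotFileParser

rules = """
User-agent: *
Disallow: /authenticated/
Disallow: /css/
Disallow: /images/
Disallow: /js/
Disallow: /PayPal/
Disallow: /Reporting/
Disallow: /RegistrationComplete.aspx
""".strip()

rp = RobotFileParser()
rp.parse(rules.splitlines())

# The root and ordinary pages should be crawlable...
print(rp.can_fetch("Googlebot", "/"))                    # True
print(rp.can_fetch("Googlebot", "/contact-us.aspx"))     # True
# ...while the listed directories and page should not be.
print(rp.can_fetch("Googlebot", "/authenticated/home"))  # False
print(rp.can_fetch("Googlebot", "/RegistrationComplete.aspx"))  # False
```

If this parser allows "/" but Webmaster Tools still reports `Disallow: /`, the file Google is actually fetching differs from the one you pasted, which points at a server-side problem rather than the rules themselves.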
Thanks
Seda
-
Thanks Irving, it worked
-
Try spidering your site with this link checker tool.
Bots cannot accept cookies, but your site requires cookies to be enabled in order to be visited, so Google cannot access the site. The most likely issue is that you are not allowing the visit unless the cookie is set.
Disable cookies in your browser, clear your cache, and see what happens when you try to visit your site. Are you blocked?
These discussions may help:
http://www.highrankings.com/forum/index.php/topic/3062-cookie-and-javascript/
http://stackoverflow.com/questions/5668681/seo-question-google-not-getting-past-cookies
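For reference, the usual server-level workaround is to skip the cookie requirement for known crawlers. A minimal sketch of the idea in Python follows; the function name and user-agent substrings are illustrative assumptions, not the site's actual code, and real bot verification should also confirm the crawler's IP (e.g. via reverse DNS), since user agents can be spoofed:

```python
# Sketch: let known crawlers through without the cookie gate.
# Substrings are illustrative; spoofable, so verify IPs in production.
KNOWN_BOTS = ("googlebot", "bingbot", "slurp", "duckduckbot")

def requires_cookie(user_agent, has_session_cookie):
    """Return True if the request should be sent to the cookie gate,
    False if it may view the page directly."""
    ua = (user_agent or "").lower()
    if any(bot in ua for bot in KNOWN_BOTS):
        return False  # never gate crawlers
    return not has_session_cookie  # gate humans until the cookie is set

print(requires_cookie("Mozilla/5.0 (compatible; Googlebot/2.1)", False))  # False
print(requires_cookie("Mozilla/5.0 (Windows NT 10.0)", False))            # True
print(requires_cookie("Mozilla/5.0 (Windows NT 10.0)", True))             # False
```

The same check would live in whatever layer currently forces the cookie redirect (an HTTP module or handler on an ASP.NET site like this one).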
-
Thanks Irving. I need a little more help; I'm not quite sure I understand. What is it that needs to be fixed here?
-
I couldn't rely on the SERPs: the website is old and has been indexed for quite a while, so I didn't think the SERP results would change that quickly. I've been receiving the error since yesterday.
It's in the SERPs today, but will it be there tomorrow? I say that because when I change a page title, it doesn't change in the SERPs instantly; it takes a day or so before I see the changes.
-
TECHNICAL ISSUE
It's your cookie policy that's blocking bots from spidering. You need to fix that at the server level. Easy fix!
http://www.positivecollections.co.uk/cookies-policy.aspx
Your robots.txt is fine.
-
Okay. But that doesn't mean it isn't being indexed. Here's a fun test: go to any page on your website and select a string of two or three sentences. Google it. Does the page come up in the SERPs?
(I did this for 3 pages on your site and it worked for all of them. Therefore, your site is being indexed.) Why do you need Fetch?
-
When I click on Fetch as Google, I get a 'Denied by robots.txt' error.
-
That site is also being indexed. Again I ask: what makes you think it isn't being indexed? (Because it is.)
-
When I click on Fetch as Google, I get a 'Denied by robots.txt' error.
@Jesse: That's the main website; we've got other URLs. The error appears on positivecollections.co.uk.
-
Thanks Irving,
www.positivecollections.co.uk is the URL.
I've tried removing everything from the robots file and checking again in Webmaster Tools, and the same thing happened. It's just blocking the main link.
-
Are you sure your site isn't being indexed?
Because I went to your profile, and if http://www.mtasolicitors.com/ is your site, then it is definitely being indexed. What makes you think it isn't?
-
Are you sure there is nothing else in your robots.txt? You can share the URL if you like.
You can delete this line; it's doing nothing, and you don't need to attempt to block bad bots:
WebMatrix 2.0