Robots.txt error
-
I currently have this in my robots.txt file:
User-agent: *
Disallow: /authenticated/
Disallow: /css/
Disallow: /images/
Disallow: /js/
Disallow: /PayPal/
Disallow: /Reporting/
Disallow: /RegistrationComplete.aspx
WebMatrix 2.0
In Webmaster Tools, under Health > Blocked URLs, I copy and paste the above code and click Test. Everything looks OK, but when I log out and log back in, I see the code below under Blocked URLs:
User-agent: *
Disallow: /
WebMatrix 2.0
Currently, Google doesn't index my domain and I don't understand why this is happening. Any ideas?
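A quick way to check whether the server returns the same robots.txt to a crawler as it does to a browser is to request the file with different User-Agent headers and no cookies. A minimal Python sketch, using a placeholder domain (substitute your own):

    import requests

    # Placeholder domain -- substitute the real site.
    URL = "http://www.example.com/robots.txt"

    user_agents = {
        "browser": "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36",
        "crawler": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    }

    for label, ua in user_agents.items():
        # No cookie jar is used, which is also how a crawler arrives.
        resp = requests.get(URL, headers={"User-Agent": ua}, timeout=10)
        print(label, resp.status_code)
        print(resp.text[:200])  # first lines of whatever was actually served

If the two responses differ, Webmaster Tools may be testing a different file than the one you see in your browser.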
Thanks
Seda
-
Thanks Irving, it worked
-
Try to spider your site with this link checker tool
Bots cannot accept cookies, and your site requires cookies to be enabled in order to be visited, so Google cannot access the site. The most likely issue is that you are not allowing the visit unless the cookie is dropped.
Disable cookies in your browser, clear your cache, and see what happens when you try to visit your site. Are you blocked?
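If you'd rather script that test, here is a minimal Python sketch of the same check: no cookies sent, a crawler User-Agent, and redirects left unfollowed so the server's first response is visible:

    import requests

    bot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

    # allow_redirects=False exposes the server's first answer to a cookieless client.
    resp = requests.get(
        "http://www.positivecollections.co.uk/",
        headers={"User-Agent": bot_ua},
        allow_redirects=False,
        timeout=10,
    )
    print(resp.status_code)              # 200 means the page is served directly
    print(resp.headers.get("Location"))  # a cookie/consent URL here confirms the gate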
These discussions may help:
http://www.highrankings.com/forum/index.php/topic/3062-cookie-and-javascript/
http://stackoverflow.com/questions/5668681/seo-question-google-not-getting-past-cookies
-
Thanks Irving, I need a little more help; I am not quite sure I understand. What is it that needs to be fixed here?
-
I couldn't rely on the SERPs, as the website is old and has been indexed for quite a while, so I didn't think SERP results would change that quickly. I've been receiving the error since yesterday.
It's in the SERPs today, but will it be there tomorrow? I say that because when I change a page title, it doesn't change in the SERPs instantly; it takes a day or so before I see the changes.
-
TECHNICAL ISSUE
It's your cookie policy that's blocking bots from spidering. You need to fix that at the server level. Easy fix!
http://www.positivecollections.co.uk/cookies-policy.aspx
Your robots.txt is fine.
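To make "fix it at the server level" concrete, here is a minimal, hypothetical sketch of the logic in Python. The actual site runs ASP.NET, so the real change belongs in its request pipeline; the cookie name and bot list below are assumptions, not the site's actual code:

    # Known crawler substrings to exempt from the cookie gate (illustrative list).
    KNOWN_BOTS = ("googlebot", "bingbot", "slurp")

    def should_gate_on_cookies(user_agent: str, cookies: dict) -> bool:
        """Return True if this visitor should be redirected to the cookie page."""
        ua = (user_agent or "").lower()
        if any(bot in ua for bot in KNOWN_BOTS):
            return False  # never gate crawlers -- they cannot accept cookies
        return "consent" not in cookies  # assumed cookie name; gate humans without it

    # Quick check of the behaviour:
    print(should_gate_on_cookies("Mozilla/5.0 (compatible; Googlebot/2.1)", {}))  # False
    print(should_gate_on_cookies("Mozilla/5.0 (Windows NT 6.1)", {}))             # True

Note that User-Agent strings can be spoofed, so production setups usually verify genuine Googlebot traffic with a reverse-DNS lookup rather than trusting the header alone.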
-
Okay. But that doesn't mean it isn't being indexed. Here's a fun test: Go to any page on your website and select a string of two or three sentences. Google it. Does the page come up in the SERPs?
(I did this to 3 pages on your site and it worked for all of them. Therefore, your site is being indexed.) Why do you need to Fetch?
-
When I click on Fetch as Google, I get a 'Denied by robots.txt' error.
-
That site is also being indexed. Again I ask, what makes you think it is not being indexed? (cause it is)
-
When I click on Fetch as Google, I get a 'Denied by robots.txt' error.
@Jesse: That's the main website; we've got other URLs. The error appears on positivecollections.co.uk.
-
Thanks Irving,
www.positivecollections.co.uk is the URL.
I've tried removing everything from the robots file and checking again in Webmaster Tools; the same thing happened. It's just blocking the main link.
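For reference, Python's built-in robots.txt parser is a quick way to double-check what the live file actually permits; a minimal sketch:

    from urllib.robotparser import RobotFileParser

    # Fetch and parse the live robots.txt the way a crawler would.
    rp = RobotFileParser()
    rp.set_url("http://www.positivecollections.co.uk/robots.txt")
    rp.read()

    for path in ["/", "/authenticated/", "/RegistrationComplete.aspx"]:
        url = "http://www.positivecollections.co.uk" + path
        print(path, "->", rp.can_fetch("Googlebot", url))

One caveat: if the server answers the cookieless request for robots.txt itself with a 401 or 403, this parser treats the whole site as disallowed, which would mirror the 'Denied by robots.txt' result from Fetch as Google.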
-
Are you sure your site isn't being indexed?
Because I went to your profile, and if http://www.mtasolicitors.com/ is your site, then it is definitely being indexed. What makes you think it isn't?
-
Are you sure there is nothing else in your robots.txt? You can share the URL if you like.
You can delete this line; it's doing nothing, and you don't need to attempt to block bad bots:
WebMatrix 2.0