Robot.txt error
-
I currently have this under my robot txt file:
User-agent: *
Disallow: /authenticated/
Disallow: /css/
Disallow: /images/
Disallow: /js/
Disallow: /PayPal/
Disallow: /Reporting/
Disallow: /RegistrationComplete.aspxWebMatrix 2.0
On webmaster > Health Check > Blocked URL
I copy and paste above code then click on Test, everything looks ok but then logout and log back in then I see below code under Blocked URL:
User-agent: *
Disallow: /
WebMatrix 2.0
Currently, Google doesn't index my domain and i don't understand why this happening. Any ideas?
Thanks
Seda
-
Thanks Irving, it worked
-
Try to spider your site with this link checker tool
bots cannot accept cookies and your site is requiring cookies to be enabled in order to be visited so Google cannot access the site because you are not allowing the visit without the cookie being dropped is most likely the issue.
Disable cookies on your browser and clear your cache and see what happens when you try to visit your site, are you blocked?
These discussions may possibly help
http://www.highrankings.com/forum/index.php/topic/3062-cookie-and-javascript/
http://stackoverflow.com/questions/5668681/seo-question-google-not-getting-past-cookies
-
Thanks Irving, I need a little more help, I am not quite sure if I understand it. What is it that needs to be fixed here?
-
I couldn't relay on SERPS as the website is old, it's been indexed for quite so i didn't think that SERP results would change that quick. I've been receiving the error since yesterday.
It's on SERPS today but would it be there tomorrow? The reason I am saying that is because when i change the Page Title, it doesnt get changed on SERPS instantly, it takes a day or so before i see the changes on SERPS.
-
TECHNICAL ISSUE
It's your cookie policy blocking bots from spidering. Need to fix that at the server level. Easy fix!
http://www.positivecollections.co.uk/cookies-policy.aspx
Your robots.txt is fine.
-
Okay. But that doesn't mean it isn't being indexed. Here's a fun test: Go to any page on your website and select a string of two or three sentences. Google it. Does the page come up in the SERPs?
(I did this to 3 pages on your site and it worked for all of them. Therefore, your site is being indexed.) Why do you need to Fetch?
-
When I click on Fetch As Google, i get 'Denied by robots.txt'' error.
-
That site is also being indexed. Again I ask, what makes you think it is not being indexed? (cause it is)
-
When I click on Fetch As Google, i get 'Denied by robots.txt'' error.
@Jesse: That's the main website, we've got other URLs.Error appears on positivecollections.co.uk
-
Thanks Irving,
www.positivecollections.co.uk is the url
I've tried to remove everything from the robot file and check again on webmaster, same thing happened It's just blocking the main link
-
Are you sure your site isn't being indexed?
Cause I went to your profile and if http://www.mtasolicitors.com/ is your site, then it is definitely being indexed.. What makes you think it isn't?
-
Are you sure there is nothing else in your robots.txt - you can share the url if you like
You can delete this it's doing nothing and don't need to attempt to block bad bots
WebMatrix 2.0
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Set Robots.txt file to crawl my website at specific times
Our website provider has stated that they can only 'lift' their block on our website in order for it to be crawled as specific times. Is there any way to amend a robots.txt to ensure that it crawls our website at a specific time of day/night in order to coincide with the block being lifted? Many Thanks, Charlene
Intermediate & Advanced SEO | | CharleneKennedy120 -
Syndicated content with meta robots 'noindex, nofollow': safe?
Hello, I manage, with a dedicated team, the development of a big news portal, with thousands of unique articles. To expand our audiences, we syndicate content to a number of partner websites. They can publish some of our articles, as long as (1) they put a rel=canonical in their duplicated article, pointing to our original article OR (2) they put a meta robots 'noindex, follow' in their duplicated article + a dofollow link to our original article. A new prospect, to partner with with us, wants to follow a different path: republish the articles with a meta robots 'noindex, nofollow' in each duplicated article + a dofollow link to our original article. This is because he doesn't want to pass pagerank/link authority to our website (as it is not explicitly included in the contract). In terms of visibility we'd have some advantages with this partnership (even without link authority to our site) so I would accept. My question is: considering that the partner website is much authoritative than ours, could this approach damage in some way the ranking of our articles? I know that the duplicated articles published on the partner website wouldn't be indexed (because of the meta robots noindex, nofollow). But Google crawler could still reach them. And, since they have no rel=canonical and the link to our original article wouldn't be followed, I don't know if this may cause confusion about the original source of the articles. In your opinion, is this approach safe from an SEO point of view? Do we have to take some measures to protect our content? Hope I explained myself well, any help would be very appreciated, Thank you,
Intermediate & Advanced SEO | | Fabio80
Fab0 -
Intermittent DNS errors. IP team not able to diagnose
Intermittent DNS errors showing up in GSC for our fashion portal www.AJIO.com. Our IP team doesn't find any issues at our end. Everytime i write to them, they come back saying 'DNS is resolving fine in all servers'. How do we resolve this? Pl help
Intermediate & Advanced SEO | | AJIOreliance0 -
News Errors In Google Search Console
Years ago a site I'm working on was publishing news as one form of content on the site. Since then, has stopped publishing news, but still has a q&a forum, blogs, articles... all kinds of stuff. Now, it triggers "News Errors" in GWT under crawl errors. These errors are "Article disproportionately short" "Article fragmented" on some q&a forum pages "Article too long" on some longer q&a forum pages "No sentences found" Since there are thousands of these forum pages and it's problem seems to be a news critique, I'm wondering what I should do about it. It seems to be holding these non-news pages to a news standard: https://support.google.com/news/publisher/answer/40787?hl=en For instance, is there a way and would it be a good idea to get the hell out of Google News, since we don't publish news anymore? Would there be possible negatives worth considering? What's baffling is, these are not designated news urls. The ones we used to have were /news/title-of-the-story per... https://support.google.com/news/publisher/answer/2481373?hl=en&ref_topic=2481296 Or, does this really not matter and I should just blow it off as a problem. The weird thing is that we recently went from http to https and The Google News interface still has us as http and gives the option to add https, which I am reluctant to do sine we aren't really in the news business anymore. What do you think I should do? Thanks!
Intermediate & Advanced SEO | | 945010 -
Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
my site uses is set up at http://www.site.com I have my site redirected from non- www to the www in htacess file. My question is... what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this? Or do you leave it blank? User-agent: * Disallow: / Sitemap: http://www.morganlindsayphotography.com/sitemap.xml Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
Intermediate & Advanced SEO | | morg454540 -
404 errors
Hi, we have plenty of 404 errors. We just deal with those that are of the highest priority (the ones that have high page authority). We have also a lot of errors like this: http://www.weddingrings.com/www.yoy-search.com . Does it make sense to redirect those to the home page or leave them as an 404 error?
Intermediate & Advanced SEO | | alexkatalkin0 -
Robots.txt: Syntax URL to disallow
Did someone ever experience some "collateral damages" when it's about "disallowing" some URLs? Some old URLs are still present on our website and while we are "cleaning" them off the site (which takes time), I would like to to avoid their indexation through the robots.txt file. The old URLs syntax is "/brand//13" while the new ones are "/brand/samsung/13." (note that there is 2 slash on the URL after the word "brand") Do I risk to erase from the SERPs the new good URLs if I add to the robots.txt file the line "Disallow: /brand//" ? I don't think so, but thank you to everyone who will be able to help me to clear this out 🙂
Intermediate & Advanced SEO | | Kuantokusta0 -
Robots.txt & url removal vs. noindex, follow?
When de-indexing pages from google, what are the pros & cons of each of the below two options: robots.txt & requesting url removal from google webmasters Use the noindex, follow meta tag on all doctor profile pages Keep the URLs in the Sitemap file so that Google will recrawl them and find the noindex meta tag make sure that they're not disallowed by the robots.txt file
Intermediate & Advanced SEO | | nicole.healthline0