Search Engine blocked by robots.txt
-
I am getting this error when I try to crawl http://photosales.belfasttelegraph.co.uk/, but my robots.txt file does not block any bots.
-
Hi Judith,
If you're still having problems, send an email to help@seomoz.org and they'll be able to help you figure out why Roger doesn't want to crawl your site.
-
I am using the Bing SEO Toolkit and it worked fine for me.
Try removing robots.txt for a test.
-
the SEOmoz crawler
-
What are you using to crawl?
-
I'm not sure why it let you crawl it and not me?
-
I just crawled it OK.
But you don't need to put the user-agent in twice; the version below will do.
It was blocking a lot of pages, but it crawled the ones you allowed:

User-agent: *
Disallow: */wo/
Disallow: */ajax/
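For anyone who wants to sanity-check rules like these before a crawl, here is a minimal sketch using only Python's standard library. One caveat: `urllib.robotparser` follows the original robots.txt convention and does not implement the `*` wildcard extension used above, so the rules are shown here in simplified prefix-only form, and the host and paths are illustrative stand-ins:

```python
# Minimal sketch: test which URLs a robots.txt file blocks for a given bot.
# Caveat: urllib.robotparser matches Disallow values as plain path prefixes
# and does not support the "*" wildcard, so the rules below are simplified
# prefix versions of the ones in the thread. Host and paths are illustrative.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /wo/
Disallow: /ajax/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Pages under the disallowed prefixes are blocked; everything else is crawlable.
print(parser.can_fetch("rogerbot", "http://example.com/wo/print.html"))  # False
print(parser.can_fetch("rogerbot", "http://example.com/photos/123"))     # True
```

Testing against the actual bot's user-agent string (here "rogerbot") matters, since a file can allow one crawler while blocking another.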
Related Questions
-
Phantom search results showing up in analytics
Hi, I'm getting these weird phantom search results showing up in my analytics account, which is skewing my results. I have a WordPress site, and I do have some custom search results pages, but analytics is telling me these hits are coming from Google, which I doubt. The end of the query string (&cat=plus-5-results) is not coming from a link created within my site. Anyone have any ideas? Failing that, would somebody be kind enough to explain how I can ignore these hits but retain other, valid search result requests? Thanks for reading! Will
Reporting & Analytics | madegood
-
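Since the suspicious hits all carry the same query-string marker, one option is to flag them programmatically in exported landing-page data before filtering (a sketch; the helper name and sample paths are mine, and the equivalent inside GA itself would be an exclude filter or segment on the `cat` parameter):

```python
# Sketch: flag landing pages carrying the "cat=plus-5-results" marker from
# the question, so they can be excluded while keeping valid search pages.
from urllib.parse import urlparse, parse_qs

def is_phantom(landing_page: str) -> bool:
    params = parse_qs(urlparse(landing_page).query)
    return params.get("cat") == ["plus-5-results"]

print(is_phantom("/search?q=widgets&cat=plus-5-results"))  # True
print(is_phantom("/search?q=widgets"))                     # False
```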
404 Status Codes in Google Search Console
Hi all, I've noticed the following in Google Search Console under 'Crawl errors':
1. Why does the status code '410' come up as an 'error' in the crawl report?
2. Why are some articles labelled as '404' errors when they have been completely deleted and should be '410'? There are roughly 1,000 to 2,000 of these.
Thanks!
Reporting & Analytics | lucwiesman
-
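To confirm what a deleted article actually returns (the distinction matters because Google has said it treats 410 as a slightly stronger removal signal than 404, though both land in the same "not found" bucket), a quick check can be sketched with the standard library; the function name is mine:

```python
# Sketch: fetch a URL and report the real HTTP status code, including
# error codes such as 404 and 410 that urlopen raises as exceptions.
import urllib.error
import urllib.request

def status_of(url: str) -> int:
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

Running `status_of` against a handful of the deleted article URLs shows whether the CMS is actually emitting 410 or silently falling back to 404.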
Are these Search Console crawl errors a major concern to new client site?
We recently (4/1) went live with a new site for a client of ours. The client site was originally Point2 before they made the switch to a template site with Real Estate Webmasters. Now when I look into Search Console I am getting the following crawl errors:
111 Server Errors (photos)
104 Soft 404s (blogs, archives, tags)
6,229 Not Found (listings)
I have a few questions. I don't know much about the server errors, so I generally ignore them. My main concerns are the 404s and the not-found errors. The 404s are mostly tags and blog archives, which I wonder if I should leave alone or 301 each to /blog. The not-found errors are all the previous listings from the IDX. My assumption is these will naturally fall away after some time, as the new ones have already been indexed. But I wonder what I should be doing here and which of these will be affecting me. When we launched the new site there was a large spike in clicks (250% increase), which has now tapered off to an average of ~85 clicks versus ~160 at time of launch. Not sure if the crawl errors have any effect; I'm guessing not so much right now. I'd appreciate your insights, Mozzers!
Reporting & Analytics | localwork
-
No Search Data in Google Search Console (Search Analytics)
Wondering if anyone has experienced 0 data in their Google Search Console (Search Analytics) and found possible solutions to retrieve data? We have a few clients who, prior to the update to Google Search Console, were getting data regularly in terms of the Search Queries report, but ever since the update to Google Search Console, they are no longer receiving data. As an FYI, both the www and non-www versions of the website are verified in Search Console and the XML Sitemaps and Robots.txt files are clean, tested and working fine. Any insights or experience of sites showing 0 data in Search Analytics? Any possible solutions would be greatly appreciated. Thanks!
Reporting & Analytics | SEO5Team
-
800,000 pages blocked by robots...
We made some mods to our robots.txt file, adding many PHP and HTML pages that should not have been indexed. Well, not sure what happened, or if there was some type of dynamic conflict with our CMS and one of these pages, but in a few weeks we checked Webmaster Tools and, to our great surprise and dismay, the number of pages blocked by robots.txt was up to about 800,000 out of the 900,000 or so we have indexed.
1. So, first question: has anyone experienced this before? I removed the files from robots.txt and the number of blocked files has still been climbing. I changed the robots.txt file on the 27th. It is the 29th, and the new robots.txt file has been downloaded, but the blocked-pages count has been rising in spite of it.
2. I understand that even if a page is blocked by robots.txt it can still show up in the index, but does anyone know how being blocked affects ranking? I.e., while a page might still show up even though it has been blocked, will Google show it at a lower rank because it was blocked by robots.txt?
Our current robots.txt just says:

User-agent: *
Disallow:

Sitemap: oursitemap

Any thoughts? Thanks! Craig
Reporting & Analytics | TheCraig
-
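As a side note on the file quoted in that question: an empty `Disallow:` value means "allow everything", so that file on its own should not block any pages at all, which can be verified with Python's standard-library parser (the host is illustrative):

```python
# Sketch: an empty "Disallow:" value is treated as "allow all", so a file
# consisting only of these lines blocks no URLs for any crawler.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())
print(parser.can_fetch("Googlebot", "http://example.com/any/page.html"))  # True
```

If Webmaster Tools still reports rising blocked counts with a file like this live, the stale rules are more likely a reporting lag than anything the current file is doing.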
Nofollow page is being reported as a landing page for organic search in Google Analytics
One of my client's websites includes a series of pages for an enrollment process. All of these pages are blocked by robots.txt. In Google Analytics these pages are showing data as landing pages for organic search traffic, and have been for quite some time. There was recently a surge of organic search traffic landing on one of these pages, coming from multiple search engines. The pages appear to be blocked and I'm not finding any of them in the search results for the keywords that are being reported in GA or by searching for the url. Does anyone have any insights into why this might be happening?
Reporting & Analytics | rgibson100
-
What impact will Google's 10/18/2011 announcement of 'Making Search More Secure' have on the ability to track specific keyword queries via Analytics?
The full announcement is here: http://googleblog.blogspot.com/2011/10/making-search-more-secure.html My concern is that the ability for Google Analytics to parse information on specific keyword queries will be diminished. The article hints that Google Webmaster Tools will be exempt from the problem, and I've never relied on Webmaster tools as a go-to for tying specific keyword queries to Goal Tracking (form submissions and sales). The community's thoughts on this one are appreciated. 🙂
Reporting & Analytics | MKR_Agency
-
Search within search? Weird Google URLs
Good morning/afternoon, how are you guys doing today? I'm experiencing a few Panda issues I'm trying to fix, and I was hoping I could get some help here with one of my problems. I used Google Analytics to extract the pages people land on after a Google search, trying to identify thin pages that potentially harm my website as a whole. It turns out I have a bunch of pages like the following: /search?cd=15&hl=en&ct=clnk&gl=uk&source=www.google.co.uk, and so on for a bunch of countries (.fi, .com, .sg, .pk, and so on, maybe 50 of them).
My question is: what are those pages? Their stats are awful, usually 1 visitor, 100% bounce rate, and 0 links. Do you think they can explain my dramatic drop in traffic following Panda? If so, what should I do with them? NOINDEX? Deletion? What would you suggest?
I also have a lot of links like the following: /google-search?cx=partner-pub-6553421918056260:armz8yts3ql&cof=FORID:10&ie=ISO-8859-1&sa=Search&siteurl=www.mysite.com/content/article They lead to custom search pages. What should I do with them?
Almost two weeks ago, Dr. Pete posted an article entitled Fat Panda and Thin Content in which he deals with "search within search" and how such pages might be targeted by Panda. Do you think this is the issue I'm facing? Any suggestion/help would be much appreciated! Thanks a lot and have a great day 🙂
Reporting & Analytics | Ericc22
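For what it's worth, the first batch of URLs in that question carry what look like Google cache parameters (`ct=clnk` in particular has historically appeared on cached-page URLs), so one approach is to bucket them in the exported data before deciding what to noindex. A sketch; the helper name is mine and the `ct=clnk` heuristic is an assumption, not a documented rule:

```python
# Sketch: group the "/search?...ct=clnk..." landing pages seen in analytics.
# Heuristic only: ct=clnk is the parameter historically seen on Google
# cached-page URLs, so these hits likely come from cache views, not the site.
from urllib.parse import urlparse, parse_qs

def looks_like_google_cache(path: str) -> bool:
    parsed = urlparse(path)
    params = parse_qs(parsed.query)
    return parsed.path == "/search" and params.get("ct") == ["clnk"]

print(looks_like_google_cache(
    "/search?cd=15&hl=en&ct=clnk&gl=uk&source=www.google.co.uk"))  # True
print(looks_like_google_cache("/google-search?cx=partner-pub"))    # False
```

The second kind of URL (the /google-search?cx=partner-pub... custom search pages) fails this test, so the two groups can be handled separately.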