Roger bot taking a long time to crawl site
-
Hi all, I've noticed Roger bot is taking a long time to crawl my new site. It started on the 28th Feb 2013 and is still going. There aren't many pages at the moment. Any ideas please?
thanks a lot, Mark.
-
Hi Peter
thanks for your reply. The crawl has now completed and given me some more areas to work on; it's a great tool.
I was so preoccupied with 'hiding' the site over the last couple of months with the easy code:
User-agent: *
Disallow: /
I hadn't thought beyond this.
I've noticed Google has now recognised the new robots.txt, which has allowed the sitemap to be accepted.
I'll look at your notes, thank you, and work out my next move. I'll let you know how I get on too.
I know (well think) I have to get noindex, follow for 'sorted' category pages...
all the best, Mark.
-
Hi Mike
The crawl has now completed, thank you. I think the results will keep me occupied.
all the best, Mark.
-
Hi Mark,
Sorry it's taking a while to crawl your new site.
While I'm not exactly sure what is causing the delay, one possible reason is your robots.txt. Here's a short snippet of what I see in it:
# Crawlers Setup
User-agent: *
Crawl-delay: 30
# Allowable Index
Allow: /*?p=
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /media/
# Directories
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Disallow: /lib/
Disallow: /magento/
Disallow: /pkginfo/
Disallow: /report/
From here, the formatting looks a little awkward. What's going on is that you're telling Roger bot to only look at these:
# Allowable Index
Allow: /*?p=
Allow: /index.php/blog/
Allow: /catalog/seo_sitemap/category/
Allow: /catalogsearch/result/
Allow: /media/
While the syntax is OK, not every crawler out there will follow the Allow directive. Here's an example of something you can use.
# Crawlers Setup
User-agent: *
Crawl-delay: 30
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
Here you're telling the crawler to disallow nothing except these directories. Please let us know, once you implement this, whether it actually fixes the crawl.
Thanks for reaching out!
Best,
Peter Li
SEOmoz Help Team
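For anyone who wants to check rules like these before a crawler sees them, here is a minimal sketch using Python's standard `urllib.robotparser`. It compares the "hide the site" file from earlier in the thread with a directory-only file of the kind described above (assuming that file contains no blanket `Disallow: /`). The `www.example.com` URLs, the `rogerbot` user-agent string, and the helper names `load_rules` and `min_crawl_minutes` are assumptions made for illustration. Python's parser also does not necessarily match how Roger bot or Google interpret a file, so treat the result as a sanity check rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# The "hide the site" file: blocks every path for every crawler.
HIDE_SITE = """\
User-agent: *
Disallow: /
"""

# A directory-only file: blocks the listed directories, allows the rest.
DIRECTORIES_ONLY = """\
# Crawlers Setup
User-agent: *
Crawl-delay: 30
Disallow: /404/
Disallow: /app/
Disallow: /cgi-bin/
Disallow: /downloader/
Disallow: /errors/
Disallow: /includes/
Disallow: /js/
"""


def load_rules(text):
    """Parse robots.txt text (hypothetical helper) and return the parser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def min_crawl_minutes(pages, delay_seconds):
    """Rough lower bound on crawl time if the crawler waits the full
    Crawl-delay between requests (an assumption about crawler behaviour)."""
    return pages * delay_seconds / 60


if __name__ == "__main__":
    hidden = load_rules(HIDE_SITE)
    opened = load_rules(DIRECTORIES_ONLY)

    # example.com and the paths below are placeholders, not the real site.
    for url in ("http://www.example.com/",
                "http://www.example.com/some-product.html",
                "http://www.example.com/app/etc/"):
        print(url,
              "hidden file:", hidden.can_fetch("rogerbot", url),
              "directory-only file:", opened.can_fetch("rogerbot", url))

    delay = opened.crawl_delay("rogerbot")
    print("Crawl-delay:", delay)
    print("Minutes for 100 pages at that delay:", min_crawl_minutes(100, delay))
```

If a crawler honours `Crawl-delay: 30` literally, the delay alone puts a floor under crawl time even on a small site, so that value is worth a look alongside the Allow and Disallow rules.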
-
Hi Mark,
This sounds like a bug or issue with the SEOmoz software.
Contact help@seomoz.org and ask one of the help associates to look into this for you.
If you do not have many pages, it definitely shouldn't take that long.
The help team responds extremely quickly!
Good luck.
Mike