High Number of Crawl Errors for Blog
-
Hello All,
We have been having an issue with very high crawl errors on websites that contain blogs. Here is a screenshot of one of the sites we are dealing with: http://cl.ly/image/0i2Q2O100p2v .
Looking through the links that are turning up in the crawl errors, the majority of them (roughly 90%) are auto-generated by the blog's system. This includes category/tag links, archived links, etc. A few examples being:
http://www.mysite.com/2004/10/
http://www.mysite.com/2004/10/17/
As far as I know (please correct me if I'm wrong!), search engines will not penalize you for things like this that appear on auto-generated pages. Also, even if search engines did penalize you, I do not believe we can make a unique meta tag for auto-generate pages. Regardless, our client is very concerned seeing these high number of errors in the reports, even though we have explained the situation to him.
Would anyone have any suggestions on how to either 1) tell Moz to ignore these types of errors or 2) adjust the website so that these errors now longer appear in the reports?
Thanks so much!
- Rebecca
-
Hi Rebecca
What are the crawl errors exactly? From that report screenshot it looks like you have a variety of them, so the fixes will all be different.
Let me know, and in the meantime you might want to check out my article on Moz about setting up WordPress
-Dan
-
It is true that you will most likely not be penalized for these pages, Google is pretty good at figuring out common canonicalization problems in my opinion and would most likely not penalize you for having duplicate content. I would encourage you to dig a little deeper and see what additional problems these pages could create though.
Consider that Google will waste valuable crawl bandwidth crawling these meaningless pages, rather than focusing on the important content you want them too. If Google is crawling them, you can most likely bet that PageRank is flowing through these pages as well, diluting the link equity of your site.
Are you using Wordpress? There are a lot of great plug ins that can help you manage these pages. You could control how Google crawls these pages with your robots.txt, by placing meta robots tags on the pages using a plug in, or by placing rel=canonical tags on the pages pointing back to the page that is the original source.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
When I crawl my site On Moz it says it can't access the robots.txt file, but crawl is fine on SEM Rush - Anyone know any reason for this?
Hi guys, When I try to run a site crawl on Moz it returns an error saying that it has failed due to an error with the robots.txt file. However, my site can be crawled by SEM Rush with no mention of problems with roots.txt file issues. My developer has looked into it and insists their is no problem with my robots.txt and I've tried the Moz crawl at least 6 times over an 8 week period. Has anyone ever seen such a large discrepancy between Moz and SEM Rush or have any ideas why Moz has this issue with my site?? TIA everyone
Getting Started | | Webreviewadmin0 -
Moz Not Crawling Angular SPA
I have a client that just launched a redesigned website using Angular as a single page app. Google appears to be able to crawl the site just fine, but Moz crawl is only finding one page. We have updated the htaccess to allow for Rogerbot and Dotbot, but still unable to crawl any pages other than the home page. Does anyone have experience with this or ideas of why it won't crawl all pages, and how to allow for Moz to crawl all pages? There is a sitemap with approx. 390 pages. Thanks!
Getting Started | | PIN_Celler1 -
We recently switched from HTTP to HTTPS and we are having crawling issues!
We switched our website from HTTP to HTTPS and we started to get an email from Moz about the robots.txt being unable to crawl our website. The website is hosted through wordpress but we haven't had any issues until we switched. We have no idea what to do or even what the problem is! If you have had a similar problem and fixed it, we need your help! Thank you.
Getting Started | | DrInfinity0 -
Crawl issues, how to see a referring link?
Hi There, We've got two crawl issues for pages that don't exist (and never existed). The links are strange and judging by the code in them, appear to be coming from our own CMS. How can we see which pages the links are on in Moz? Cheers Ben
Getting Started | | cmscss0 -
Error Code 612: Error response for robots.txt
Hi, We are getting Error Code 612: Error response for robots.txt in our crawl but everything looks to be ok with the robots file. Can you confirm what is wrong? Thanks
Getting Started | | david.weston0 -
My website does not allow all crawler to crawl, Now my question is that whether i need to give permission to moz crawler if yes then whaat is moz bot name?
My website does not permit all crawler to crawl website. Whether ii need to give permission to moz bot to crawl website or not? If yes what is the moz bot name?
Getting Started | | irteam0 -
Crawl Diagnostics
Hello Experts, today i was analyse one of my website with moz and get issue overview and get total 212 issue 37 high all derive to this same url http://blogname.blogspot.com/search?updated-max=2013-10-30T17:59:00%2B05:30&max-results=4&reverse-paginate=true so can anyone help me how to find this url and remove all high priority error. and even on page website get A grade then why not performing well in SE ?
Getting Started | | JulieWhite0 -
MOZ Starter Crawl not happeneing
Hi I added a new site 48hours + ago and the starter crawler has not even begun collecting data. Any help would be appreciated. cheers Isaac
Getting Started | | sodafizz0