High Number of Crawl Errors for Blog
-
Hello All,
We have been having an issue with a very high number of crawl errors on websites that contain blogs. Here is a screenshot of one of the sites we are dealing with: http://cl.ly/image/0i2Q2O100p2v .
Looking through the links that are turning up in the crawl errors, the majority of them (roughly 90%) are auto-generated by the blog's system. This includes category/tag links, archived links, etc. A few examples being:
http://www.mysite.com/2004/10/
http://www.mysite.com/2004/10/17/
As far as I know (please correct me if I'm wrong!), search engines will not penalize you for things like this that appear on auto-generated pages. Also, even if search engines did penalize you, I do not believe we can make a unique meta tag for auto-generated pages. Regardless, our client is very concerned about seeing this high number of errors in the reports, even though we have explained the situation to him.
Would anyone have any suggestions on how to either 1) tell Moz to ignore these types of errors or 2) adjust the website so that these errors no longer appear in the reports?
Thanks so much!
- Rebecca
-
Hi Rebecca
What are the crawl errors exactly? From that report screenshot it looks like you have a variety of them, so the fixes will all be different.
Let me know, and in the meantime you might want to check out my article on Moz about setting up WordPress.
-Dan
-
It is true that you will most likely not be penalized for these pages. In my opinion, Google is pretty good at figuring out common canonicalization problems and would most likely not penalize you for having duplicate content. I would encourage you to dig a little deeper, though, and see what additional problems these pages could create.
Consider that Google will waste valuable crawl bandwidth crawling these meaningless pages rather than focusing on the important content you want it to. If Google is crawling them, you can most likely bet that PageRank is flowing through these pages as well, diluting the link equity of your site.
Are you using WordPress? There are a lot of great plugins that can help you manage these pages. You could control how Google crawls these pages with your robots.txt, by placing meta robots tags on the pages using a plugin, or by placing rel=canonical tags on the pages pointing back to the page that is the original source.
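To make the meta robots option a bit more concrete, here is a rough Python sketch of the decision a plugin would be making for you. It assumes the auto-generated archives follow the /YYYY/MM/ and /YYYY/MM/DD/ pattern from the example URLs in the question. The function names and the choice of "noindex, follow" are my own assumptions for illustration, not something taken from Moz or from any particular WordPress plugin.

```python
import re
from urllib.parse import urlparse

# Assumption: date archives are paths made up only of a year,
# an optional month, and an optional day, e.g. /2004/10/ or /2004/10/17/.
DATE_ARCHIVE = re.compile(r"^/\d{4}(/\d{1,2}){0,2}/?$")


def is_date_archive(url: str) -> bool:
    """Return True if the URL path looks like an auto-generated date archive."""
    return bool(DATE_ARCHIVE.match(urlparse(url).path))


def meta_robots_tag(url: str) -> str:
    """Return the meta robots tag to emit for a URL (hypothetical helper).

    Date archives get "noindex, follow" so they stay out of the index
    while their links can still be followed. Other pages get no tag.
    """
    if is_date_archive(url):
        return '<meta name="robots" content="noindex, follow">'
    return ""


if __name__ == "__main__":
    for url in (
        "http://www.mysite.com/2004/10/",
        "http://www.mysite.com/2004/10/17/",
        "http://www.mysite.com/2004/10/17/my-post/",
    ):
        print(url, "->", meta_robots_tag(url) or "(no tag)")
```

If your post permalinks also start with the date, a path-only check like this keeps the posts themselves indexable, which a blanket robots.txt rule on the year folder would not.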