7,608 High Priority Crawl Diagnostic problems
-
Hey There,
I have an e-commerce site that is showing 7,608 high-priority issues to fix - 7,536 of them are duplicate content. What's the most effective process to start with?
I'm open to outsourcing some of the work to an expert - email me on dave@emanbee.com
Thanks for your time,
Dave
-
Cheers Kate.
From doing more reading, Moz/Google treat thin content (300 words or fewer) or webpages where 95% of the HTML code is the same as duplicate content. That will account for the majority of what is showing in my crawl diagnostics.
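As a rough illustration of that 300-word guideline, here is a minimal Python sketch for flagging potentially thin pages, assuming you already have each page's raw HTML. The threshold and the bare-bones parsing are simplifications for illustration, not Moz's actual check:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text, skipping script and style blocks."""
    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def visible_word_count(html):
    """Count words in the visible text of an HTML document."""
    parser = TextExtractor()
    parser.feed(html)
    return len(" ".join(parser.parts).split())

page = "<html><body><p>Guitar strings and picks.</p><script>var x=1;</script></body></html>"
if visible_word_count(page) < 300:  # rough thin-content threshold
    print("possible thin content")
```

In practice you would run this across your crawled pages and sort by word count to find the thinnest ones first.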
That means I'm back to your original advice of fixing up duplicate page titles from GWT.
Currently, the canonical tags are generated sitewide through a template function. Without full control over the canonical tag I can't fix or structure things as easily as I'd like, so I will see if a web dev can help out with this. We should be able to output the full URL too.
Thanks again,
Dave
-
Looks like Moz isn't taking the canonical into account; as long as it's there, you're fine. But I'd warn you not to use relative canonical links ( /directory/page/ vs http://www.domain.com/directory/page/) - link to the full absolute URL. I've seen relative canonicals go wrong in the past. It's not causing issues now, but it could in the future.
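If the template only has the relative path on hand, building the absolute URL is straightforward. A minimal sketch using Python's standard library - the base URL here is just the site from this thread, used as a stand-in for your own domain:

```python
from urllib.parse import urljoin

BASE = "http://www.mooloolabamusic.com.au"  # example site root; swap in your own

def absolute_canonical(path):
    """Turn a relative canonical path into a full absolute URL."""
    return urljoin(BASE, path)

# The template would then emit: <link rel="canonical" href="...absolute URL...">
print(absolute_canonical("/page/brands"))
# → http://www.mooloolabamusic.com.au/page/brands
```

The same idea applies in whatever templating language the site actually uses: prepend the scheme and host once, centrally, rather than per page.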
-
Hi Katemorris,
Thanks again for getting back to me.
I have started going through and fixing up pages. I'm hoping you can clarify something from MOZ for me?
In Moz > Crawl Diagnostics > Duplicate Page Content (the largest and only high-priority issue listed for me) > the first link in the list > show the duplicate pages.
Below is an example of 4 links that are all showing as duplicates of http://www.mooloolabamusic.com.au/page/brands in the Moz software:
http://www.mooloolabamusic.com.au/live-sound-lighting/lighting/atmospheric-effects/?pr=72-82&rf=pr
http://www.mooloolabamusic.com.au/live-sound-lighting/lighting/atmospheric-effects/?pr=0-72&rf=pr
http://www.mooloolabamusic.com.au/studio-production/?pr=1732-1828&rf=pr
http://www.mooloolabamusic.com.au/studio-production/?pr=1770-1827&rf=pr
Can you please clarify how these pages have duplicate content and how to fix this? There are thousands like this.
When I have a look at them using the Moz toolbar, there is already a canonical tag in the header. So is it not working, is the Moz software not picking it up, or is the site template creating the 'duplicate content'?
Thanks so much for your time,
Dave
-
Start in Google Webmaster Tools or in the Moz crawl. Identify the pages with the same title tag and work through that list; the title tag is usually a good indicator of duplicate content.
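That "group by title tag" step can be scripted against a crawl export. A sketch under the assumption that your export has url and title fields - a real GWT or Moz CSV would be loaded with csv.DictReader, and its column names will differ:

```python
from collections import defaultdict

def duplicate_title_groups(rows):
    """Group URLs by title tag; keep only titles shared by 2+ URLs."""
    groups = defaultdict(list)
    for row in rows:
        # Normalize so 'Studio Production' and 'studio production' match
        groups[row["title"].strip().lower()].append(row["url"])
    return {title: urls for title, urls in groups.items() if len(urls) > 1}

# Hypothetical rows, shaped like a crawl export with 'url' and 'title' columns:
rows = [
    {"url": "/studio-production/?pr=1732-1828", "title": "Studio Production"},
    {"url": "/studio-production/?pr=1770-1827", "title": "Studio Production"},
    {"url": "/page/brands", "title": "Brands"},
]
print(duplicate_title_groups(rows))
```

Sorting the resulting groups by size puts the biggest clusters of duplicates at the top of your to-do list.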
If the content is definitely duplicated, determine whether it's a useful duplicate. If so, use a canonical from the duplicate to the original. If it's duplicated for no real reason, find out how to get rid of the duplicate. The cause can be anything from unnecessary parameters to tag pages, and many more.
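For the unnecessary-parameters case, one common approach is to normalize URLs by stripping the filter parameters before comparing them, so every variant collapses to one canonical URL. A sketch using Python's standard library; the pr and rf parameter names are taken from the example URLs earlier in this thread, and you'd confirm the list against your own site:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Faceted-navigation parameters to drop (from the example URLs in this thread)
FILTER_PARAMS = {"pr", "rf"}

def strip_filter_params(url):
    """Remove filter parameters so parameter variants collapse to one URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in FILTER_PARAMS]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )

print(strip_filter_params(
    "http://www.mooloolabamusic.com.au/studio-production/?pr=1732-1828&rf=pr"))
# → http://www.mooloolabamusic.com.au/studio-production/
```

The stripped URL is also a natural candidate for the canonical tag on each filtered variant.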
You'll start to see trends in the data; try to fix the bigger problems as you see them.