7,608 High Priority Crawl Diagnostic problems
-
Hey There,
I have an e-commerce site that is showing 7,608 high-priority issues to fix, 7,536 of which are duplicate content. What's the most effective process to start with?
I'm open to outsourcing some of the work to an expert - email me on dave@emanbee.com
Thanks for your time,
Dave
-
Cheers Kate.
From more reading, it seems Moz/Google treats thin content (300 words or fewer) and pages that share 95% of the same HTML code as duplicates. That will account for the majority of what is showing in my crawl diagnostics.
That means I'm back to your original advice of fixing up duplicate page titles from GWT (Google Webmaster Tools).
Currently, the canonical tags are generated sitewide through a template function. Without full control over the canonical tag I can't fix or structure things as easily as I'd like, so I will see if a web dev can help out with this. We should be able to use the whole (absolute) link in the tag too.
Thanks again,
Dave
-
It looks like Moz isn't taking the canonical into account. As long as it's there, you're fine. But I'd warn you not to use relative canonical links (/directory/page/ vs http://www.domain.com/directory/page/); link to the whole thing. I've seen this go wrong in the past. It's not causing issues now but could in the future.
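As a rough illustration of the relative vs. absolute point, here is a minimal Python sketch that resolves a canonical href against the URL of the page declaring it. The function names are hypothetical and the URLs are the placeholder ones from the post above; how your template actually builds the tag is not known here, so treat this as a sketch rather than the fix itself.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href: str) -> bool:
    """Return True when the href carries both a scheme and a host."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)


def absolute_canonical(page_url: str, canonical_href: str) -> str:
    """Resolve a (possibly relative) canonical href against the page URL."""
    return urljoin(page_url, canonical_href)


# Placeholder URLs taken from the example in the post above.
page = "http://www.domain.com/directory/page/"
relative = "/directory/page/"

print(is_absolute(relative))               # the relative form has no scheme or host
print(absolute_canonical(page, relative))  # the full URL to put in the tag
```

A check like `is_absolute` could be run over the canonical hrefs the template emits to find the ones that still need the full URL.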
-
Hi Katemorris,
Thanks again for getting back to me.
I have started going through and fixing up pages. I'm hoping you can clarify something from Moz for me.
In Moz > Crawl Diagnostics > Duplicate Page Content (the largest and only high-priority issue listed for me) > the first link in the list > Show the duplicate pages
Below is an example of 4 links that are all showing as duplicates of http://www.mooloolabamusic.com.au/page/brands in the Moz software:
http://www.mooloolabamusic.com.au/live-sound-lighting/lighting/atmospheric-effects/?pr=72-82&rf=pr
http://www.mooloolabamusic.com.au/live-sound-lighting/lighting/atmospheric-effects/?pr=0-72&rf=pr
http://www.mooloolabamusic.com.au/studio-production/?pr=1732-1828&rf=pr
http://www.mooloolabamusic.com.au/studio-production/?pr=1770-1827&rf=pr
Can you please clarify how these pages have duplicate content and how to fix this? There are thousands like this.
When I look at them using the Moz search bar, there is already a canonical tag in the header. Is it not working, is the Moz software not picking it up, or is the site template creating 'duplicate content'?
Thanks so much for your time,
Dave
-
Start in Google Webmaster Tools or in the Moz Crawl. Identify those pages with the same title tag and work through that list. The title tag is usually a good indication of duplicate content.
If the content is definitely duplicate, determine whether it's a useful duplicate. If so, use a canonical from the duplicate to the original. If it's duplicated for no real reason, find out how to get rid of the duplicate. The cause can be anything from unnecessary parameters to tag pages, and many more.
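For the "unnecessary parameters" case, a small Python sketch can show which URL a parameterised duplicate might point its canonical at. It assumes that `pr` and `rf` (the parameters in the URLs quoted earlier in this thread) are filter parameters that do not change the page's core content; that is an assumption to confirm for your own site, and `FILTER_PARAMS` and `canonical_target` are names made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these query parameters only filter a listing and are safe to drop.
FILTER_PARAMS = {"pr", "rf"}


def canonical_target(url: str, drop=FILTER_PARAMS) -> str:
    """Return the URL with the filter parameters (and any fragment) removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


duplicate = "http://www.mooloolabamusic.com.au/studio-production/?pr=1732-1828&rf=pr"
print(canonical_target(duplicate))
```

Any parameter not listed in `FILTER_PARAMS` is kept, so pages that genuinely differ by another parameter still get their own canonical.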
You'll start to see trends in the data; try to fix the bigger problems as you see them.
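The title-tag pass described above can be sketched in Python as a simple grouping step. It assumes you already have (URL, title) pairs from a crawl export; the export format is not specified here, so the sample titles below are invented for illustration and `group_by_title` is a hypothetical helper name.

```python
from collections import defaultdict


def group_by_title(pages):
    """Group URLs by normalised title and return the groups with more than
    one URL, largest group first."""
    groups = defaultdict(list)
    for url, title in pages:
        groups[title.strip().lower()].append(url)
    duplicates = {title: urls for title, urls in groups.items() if len(urls) > 1}
    return sorted(duplicates.items(), key=lambda item: len(item[1]), reverse=True)


# Hypothetical sample data: the titles are made up for this sketch.
pages = [
    ("http://www.mooloolabamusic.com.au/studio-production/?pr=1732-1828&rf=pr", "Studio Production"),
    ("http://www.mooloolabamusic.com.au/studio-production/?pr=1770-1827&rf=pr", "Studio Production"),
    ("http://www.mooloolabamusic.com.au/page/brands", "Brands"),
]

for title, urls in group_by_title(pages):
    print(title, len(urls))
```

Sorting by group size puts the biggest clusters first, which is one way to find the bigger problems to fix before the rest.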