Crawl Diagnostics 2261 Issues with Our Blog
-
I just recently signed up for MOZ, so much information. I've done the walk through and will continue learning how to us the tools. But I need your help.
Our first moz crawl indicated 2261 issues (447 404's, 803 duplicate content, 11 502's, etc). I've reviewed all of the crawls issues and they are linked to our Yahoo hosted WordPress blog. Our blog is over 9 years old. The only issue that I'm able to find is our categories are not set up correctly. I've searched for WordPress assistance on this topic and cant find any issues with our current category set up. Every category link that I click returns Nothing Found Apologies, but no results were found for the requested archive. Perhaps searching will help find a related post.
http://site.labellaflorachildrensboutique.com/blog/
Any assistance is greatly appreciated.
-
Go Dan!
-
While what Matt and CleverPHD (Hi Paul!) have said is correct - here's your specific issue:
Your categories are loading with "ugly" permalinks like this: http://site.labellaflorachildrensboutique.com/blog/?cat=175 (that loads fine)
But you are linking to them from the bottom of posts with the "clean" URLs --> http://screencast.com/t/RIOtqVCrs
The fix is that Catgory URLs need to load with "clean" URLs and the ugly one should redirect to the clean one.
Possible fixes:
- Try updating wordpress (I see you're on a slightly older version)
- See if you .htaccess file has been modified (ask a developer or your hosting for help with this perhaps)
Found another linking issue:
This link to Facebook in your left sidebar --> http://screencast.com/t/EqltiBpM it's just coded incorrectly. It adds the current page URL so you get a link like this http://site.labellaflorachildrensboutique.com/blog/category/unique-baby-girl-gifts/www.facebook.com/LaBellaFloraChildrensBoutique instead of your Facebook page: http://www.facebook.com/LaBellaFloraChildrensBoutique
You can fix that Facebook link probably in Appearance->Widgets.
That one issue is causes about 200 of your broken URLs
-
One other thing I forgot. This video by Matt Cutts
It explains why Google might show a link even though the page was blocked by robots.txt
https://www.youtube.com/watch?v=KBdEwpRQRD0
Google really tries not to forget URLs and this video reminds us that Google uses links not just for ranking, but discovery so you really have to pay attention to how you link internally. This is especially important for large sites.
-
Awesome! Thanks for straightening it out.
-
Yes, the crawler will avoid the category pages if they are in robots.txt. It sounded like from the question that this person was going to remove or change the category organization and so you would have to do something with the old URLs (301 or noindex) and that is why I would not use robots.txt in this case so that those directives can be seen.
If these category pages had always been blocked using robots.txt, then this whole conversation is moo as the pages never got in the index. It is when unwanted pages get in the index that you potentially want to get rid of that things get a little tricky, but workable.
I have seen issues where there are pages on sites that got into the index and ranking but they were the wrong pages and so the person just blocked with robots.txt. Those URLs continued to rank and cause problems with the canonical pages that should be ranking. We had to unblock, let Google see the 301, rank the new pages then put the old URLs back into robots to prevent the old URLs from getting back into the index.
Cheers!
-
Oh yeah, that's a great point! I've found that the category pages rarely rank directly, but you'll definitely want to double-check before outright blocking crawlers.
Just to check my own understanding, CleverPhD, wouldn't crawlers avoid the category pages if they were disallowed by robots.txt (presuming they obey robots.txt), even if the links were still on the site?
-
One wrinkle. If the category pages are in Google and potentially ranking well - you may want to 301 them to consolidate them into a more appropriate page (if this makes sense) or if you want to get them out of the index, use a meta noindex robots tag on the page(s) to have them removed from the index, then block them in robots.txt.
Likewise, you have to remove the links on the site that are pointing to the category pages to prevent Google from recrawling and reindexing etc.
-
Category pages actually turn up as duplicate content in Crawl Diagnostics _really _often. It just means that those categories are linked somewhere on your site, and the resulting category pages look almost exactly like all the others.
Generally, I recommend you use robots.txt to block crawlers from accessing pages in the category directory. Once that's done and your campaign has re-crawled your site, then you can see how much of the problem was resolved by that one change, and consider what to do to take care of the rest.
Does that make sense?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved Moz can't crawl my site
Moz is being blocked from crawling the following site - https://www.cleanchain.com. When looking at Robot.txt, the following is disallowing access but don't know whether this is preventing Moz from crawling too? User-agent: *
Moz Pro | | danhart2020
Disallow: /adeci/
Disallow: /core/
Disallow: /connectors/
Disallow: /assets/components/ Could something else be preventing the crawl?0 -
WEbsite cannot be crawled
I have received the following message from MOZ on a few of our websites now Our crawler was not able to access the robots.txt file on your site. This often occurs because of a server error from the robots.txt. Although this may have been caused by a temporary outage, we recommend making sure your robots.txt file is accessible and that your network and server are working correctly. Typically errors like this should be investigated and fixed by the site webmaster. I have spoken with our webmaster and they have advised the below: The Robots.txt file is definitely there on all pages and Google is able to crawl for these files. Moz however is having some difficulty with finding the files when there is a particular redirect in place. For example, the page currently redirects from threecounties.co.uk/ to https://www.threecounties.co.uk/ and when this happens, the Moz crawler cannot find the robots.txt on the first URL and this generates the reports you have been receiving. From what I understand, this is a flaw with the Moz software and not something that we could fix form our end. _Going forward, something we could do is remove these rewrite rules to www., but these are useful redirects and removing them would likely have SEO implications. _ Has anyone else had this issue and is there anything we can do to rectify, or should we leave as is?
Moz Pro | | threecounties0 -
Ranking issues with my local business website.
Hi, I have local business website "asapgaragedoorrepair.com" which im seeing weird ranking issues during last few months . i would like to share it with you guys and get help from you. first days i have indexed the website ,it was ranking good with google and then after 2-3 weeks i have noticed that none of the pages ranking anymore. i have looked for any issues in the optimization and i fixed them and indexed again. right now i have weird issues. 1- the website was ranking for another page in my website other that the one that i wanted for keyword" Garage Door Repair Pasadena " .(it was ranking with garage door installation page). i changed the content ,optimization on the other page and indexed. still the same issue. then i had no ways other that 301 to the right page . 2- when i check the rankings with "ahrefs" website or "Moz" or other tools that i have, it shows that im ranking for 16th position for " Garage door repair Pasadena ". but when i search it by myself im getting no results. (this is bothering) maybe i dont know something . pls help me 3- in Moz or ahref website shows the ranking like the image i have attached. 2 different results for each keywords. why is that? and why i can not see them when i search them by my self? htPQR
Moz Pro | | Mishel2980 -
Crawl results on keyword rankings differ from actual search rankings
For the second week in a row, the weekly keyword rankings in my MOZ account show "not in top 50" for most keywords. However, when I check some of my rankings I still broadly have the rankings that I used to have. It seems that the MOZ crawl is experiencing some difficulties. Does anyone have suggestions how to find out what's going on? If I can't rely on the crawl outcomes, then the tool is not very useful anymore!
Moz Pro | | matteis0 -
SERP control panel issues
Hi I use the SERP control panel every day, but now it hardly works and only shows info for the companies who have Google Places and show in the local results. I use Google Chrome on my (old) Toshiba laptop. Are there any technical issues with the SERP control panel at the moment? Kind regards, Barnaby
Moz Pro | | FreeRangeUK1 -
If i have just started a campaign, and the crawl is happening - can i log off and shut down my PC
If i have just started a campaign, and the crawl is happening - can i log off and shut down my PC or will this cause the crawl to stop. My campaign was showing as "crawl in process". I then shut down and logged onto my SEOmoz account on another PC. When I did the crawl didnt seem to be in process, even though the total time was still about 30 min from the crawl start...... sorry if this is a stupid question, just a bit new to this
Moz Pro | | duff1010 -
Crawl Diagnostics shows two title and meta tag errors but they are false positives.
I got one hit each on "Missing Meta Description Tag" and "Title Missing or Empty" but in the source of my page they are clearly there: <title>Protein Powder | Compare and Get the Best Prices</title> <meta name="keywords" content="protein powder, whey protein, protein supplement, whey protein isolate, hydrolyzed whey" /> I understand there are conventions which may or may not be followed by Drupal (I read an earlier question where ordering and W3C conventions were suggested) but i'm not sure how to fix them given Drupal will just overwrite any hand editing the next time something is built and importantly, I can't get the crawl to work on cue - it works on the automatic once a week crawl in the main campaign summary but every time I've specifically used the Crawl Test tool it gives me a "There was an error submitting your request to the crawler. Please try again later" so I can't really test any changes. Given Google seems to be recognising the title tag - ie showing it in the results - Do I put this down as seomoz just not working? Kind Regards, Brian
Moz Pro | | btrr690 -
How do I get my crawl report?
I received a message that my crawl report is complete with a link - went to the link however - when I click on the icon that has the report name and the complete check mark nothing happens looked around can't find the results. Need to bid on this job so it would be helpful to know where to get it. Thanks for all you do. Wickey
Moz Pro | | Wickey0