Where does the crawler find the urls?
-
The SEO Moz crawler has found a number of 500 error pages, and 404s etc which is very useful
however some of the urls are weird/broken formats we don't recognise and nobody remembers ever using - not weird enough to imply hacking, but something broken in the CMS
Is there anyway to find out where the crawler found these urls? I can patch up and redirect the end result as best I can but I would prefer to fix plug the leak
thanks
-
If you export the crawl diagnostics to a CSV, we do have this information in the last column.
-
thanks for the tips. It is a little frustrating that the information I need has passed through seomoz's system but I guess they don't have the inclination or resources to show us the info
Xenu reckons it can handle 1m urls, we are in the position of not really knowing how many pages our site has!
-
You can pop the links into the free Xenu Link Sleuth* - after you've done a crawl just right-click on the URL you're interested in and click 'URL Properties' - you'll see any inlinks it finds listed there. Depending on the size of your site, it could take a while for the crawl to complete.
You could try the link: property in Google first, though it won't be as thorough as Xenu.
*If you haven't seen it before, don't worry about how the Xenu website looks - the software is kosher - as recommended by many SEOmoz staff. Screaming Frog is a paid alternative (with a limited free version).
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How do you create tracking URLs in Wordpress without creating duplicate pages?
I use Wordpress as my CMS, but I want to track click activity to my RFQ page from different products and services on my site. The easiest way to do this is through adding a string to the end of a URL (ala http://www.netrepid.com/request-for-quote/?=colocation) The downside to this, of course, is that when Moz does its crawl diagnostic every week, I get notified that I have multiple pages with the same page title and the dup content. I'm not a programming expert, but I'm pretty handy with Wordpress and know a thing or two about 'href-fing' (yeah, that's a thing). Can someone who tracks click activity in WP with URL variables please enlighten me on how to do this without creating dup pages? Appreciate your expertise. Thanks!
Moz Pro | | Netrepid0 -
Finding Related Keywords and/or Phrases
I'm diving into unknown territory and looking for keywords and phrases related to a prospective new clients business. He knows some of the general terms, for example, water removal. What tool works best to find related search terms and the number of searches for those keywords/search terms. I usually use the Google Keyword Tool but going to give moz a try. Love to be able to get local search data as well. Thanks!
Moz Pro | | DFLsports1 -
Does SeoMoz realize about duplicated url blocked in robot.txt?
Hi there: Just a newby question... I found some duplicated url in the "SEOmoz Crawl diagnostic reports" that should not be there. They are intended to be blocked by the web robot.txt file. Here is an example url (joomla + virtuemart structure): http://www.domain.com/component/users/?view=registration and the here is the blocking content in the robots.txt file User-agent: * _ Disallow: /components/_ Question is: Will this kind of duplicated url errors be removed from the error list automatically in the future? Should I remember what errors should not really be in the error list? What is the best way to handle this kind of errors? Thanks and best regards Franky
Moz Pro | | Viada0 -
URL paramters and duplicate content
Hello, I have a 2-fold question: Crawl Diagnostics is picking up a lot of Duplicate Page Title errors, and as far as I can tell, all of them are cause by URL parameters trailing the URL. We use a Magento store, and all filtering attributes, categories, product pages etc are tagged on as URL parameters. example: Main URL:
Moz Pro | | yacpro13
/accessories.html Duplicated Title Page URLs: /accessories.html?dir=asc&order=position
/accessories.html?mode=list
/accessories.html?mode=grid
...and many others How can I make the Crawl Diagnostics not identify these as errors? Now from an SEO point of view, all these URL parameters are been picked up by google, and are listed in WedMaster Tools -> URL parameters. All URL parameters are set to "let google decide". I remember having read that Google was smart enough here to make the right decision, and we shouldn't have to worry about it. Is this true, or is there a larger issue at hand here? Thankas!0 -
Canonical URLs for Search Parameters
Hi Guys Our seomoz campaign report is returning a lot or Rel Canonical issues similar to this for each page. The non / version redirects to the / version but how do I get the ones with search parameters ie '?datefrom&nights' to redirect. http://www.lamangaclubresort.co.uk/accommodations/las-brisas-78
Moz Pro | | JohnTulley
http://www.lamangaclubresort.co.uk/accommodations/las-brisas-78/
http://www.lamangaclubresort.co.uk/accommodations/las-brisas-78/?datefrom&nights
http://www.lamangaclubresort.co.uk/accommodations/las-brisas-78/?datefrom=&nights= Any help would be welcome, thanks0 -
How to read Crawler downloaded report
I am trying to seperate the duplicate title and description URLs, by looking at the report i am not getting how to find all urls which contain same title and description. Is there any video link on the site which walk me through each part of the report. Thanks, Punam
Moz Pro | | nonlinearcreations0 -
Domain Authority is up, how do I find out what changed?
Hi, I’m a noob so if this is a really obvious question, I apologize. Our domain authority increased this week. I wonder if there is any easy way to find out what changed. (We didn’t do anything.) Thank you! Ann
Moz Pro | | anns0 -
Why are SEOmoz Pro Keyword Ranking reports different between two 301-linked URLs?
Hi, My main domain is www.dancenut.com. I have this 301 URL redirected to www.dancenut.com/boston/. I set up an SEOmoz Pro campaign for each of these URLs in order to see if they were being treated differently in any way. In most cases the report results are identical, or the small differences are understandable. However, there is one big difference between the two sets of campaign reports. In the Keyword Ranking reports, the data for the Bing and Yahoo! reports are identical, but the data for the Google reports are dramatically different. Out of 21 keywords, 9 are listed in the top 50 for www.dancenut.com, but only 2 are listed in the top 50 for www.dancenut.com/boston/ (the specific positions are the same for the 2 keywords that are listed for both). Does this make any sense? Could the SEOmoz Pro data be wrong? If not, then I'm suspicious that Google may not be interpreting the 301 redirect properly. I don't think this could be fully explained by a 1-10% reduction in link juice due to the 301 because I have one keyword for which my site ranks #1 in Google, Bing, and Yahoo! with www.dancenut.com, but it doesn't even rank in the top 50 in Google with www.dancenut.com/boston/. And why would these differences only exist for Google? Any insight would be much appreciated! Andrew
Moz Pro | | dancenut0