Where does the crawler find the URLs?
-
The SEOmoz crawler has found a number of 500 error pages, 404s, etc., which is very useful.
However, some of the URLs are in weird or broken formats that we don't recognise and nobody remembers ever using: not weird enough to imply hacking, but enough to suggest something is broken in the CMS.
Is there any way to find out where the crawler found these URLs? I can patch up and redirect the end result as best I can, but I would prefer to plug the leak.
Thanks
-
If you export the crawl diagnostics to a CSV, we do have this information in the last column.
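As a rough sketch of how that export could be used to trace the leak, the snippet below groups the problem URLs by the page that linked to them. The column names `URL` and `referrer` are assumptions (check the header row of your own export and adjust the parameters), and the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def group_by_referrer(csv_text, url_column="URL", referrer_column="referrer"):
    """Map each referring page to the crawled URLs it links to.

    The default column names are assumptions, not a documented export format.
    """
    groups = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        url = (row.get(url_column) or "").strip()
        referrer = (row.get(referrer_column) or "").strip()
        if url:
            groups[referrer or "(no referrer recorded)"].append(url)
    return dict(groups)


# Made-up sample rows standing in for an exported crawl report.
sample = """URL,status,referrer
http://www.example.com/products/undefined,404,http://www.example.com/news
http://www.example.com/%7Bid%7D,500,http://www.example.com/news
http://www.example.com/old-page,404,http://www.example.com/about
"""

for referrer, urls in group_by_referrer(sample).items():
    print(referrer, "links to", len(urls), "problem URL(s)")
```

A referrer that shows up against many broken URLs is a good candidate for the CMS template or page that is generating them.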
-
Thanks for the tips. It is a little frustrating that the information I need has passed through SEOmoz's system, but I guess they don't have the inclination or resources to show us the info.
Xenu reckons it can handle 1m URLs; we are in the position of not really knowing how many pages our site has!
-
You can pop the links into the free Xenu Link Sleuth* - after you've done a crawl just right-click on the URL you're interested in and click 'URL Properties' - you'll see any inlinks it finds listed there. Depending on the size of your site, it could take a while for the crawl to complete.
You could try the link: operator in Google first, though it won't be as thorough as Xenu.
*If you haven't seen it before, don't worry about how the Xenu website looks - the software is kosher - as recommended by many SEOmoz staff. Screaming Frog is a paid alternative (with a limited free version).
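To illustrate what an inlink lookup does, here is a minimal sketch using only the Python standard library. It is not how Xenu or Screaming Frog work internally; it simply scans pages you have already fetched (the `pages` dictionary and its example.com URLs are hypothetical) and reports which ones contain a link to the URL you are chasing.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute link targets from the <a href="..."> tags of one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    # Resolve relative links against the page's own URL.
                    self.links.append(urljoin(self.base_url, value))


def find_inlinks(pages, target_url):
    """Return the URLs of the pages whose HTML links to target_url."""
    sources = []
    for page_url, html in pages.items():
        collector = LinkCollector(page_url)
        collector.feed(html)
        if target_url in collector.links:
            sources.append(page_url)
    return sources


# Hypothetical, already-downloaded pages: URL -> HTML source.
pages = {
    "http://www.example.com/": '<a href="/about">About</a> <a href="/products/undefined">Red</a>',
    "http://www.example.com/about": '<a href="/">Home</a>',
}

print(find_inlinks(pages, "http://www.example.com/products/undefined"))
```

On a real site you would fill `pages` from a crawl, which is the slow part the desktop tools handle for you.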
Related Questions
-
What is the best way to treat URLs ending in /?s=
Hi community, I'm going through the list of crawl errors visible in my Moz dashboard and there are a few URLs ending in /?s= How should I treat these URLs? Redirects? Thanks for any help
Moz Pro | | Easigrass0 -
Crawlers reporting upper case letter url versions although these have been 301'd to lower case !?
Hi, I have a client e-commerce site whose dev platform is on a Windows server. Their product pages have been auto-named after the product title, with the first letter of each word in upper case, which has carried over into the URLs. I asked them to set up 301 redirects from all URLs containing upper-case letters to lower-case versions, which they say they have done. However, I'm still seeing upper-case URLs showing up in Webmaster Tools and Moz crawl reports, but when I copy and paste them into a browser they do redirect to, and resolve in, the lower-case version. It's also the upper-case versions that are reported in the Google cache! So how come Webmaster Tools, Moz, etc. are reporting the upper-case versions? Surely, if redirected, it should be the lower-case versions. All the best, Dan
Moz Pro | | Dan-Lawrence0 -
Magento creating odd URL's, no idea why. GWT reporting 404 errors
Hi Mozzes!
Problem 1: GWT and Moz are both reporting approximately one hundred 404 errors for certain URLs. Examples are shown below. We have no idea why or how these URLs are being created in Magento. Any hypothesis on the matter would be appreciated. The domain name in question is http://www.artorca.com/ These are valid URLs if /privacy is removed. The first URL is for a product, the second for an artist profile and the third for a CMS page.
1. semi-abstract-landscape/privacy
2. jose-de-la-barra/privacy
3. seller-guide/privacy
What may be the source of these URLs? What solution should we implement to fix the existing 404s? Would 301 redirects be fine?
Problem 2: Website pages also seem to be accessible with index.php after the domain name. Example: Artorca.com/index.php/URL's. Will this cause a duplicate content issue? Should we implement 301s, canonicals, or just leave it as is? Cheers! MozAddict
Moz Pro | | MozAddict0 -
Page Ranking by URL / Keyword
Needing to know how to find out the page rank of a URL that is NOT within the top 50 or top 100. Need to know that specific page's rank, not what our overall site's ranking for the keyword is. Can't seem to find any tool that goes beyond the top 100. Any ideas?
Moz Pro | | leankit0 -
Overly Dynamic URLS
I should be able to set URL Parameters in Google Webmaster Tools in a way that allows me to stop my overly dynamic page URL problem. Please help me with how to do this.
Moz Pro | | pinksgreens0 -
How do I fix the problem of having 2 URLs splitting my rankings?
Please excuse my noobness. I have a nice site, www.soundsenglish.com, which I built from scratch and learned by doing. It has lots of nice content and it does OK; my rankings are woeful, mostly because of all the mistakes I made building it. I'll fix that stuff. This is the stuff I don't know about. From my AdSense I get two listings, www.soundsenglish.com and soundsenglish.com. Weirdly, the second one gets consistently higher-paying ads, although most of the visitors come through the first, but they are both the same landing page with the same content, as far as I can tell. When I try to find rankings, use the SEO tools, etc., I get different scores, so whatever it is, it is splitting the sites, which can't be a good thing. I have no idea why this happens, and I have some inkling that maybe I need something to do with canonical redirects or maybe a 301 redirect, both of which I have little idea how to do. If that isn't enough naive blundering about for you, I have a little more... It occurs to me that this problem is probably happening with every page on my site, i.e. the 'juice' is not getting credited to that one page. This surely means canonical redirects, but even after reading up on them I don't quite get it. Or rather I do, but I don't get how to apply it to my context.
Moz Pro | | soundsenglish0 -
Crawl Diagnostics Shows thousands of 302's from a single url. I'm confused
Hi guys I just ran my first campaign and the crawl diagnostics are showing some results I'm unfamiliar with.. In the warnings section it shows 2,838 redirects.. this is where I want to focus. When I click here it shows 5 redirects per page. When I go to click on page 2, or next page, or any other page than page 1 for that matter... this is where things get confusing. Nothing shows. Downloading the csv reveals that 2,834 of these are all showing: URL: http://www.mydomain.com/401/login.php url: http://www.mydomain.com/401/login.php referrer: http://www.mydomain.com/401/login.php location_header: http://www.mydomain.com/401/login.php I guess I'm just looking for an explanation as to why it's showing so many to the same page and what possible actions can be taken on my part to correct it (if needed). Thanks in advance
Moz Pro | | sethwb0 -
Why is it that certain keywords in my seomoz report card are for the wrong urls
Hi guys, why is it that SEOmoz's On Page Optimization Reports for Google TH are attributing certain keywords to the wrong URLs? What I mean is that an example keyword, 'chiang mai villas for rent', has been scored an F against my home page URL rather than our 'Chiang Mai' URL. Why is this? Is there a coding issue on my site? Is it that SEOmoz is finding something on my home page to suggest I want it to rank for this keyword?
Moz Pro | | ewanTHH0