Where does the crawler find the URLs?
-
The SEOmoz crawler has found a number of 500 error pages, 404s, and so on, which is very useful.
However, some of the URLs are in weird or broken formats that we don't recognise and nobody remembers ever using - not weird enough to imply hacking, but enough to suggest something is broken in the CMS.
Is there any way to find out where the crawler found these URLs? I can patch up and redirect the end result as best I can, but I would prefer to plug the leak.
Thanks
-
If you export the crawl diagnostics to a CSV, we do have this information in the last column.
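As a rough sketch of how that export could be used: the script below groups crawled URLs by the page named in the last column, so the pages that emit the most broken links stand out. The sample rows, the header names, and the assumption that the crawled URL sits in the first column are all made up for illustration; check them against your actual export.

```python
import csv
import io
from collections import defaultdict


def group_by_referrer(csv_text, url_column=0):
    """Map each referring page (last CSV column) to the crawled URLs it led to.

    url_column is an assumption: adjust it to wherever the crawled URL
    appears in your own export.
    """
    referrers = defaultdict(list)
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    for row in reader:
        if not row:
            continue  # ignore blank lines
        referrers[row[-1]].append(row[url_column])
    return dict(referrers)


# Hypothetical sample data, not the real export layout.
sample = (
    "URL,Status,Referrer\n"
    "http://example.com/a%5C,404,http://example.com/index\n"
    "http://example.com/b//c,500,http://example.com/index\n"
    "http://example.com/old,404,http://example.com/blog\n"
)

for referrer, urls in group_by_referrer(sample).items():
    print(referrer, len(urls))
```

For a real export, read the file contents with `open(path, newline="", encoding="utf-8").read()` and pass the text in; a referrer that appears against many broken URLs is a good place to start looking for the faulty template or CMS setting.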
-
Thanks for the tips. It is a little frustrating that the information I need has passed through SEOmoz's system, but I guess they don't have the inclination or resources to show us the info.
Xenu reckons it can handle 1m URLs; we are in the position of not really knowing how many pages our site has!
-
You can pop the links into the free Xenu Link Sleuth* - after you've done a crawl just right-click on the URL you're interested in and click 'URL Properties' - you'll see any inlinks it finds listed there. Depending on the size of your site, it could take a while for the crawl to complete.
You could try the link: operator in Google first, though it won't be as thorough as Xenu.
*If you haven't seen it before, don't worry about how the Xenu website looks - the software is kosher - as recommended by many SEOmoz staff. Screaming Frog is a paid alternative (with a limited free version).
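To illustrate what an inlink lookup does under the hood, here is a minimal sketch that scans a set of already-downloaded pages and reports which ones link to a given URL. It is not how Xenu or Screaming Frog are implemented; the `pages` dictionary, the example.com URLs, and the `find_inlinks` helper are hypothetical, and a real crawl would also need fetching, politeness limits, and URL normalisation.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_inlinks(pages, target):
    """Return the sorted URLs of pages whose links resolve to target.

    pages maps a page URL to that page's HTML source.
    """
    sources = []
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        # Resolve relative links against the page they were found on.
        if any(urljoin(page_url, href) == target for href in collector.hrefs):
            sources.append(page_url)
    return sorted(sources)


# Hypothetical pages standing in for a crawled site.
pages = {
    "http://example.com/": '<a href="/about">About</a> <a href="/broken%5C">Oops</a>',
    "http://example.com/about": '<a href="/">Home</a>',
    "http://example.com/blog/": '<a href="../broken%5C">Oops</a>',
}

print(find_inlinks(pages, "http://example.com/broken%5C"))
```

Resolving each link against the page it was found on matters here, because broken URLs often come from relative links that only go wrong on certain templates.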