Where does the crawler find the URLs?
-
The SEOmoz crawler has found a number of 500 error pages, 404s and so on, which is very useful.
However, some of the URLs are in weird or broken formats that we don't recognise and nobody remembers ever using. They are not weird enough to imply hacking, but they suggest something is broken in the CMS.
Is there any way to find out where the crawler found these URLs? I can patch up and redirect the end results as best I can, but I would prefer to plug the leak.
Thanks
-
If you export the crawl diagnostics to a CSV, we do have this information in the last column.
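For anyone who wants to work through that export in bulk, here is a rough Python sketch that groups the crawled URLs by the page they were found on. It only relies on the referrer being in the last column, as described above. Everything else is an assumption: that the file has a header row, that the crawled URL is in the first column, and the column names in the sample data, which are invented for illustration.

```python
import csv
import io
from collections import defaultdict


def group_by_referrer(csv_text, url_column=0):
    """Map each referring page to the crawled URLs it links to.

    Assumptions: the first row is a header, the referrer sits in the LAST
    column, and the crawled URL sits in `url_column` (a guess; adjust it to
    match the real export).
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    sources = defaultdict(list)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        referrer = row[-1].strip() or "(no referrer recorded)"
        sources[referrer].append(row[url_column].strip())
    return dict(sources)


if __name__ == "__main__":
    # Made-up sample rows; the real export will have more columns.
    sample = (
        "URL,HTTP Status Code,Referrer\n"
        "http://example.com/bad-one,404,http://example.com/news\n"
        "http://example.com/bad-two,500,http://example.com/news\n"
        "http://example.com/missing,404,http://example.com/about\n"
    )
    grouped = group_by_referrer(sample)
    # List the pages that leak the most broken URLs first.
    for referrer, urls in sorted(grouped.items(), key=lambda kv: -len(kv[1])):
        print(len(urls), referrer)
```

Sorting by count puts the worst offenders at the top, which is usually where a broken CMS template shows up.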
-
Thanks for the tips. It is a little frustrating that the information I need has passed through SEOmoz's system, but I guess they don't have the inclination or resources to show it to us.
Xenu reckons it can handle 1m URLs; we are in the position of not really knowing how many pages our site has!
-
You can pop the links into the free Xenu Link Sleuth*. After you've done a crawl, just right-click on the URL you're interested in and click 'URL Properties', and you'll see any inlinks it finds listed there. Depending on the size of your site, it could take a while for the crawl to complete.
You could try the link: operator in Google first, though it won't be as thorough as Xenu.
*If you haven't seen it before, don't worry about how the Xenu website looks. The software is kosher and is recommended by many SEOmoz staff. Screaming Frog is a paid alternative (with a limited free version).
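If you would rather script the inlink lookup than click through a crawler, the idea can be sketched in a few lines of Python. This is not how Xenu or Screaming Frog work internally, just a minimal illustration of "which pages link to this URL". It assumes you already have the HTML of your pages in a dict keyed by page URL (`pages` is a hypothetical input; fetching the pages is left out), and it only looks at `<a href>` links.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class _LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_inlinks(pages, target):
    """Return the URLs of pages that contain a link resolving to `target`.

    `pages` maps a page URL to its HTML source (hypothetical input; a real
    crawl would have to fetch these first).
    """
    inlinks = []
    for page_url, html in pages.items():
        collector = _LinkCollector()
        collector.feed(html)
        for href in collector.hrefs:
            # Resolve relative links against the page and drop any #fragment.
            resolved, _fragment = urldefrag(urljoin(page_url, href))
            if resolved == target:
                inlinks.append(page_url)
                break  # one match per page is enough
    return inlinks
```

Calling `find_inlinks(pages, "http://example.com/some-broken-url")` returns the pages to go and inspect in the CMS. The comparison is an exact string match, so trailing slashes and upper/lower case differences are not normalised.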