Where does the crawler find the urls?
-
The SEO Moz crawler has found a number of 500 error pages, and 404s etc which is very useful
however some of the urls are weird/broken formats we don't recognise and nobody remembers ever using - not weird enough to imply hacking, but something broken in the CMS
Is there anyway to find out where the crawler found these urls? I can patch up and redirect the end result as best I can but I would prefer to fix plug the leak
thanks
-
If you export the crawl diagnostics to a CSV, we do have this information in the last column.
-
thanks for the tips. It is a little frustrating that the information I need has passed through seomoz's system but I guess they don't have the inclination or resources to show us the info
Xenu reckons it can handle 1m urls, we are in the position of not really knowing how many pages our site has!
-
You can pop the links into the free Xenu Link Sleuth* - after you've done a crawl just right-click on the URL you're interested in and click 'URL Properties' - you'll see any inlinks it finds listed there. Depending on the size of your site, it could take a while for the crawl to complete.
You could try the link: property in Google first, though it won't be as thorough as Xenu.
*If you haven't seen it before, don't worry about how the Xenu website looks - the software is kosher - as recommended by many SEOmoz staff. Screaming Frog is a paid alternative (with a limited free version).
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Question on changing URL structures
We have a lot of "Long URL" errors in moz, and our URLs have no helpful format to them. For example, our blog URLs currently have a date and the title so they end up looking like this: blog/year/month/day/long title that never ends. If I were to setup a new URL structure using MOZ best practices, can I just make the change going forward and redirect a few high trafficked links to this new structure? Or do I really need to make the change for the website (specifically blog) as a whole to see a positive impact? I know this means there may be an initial drop in traffic which I'd like to avoid.
Moz Pro | | ETerika0 -
Source page showsI have 2 h1 tags on my page. I can only find one.
When I grade my page it says I have more than one h1 tag. I view the source page and it shows there are two h1 headings with the same wording. If I delete the one h1 heading I can find, the page source shows I have deleted both of them. I don't know how to get to the other heading to delete it. And I'm off page one of google! Can anybody help? Clay Stephens
Moz Pro | | Coot0 -
Is there a way for me to find out how a keyword would rank if it were on a specific site?
Is there a way for me to find out how a keyword would rank if it were on a specific site? For example, lets say that XYZ.com does not have the keyword "ABC". Is there a way for me to find out how the keyword "ABC" would rank if it were on XYZ.com?
Moz Pro | | TurboH0 -
Rogerbot's crawl behaviour vs google spiders and other crawlers - disparate results have me confused.
I'm curious as to how accurately rogerbot replicates google's searchbot I've currently got a site which is reporting over 200 pages of duplicate/titles content in moz tools. The pages in question are all session IDs and have been blocked in the robot.txt (about 3 weeks ago), however the errors are still appearing. I've also crawled the page using screaming frog SEO spider. According to Screaming Frog, the offending pages have been blocked and are not being crawled. Webmaster tools is also reporting no crawl errors. Is there something I'm missing here? Why would I receive such different results. Which one's should I trust? Does rogerbot ignore robot.txt? Any suggestions would be appreciated.
Moz Pro | | KJDMedia0 -
Why do crawlers still track meta keywords if it is not needed in my site?
I have crawled three sites already and it returns more than 5000 errors most of which are MIssing Meta Keywords tags. The sites are on Wordpress and using my SEO plugin I can easily edit the meta keywords of each page, but I am having second thoughts. Well should I?
Moz Pro | | jernest0020 -
Ways to Find Potential Backlink Providers
We are doing a backlink campaign for a personal development training company that is a pretty good authority with lots of good long articles. So far I found competitors ranking for our keywords - about 10 of them - and analyzed their backlinks We found about 40 possible backlink providers. What's the next step to find more? I know 2 possibilities so far: do inurl commands in google to find related sites with resource sections - how do I do this? Analyze the backlinks of the sites that make up competitor's backlinks.] - how do I organize this huge task? What else is there?
Moz Pro | | BobGW0 -
Can someone explain why I have been seeing an increase in the number of Linking Page URLs in OSE that link directly to downloads?
Ever since the last couple Linkscape updates when doing competitive back link analysis I have noticed a large increase in the number of URLs of Linking Pages in OSE that result in an immediate file download. The majority of the time these downloads are not common files ie PDF, DOC files. For example, these were all in a competitors back link profile: http://download.unesp.br/linux/debian/pool/main/i/isc-dhcp/isc-dhcp-relay-dbg_4.1.1-P1-17_ia64.deb http://snow.fmi.fi/data/20090210_eurasia_sd_025grid.mat http://www.rose-hulman.edu/class/me/HTML/ES204_0708_S/working model examples/Le25 mad hatter.wm?a=p&id=145880&g=5&p=sia&date=iso&o=ajgrep These are just a few I came across for a single competitor. Is this sketchy black hat SEO, some sort of error, actual links, or something else? Any information on this subject would be helpful. Thank you.
Moz Pro | | Gyi0 -
How do I delete a url from a keyword campaign
I have a couple of urls that are associated with the keywords in my campaign. They are no longer valid so how do I remove them?
Moz Pro | | PerriCline0