Broken Inner Links - Tool Recommendations?
-
Do you have any recommendations for tools that scan an entire website and report broken inner links?
I run several UGC-centered websites, and broken links, both internal and external, are an issue.
Since these websites are several hundred thousand pages in size, I am not all that excited about running software on my desktop (Xenu's Link Sleuth, for example). Any online solutions you could recommend would be great!
-
If it happens to be a WordPress site, there is a plugin called something like "Broken Link Checker." If I recall correctly, it checks both internal and outbound links. Otherwise, I'm not too sure.
-
Ideric, did any of these suggestions answer your questions, or have you been able to otherwise find a tool for this? I know others would find the information useful.
At a previous company, we had a custom-written solution for checking external links. It followed each link and checked the response headers until it got a 200 OK or had gone five levels deep. We'd often find a 301 on an external link that went from non-www to www. We wouldn't necessarily worry about fixing that, but we later realized that from there the www link was a 404, OR went to a 200 OK category landing page that said "we've reorganized our site, search here for that individual resource".
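For anyone wanting to try the same idea, here is a minimal Python sketch of that kind of check. It is not the tool described above: the names `fetch_head` and `trace_redirects` are made up for illustration, and it assumes the target servers answer HEAD requests. It follows a link one hop at a time and stops at the first non-redirect response or after five hops.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin

MAX_HOPS = 5  # "five levels deep", as in the post above
REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_head(url, timeout=10):
    """Return (status, Location header or None) for a single HEAD request."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return response.status, response.headers.get("Location")
    except urllib.error.HTTPError as err:
        # With redirects disabled, 3xx/4xx/5xx responses are raised as HTTPError.
        return err.code, err.headers.get("Location")
    except urllib.error.URLError:
        # DNS failure, refused connection, etc.: no status to report.
        return None, None


def trace_redirects(url, fetch=fetch_head, max_hops=MAX_HOPS):
    """Follow redirects hop by hop; return a list of (url, status) pairs."""
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be a relative URL
    return chain
```

Calling `trace_redirects("http://example.com/some-page")` returns every hop with its status, so a 301 that ends in a 404 shows up as the last entry. It will not catch the "200 OK landing page" case, since that needs a look at the page content rather than the headers.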
-
Well, you've found the best solution right here at SEOmoz! Instead of wasting time learning new systems to find out whether they'll work, just solve your problem. Sign up for PRO Elite and you can crawl 100,000 pages.
-
I have used this in the past: http://www.auditmypc.com/free-sitemap-generator.asp (click on the image in the top right of the instructions). It is a free sitemap generation tool that will show broken internal links in the process. I don't think it has any limits, although I have not tried it on a site as large as yours. Just make sure you are not logged into your site when you run it. Google Webmaster Tools is OK, but you can't verify changes very quickly with it.
-
I think Xenu is your best option here. The size of the site all but rules out a web tool being able to handle it.
Just recently on a site review I had to run Xenu on a site with 160,000 pages. It only took 4 hours running at 30 threads to complete. Any modern PC should handle it fine.
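If you'd rather script it, the same multi-threaded approach can be sketched in a few lines of Python. This is only an illustration and not how Xenu works internally: `head_status` and `find_broken` are hypothetical names, it assumes you already have the list of URLs to check, and the default of 30 workers just mirrors the thread count I used.

```python
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def head_status(url, timeout=10):
    """Return the HTTP status for url, or None if the request failed outright."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx and 5xx responses
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, timeout, refused connection


def find_broken(urls, check=head_status, workers=30):
    """Check urls in parallel; return (url, status) pairs that look broken."""
    urls = list(urls)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(check, urls))  # results keep input order
    return [(u, s) for u, s in zip(urls, statuses) if s is None or s >= 400]
```

For scale, 160,000 pages in 4 hours works out to roughly 11 URLs per second, which a thread pool like this can plausibly manage if the server keeps up.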
-
WMT is alright, apart from the fact you can't force Google to crawl all your pages. I would doubt that even a majority of the pages were crawled and indexed by Google (though I don't know what the site is).
Plus, as you say, it only deals with internal links and 404s coming in.
Do you know what the upper limit is on how many crawl errors WMT will display?
-
I might be wrong, but I think Google WMT can accomplish this with ease. I'm looking at 1,000 right now. For external links you'll probably have to use Xenu =/
-
You might be out of luck on a site that size.
I think WebCEO can do this with their online version, but getting 100,000 URLs crawled will, I think, cost you a bomb (the sort of money where it'd be cheaper to buy a second PC to run Xenu, lol).
Anyway - http://www.webceo.com/ - I think it may also be possible to install the download version to a server and run it that way.
-
I use Google Webmaster Tools. Go to Diagnostics, then Crawl errors.