Broken Inner Links - Tool Recommendations?
-
Do you have any recommendations for tools that scan an entire website and report broken inner links?
I run several UGC-centered websites, and broken internal and external links are an issue.
Since these websites are several hundred thousand pages each, I'm not excited about running software on my desktop (Xenu's Link Sleuth, for example). Any online solutions you could recommend would be great!
-
If it happens to be a WordPress site, there is a plugin called something like "Broken Link Checker." If I recall correctly, it checks both internal and outbound links. Otherwise, I'm not too sure.
-
Ideric, did any of these suggestions answer your questions, or have you been able to otherwise find a tool for this? I know others would find the information useful.
At a previous company, we had a custom-written solution for checking external links. It checked response headers, following redirects until it got a 200 OK or went five levels deep. We'd often find a 301 on an external link that went from non-www to www. We wouldn't necessarily worry about fixing that, but we later realized that the www link it led to was either a 404 or a 200 OK category landing page that said "we've reorganized our site, search here for that individual resource".
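For anyone who wants to try the same idea, here is a minimal sketch of that kind of redirect tracer in Python. It is not the tool described above. The names `head_fetch` and `trace_redirects` are hypothetical, and "five levels deep" is interpreted here as at most five requests per link, which is an assumption.

```python
import http.client
from typing import Callable, List, Optional, Tuple
from urllib.parse import urljoin, urlsplit

# (status code, Location header or None)
Response = Tuple[int, Optional[str]]

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head_fetch(url: str) -> Response:
    """Hypothetical default fetcher: one HEAD request, redirects not followed."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(
    url: str,
    fetch: Callable[[str], Response] = head_fetch,
    max_hops: int = 5,  # assumption: "five levels deep" = five requests
) -> List[Tuple[str, int]]:
    """Follow a link hop by hop and return every (url, status) seen."""
    chain: List[Tuple[str, int]] = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        # Stop on anything that is not a redirect with a target.
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return chain
```

The last entry in the returned chain is the one to report on: a 301 that ends in a 404 is a broken link even though the first hop looked fine. A status check alone will not catch the "we've reorganized our site" landing page, since that returns 200 OK; spotting those would need a look at the page content.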
-
Well, you've found the best solution right here at SEOmoz! Instead of wasting time learning new systems to find out whether they'll work, just solve your problem. Sign up for PRO Elite and you can crawl 100,000 pages.
-
I have used this in the past: http://www.auditmypc.com/free-sitemap-generator.asp (click on the image in the top right of the instructions). It's a free sitemap generator that reports broken internal links as it runs. I don't think it has any limits, although I have not tried it on a site as large as the ones you describe. Just make sure you are not logged into your site when you run it. Google Webmaster Tools is OK, but you can't quickly verify the changes you've made.
-
I think Xenu is your best option here. The size of the site nearly cuts out the chance a web tool could handle it.
Just recently on a site review I had to run Xenu on a site with 160,000 pages. It only took 4 hours running at 30 threads to complete. Any modern PC should handle it fine.
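As a rough back-of-envelope check, the figures above (160,000 pages, 4 hours, 30 threads) can be turned into a crawl rate and scaled up. The 300,000-page site below is a hypothetical stand-in for "several hundred thousand pages", and the estimate assumes the rate stays constant, which depends on the server and connection.

```python
def crawl_hours(pages: int, pages_per_second: float) -> float:
    """Hours needed to crawl `pages` at a constant rate."""
    return pages / pages_per_second / 3600


# Rate observed in the run described above: 160,000 pages in 4 hours.
observed_rate = 160_000 / (4 * 3600)   # about 11.1 pages per second
per_thread_rate = observed_rate / 30   # about 0.37 pages per second per thread

# Hypothetical 300,000-page site at the same overall rate.
estimate = crawl_hours(300_000, observed_rate)
print(f"{observed_rate:.1f} pages/s, {estimate:.1f} hours for 300,000 pages")
```

At that rate a site about twice the size would take roughly twice as long, so an overnight run is a reasonable expectation.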
-
WMT is alright, apart from the fact you can't force Google to crawl all your pages. I would doubt that even a majority of the pages were crawled and indexed by Google (though I don't know what the site is).
Plus, as you say, it only deals with internal links and 404s coming in.
Do you know what the upper limit is on how many crawl errors WMT will display?
-
I might be wrong, but I think Google WMT can accomplish this with ease. I'm looking at 1,000 crawl errors right now. For external links you'll probably have to use Xenu =/
-
You might be out of luck on a site that size.
I think WebCEO can do this with their online version, but getting 100,000 URLs crawled will probably cost you a bomb (the sort of money where it'd be cheaper to buy a second PC to run Xenu, lol).
Anyway - http://www.webceo.com/ - I think it may also be possible to install the download version to a server and run it that way.
-
I use Google Webmaster Tools. Go to Diagnostics, then Crawl errors.