Automated checking for broken links within content pieces
-
Hi, I'm wondering if anyone can point me in the right direction with a system suggestion.
We have grown the number of content pieces on our website, and manually checking whether the links in those pieces still return a 200 status has become extremely time consuming. Does anyone have a recommendation for a system that will crawl your pages and check both the internal and external links within the content for a status code (404, 200, etc.)? Preferably something server side so it can run on a schedule, but really anything would be fine.
I have tried tools like Screaming Frog and it just doesn't seem to be the right fit.
-
Try Screaming Frog again, Jonathan; it works great for these kinds of things and should be able to handle your use case.
-
Jonathan, I'm not sure why you're saying that Screaming Frog isn't the right tool; we use it with great success to check the internal links on our site. There are other tools you can use, such as Integrity (on a Mac) or Xenu, an older link checker that still works.
-
Have you tried http://www.link-assistant.com/website-auditor/? It checks for broken links and can be scheduled to run automatically. You can host it on your own server or on something like AWS. We ran it on a free AWS instance for quite a while before upgrading and never had issues; we only upgraded because we run quite a bit of software there, and the costs still aren't huge.
Hope this helps!
Matt
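If you'd rather roll your own, the core of what you're asking for (extract links from a page, check each one's status code, report anything that isn't a 200) is small enough to script and run from cron. Below is a minimal sketch, assuming Python 3 with only the standard library; the status lookup is passed in as a function, so in production you'd plug in a real HTTP HEAD request (e.g. via `urllib.request`), while the extraction and reporting logic stays testable offline. The function and variable names here are illustrative, not from any particular tool.

```python
# Minimal broken-link checker sketch (assumption: Python 3, stdlib only).
# extract_links pulls absolute http(s) URLs out of raw HTML href attributes;
# find_broken maps each link to its status code via an injectable get_status
# callable and keeps only the non-200 results.
import re
from typing import Callable, Dict, List

# Naive href matcher; a production crawler would use a real HTML parser.
HREF_RE = re.compile(r'href=["\'](https?://[^"\'#]+)["\']', re.IGNORECASE)

def extract_links(html: str) -> List[str]:
    """Return absolute http(s) URLs found in href attributes, in page order."""
    return HREF_RE.findall(html)

def find_broken(html: str, get_status: Callable[[str], int]) -> Dict[str, int]:
    """Map each linked URL to its status code, keeping only non-200 results."""
    broken = {}
    for url in extract_links(html):
        code = get_status(url)
        if code != 200:
            broken[url] = code
    return broken
```

Wired up to a real fetcher and a list of your content URLs, a script like this could run nightly from a crontab entry and email the `find_broken` output; that covers the "server side, on a schedule" requirement without a third-party crawler.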