Automated checking for broken links within content pieces
-
Hi, I am wondering if anyone can point me in the right direction with a tool recommendation.
We have grown the number of content pieces on our website, and manually checking whether the links in them still return a 200 status is becoming extremely time-consuming. Does anyone have a recommendation for a system that will crawl your pages and check both the internal and external links within the content for a status code (404, 200, etc.)? Preferably something server-side so it can just run on a schedule, but really anything would be fine.
I have tried things like Screaming Frog, etc., and it just doesn't seem to be the right tool.
-
Try Screaming Frog again, Jonathan. It works great for this kind of thing and should be able to handle your use case.
-
Jonathan, I'm not sure why you're saying that Screaming Frog isn't the right tool--we use it with great success to check the internal links on the site. There are other tools that you can use, such as Integrity (on a Mac), or Xenu, which is an older link checker but still works.
-
Have you tried http://www.link-assistant.com/website-auditor/ ? It checks for broken links and can be scheduled to run automatically. You can run it on your own server or on something like AWS. We ran it on a free AWS instance for quite a while before upgrading and never had issues. We only upgraded because we run quite a bit of software on there, and even so the costs aren't huge.
Hope this helps!
Matt
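
If you would rather script the check yourself than rely on a desktop crawler, something along these lines may be a starting point. This is only a minimal sketch in Python using the standard library; the names `LinkCollector`, `extract_links`, `is_internal`, `fetch_status` and `check_page` are made up for illustration and do not come from any of the tools mentioned above. It assumes the pages are plain HTML reachable over HTTP(S), and it reports the status after redirects have been followed.

```python
import sys
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import Request, urlopen

# Hypothetical User-Agent string; some servers reject requests without one.
HEADERS = {"User-Agent": "link-checker-sketch"}


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop any #fragment part.
        absolute = urldefrag(urljoin(self.base_url, href)).url
        # Skip mailto:, tel:, javascript: and other non-HTTP schemes.
        if urlparse(absolute).scheme in ("http", "https"):
            self.links.append(absolute)


def extract_links(html, base_url):
    """Return the unique HTTP(S) links in the HTML, in first-seen order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return list(dict.fromkeys(parser.links))


def is_internal(link, base_url):
    """A link counts as internal when it shares the page's host name."""
    return urlparse(link).netloc == urlparse(base_url).netloc


def fetch_status(url, timeout=10):
    """Return the HTTP status code, or None if no response was received."""
    try:
        with urlopen(Request(url, headers=HEADERS), timeout=timeout) as response:
            return response.status  # status after redirects were followed
    except HTTPError as error:
        return error.code  # 404, 410, 500, ...
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def check_page(page_url):
    """Fetch one page and return (link, internal?, status) for each link."""
    with urlopen(Request(page_url, headers=HEADERS), timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
    return [
        (link, is_internal(link, page_url), fetch_status(link))
        for link in extract_links(html, page_url)
    ]


if __name__ == "__main__":
    # Page URLs are passed on the command line; only problem links are printed.
    for page in sys.argv[1:]:
        for link, internal, status in check_page(page):
            if status != 200:
                kind = "internal" if internal else "external"
                print(f"{page}\t{kind}\t{status}\t{link}")
```

Run it with one or more page URLs as arguments. Because it prints only links that did not return 200, it could be started from cron or a similar scheduler on a server and the output mailed or logged. For a large site you would probably want to add rate limiting and feed it the URLs from your sitemap.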
Related Questions
-
Use External Links
Hey 🙂 I noticed when analysing my pages that Moz gives the following advice about adding external links to my articles: "On any page specifically targeting a keyword, link externally to at least one (and possibly more than one) relevant, trusted resources as a best practice." As a small business I work pretty damn hard to get visitors to my website, so why on earth would I want to go to all that trouble just to send them away again to a trusted resource? Secondly, what exactly is a "trusted resource"? Can I simply use search and pick the top competitor, for example Moz or Wikipedia, and does the anchor need to be an exact match or will a partial match suffice? I ask because I already have the top spot for my long-tail keyword, so an exact match would be pointless. And lastly, I notice that pretty much all quality sites have external links open in the same window, i.e. not target=_blank. I never thought of it before today, but now that I'm considering using external links in my articles I guess it's important to know the answer: is this a best practice, and does it give any SEO benefit? Cheers, Lee :)
On-Page Optimization | | LeeC0 -
Duplicate content penalty
When Moz crawls my site, it says I have 2x the pages that I really have and that I am being penalized for duplicate content. I know that years ago I had my old domain resolve over to my new domain. It's the only thing that makes sense as the cause of the duplicate content, but would search engines really penalize me for that? It is technically only on one site. My business took a significant sales hit starting in early July 2013, and I know Google did an algorithm update that had SEO aspects. I need to resolve the problem so I can stay in business.
On-Page Optimization | | cheaptubes0 -
Duplicate content on events site
I have an event website, and for every day an event occurs, the event has a page. For example, the Oktoberfest in Germany takes 16 days, so my site would have 16 (almost) identical pages about the Oktoberfest (same text, address, photos, contact info). The only difference between the pages is the date mentioned on the page. I use rich snippets. How does Google treat my pages, and what is the best practice?
On-Page Optimization | | dragonflo0 -
How to solve duplicate content issue???
I have 5 websites with different domain names; every website has the same content, the same pages, and the same design. Kindly let me know how to solve this issue.
On-Page Optimization | | ross254sidney0 -
Pagination on related content within a subject
A client has come to us with new content and sections for their site. The two main sections are "Widget Services" (the sales pages) and "Widget Guide" (a non-commercial guide to using the widgets etc.). Both the Services and Guide sections contain the same pages (red widgets, blue widgets, triangle widgets) and, here's the problem, the same first paragraph, i.e.
========
Blue widget services
Blue widgets were invented in 1906 by Professor Blue. It was only a coincidence that they were blue.
We stock a full range of blue widgets, we were voted best blue widget handler at widgetcon 2013. Buy one now
See our guide to blue widgets here
Guide to blue widgets
Blue widgets were invented in 1906 by Professor Blue. It was only a coincidence that they were blue.
The thing about blue widgets is they're not at all like red widgets. For starters, they're blue.
Find more information about our blue widgets here
========
In all of these pages, the first paragraph is ~200 words and provides a great introduction to the subject, and the rest of the page is 600-800 words, making these pages unique enough to justify being different pages. We want to deal with this by declaring each page as a paginated version of a two-page article on each type of widget (using rel=prev/next). Our thinking is that Google probably handles introductions/headers on paginated content in a sensible way. Has anyone experienced this before? Are there any issues with using rel="prev" and rel="next" when the pages are not strictly paginated?
On-Page Optimization | | BabelPR0 -
Help: my WordPress Blog generates too many onpage links and duplicate content
I have had a WordPress blog since November last year (so I'm pretty new to WordPress), and the effects on ranking for some keywords are really good. So I thought tag clouds were good. Crawl Diagnostics now tells me that I have too many on-page links; for example, my author page breaks the record with 256: http://inlinear.com/blog/author/inlinear/ I think that's because there are links for each word in the generated tag cloud. On this page (and many other pages) WordPress displays teasers, the beginning of each post (read more ...), producing duplicate content and even new canonical tags. The page titles are also too long because I installed "All in One SEO Pack", and now this plugin and WordPress itself mix the titles together. What can I do to avoid all this? Is there a plugin that can help? I think millions of blogs will have the same problems, and my blog still has very little content. Thanks for your answers :))
On-Page Optimization | | inlinear0 -
Issue: Duplicate Page Content
For duplicate page content, how different should pages be? For example, I have seven locations and on each location page, we offer a discount. The discounts are the same currently and open into a pop-up window. So it looks something like this: mysite.com/locationA/dicount mysite.com/locationB/discount mysite.com/locationX/discount The pages are identical. Should I change the verbiage on each page or let it be? I noticed that our organic search rankings have dropped since our site upgrade and this is one item that SEOMOZ has noted. Thanks! DHO
On-Page Optimization | | DougHoltOnline0 -
Does putting content in tabs devalue it at all?
Hello! Still very new to the SEO world and just trying to soak in as much information as I can. The site I work for took a substantial hit with the Panda update, so we are looking into adding as much quality content as we can in the upcoming months. With our current site layout, space will quickly become an issue. Assuming the content is relevant and useful for the page, will putting the content into tabs be counterproductive or devalue it at all?
On-Page Optimization | | davegtt0