Automated checking for broken links within content pieces
-
Hi, I'm hoping someone can point me in the right direction with a system suggestion.
The number of content pieces on our website has grown, and manually checking whether the links inside them still return a 200 status is becoming extremely time-consuming. Does anyone have a recommendation for a system that will crawl your pages and check both the internal and external links within the content for a status code (404, 200, etc.)? Preferably something server-side so it can run on a schedule, but really anything would be fine.
I have tried things like Screaming Frog, and it just doesn't seem to be the right tool.
-
Try Screaming Frog again, Jonathan; it works great for this kind of thing and should be able to handle your use case as well.
-
Jonathan, I'm not sure why you're saying that Screaming Frog isn't the right tool; we use it with great success to check the internal links on our site. There are other tools you can use, such as Integrity (on a Mac) or Xenu, an older link checker that still works.
-
Have you tried http://www.link-assistant.com/website-auditor/? It checks for broken links and can be scheduled to run automatically. You can run it on your own server or on something like AWS. We ran it on a free AWS instance for quite a while before upgrading and never had issues; we only upgraded because we run quite a bit of software on there, and the costs still aren't huge.
Hope this helps!
Matt
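If none of the off-the-shelf tools fit, a roll-your-own checker is fairly small. Below is a minimal Python sketch (standard library only) that extracts the links from a page and reports any that don't come back as 200; the start URL, timeout, and user-agent string are placeholder assumptions, and scheduling could be handled by cron on the server.

```python
import urllib.request
import urllib.error
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags, skipping fragments like #top."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value and value.startswith(("http", "/")):
                    self.links.append(value)


def check_url(url, timeout=10):
    """Return the HTTP status code for url, or 0 on a network error."""
    req = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "link-checker/0.1"}
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code  # e.g. 404
    except (urllib.error.URLError, OSError):
        return 0


def broken_links(page_url):
    """Yield (absolute_link, status) for every non-200 link on page_url."""
    with urllib.request.urlopen(page_url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    parser = LinkExtractor()
    parser.feed(html)
    for link in parser.links:
        absolute = urljoin(page_url, link)  # resolve internal links
        status = check_url(absolute)
        if status != 200:
            yield absolute, status


if __name__ == "__main__":
    # Placeholder page; in practice, loop over your content pieces.
    for link, status in broken_links("https://www.example.com/"):
        print(f"{status}\t{link}")
```

Because `urlopen` follows redirects, a link that 301s to a live page still counts as 200 here; tighten that behavior if redirect chains matter to you.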