Tool for scanning the content of the canonical tag
-
Hey All,
question for you. What is your favorite tool/method for scanning a website for specific tags? Specifically (as my situation dictates now) for canonical tags?
I am looking for a tool that is flexible, hopefully free, and highly customizable (for instance, you can specify the tag to look for). I like the concept of using google docs with the import xml feature but as you can only use 50 of those commands at a time it is very limiting (http://www.distilled.co.uk/blog/seo/how-to-build-agile-seo-tools-using-google-docs/).
I do have a campaign set up using the tools which is great! but I need something that returns a response faster and can get data from more than 10,000 links. Our cms unfortunately puts out some odd canonical tags depending on how a page is rendered and I am trying to catch them quickly before it gets indexed and causes problems. Eventually I would also like to be able to scan for other specific tags, hence the customizable concern. If we have to write a vb script to get it into excel I suppose we can do that.
Cheers,
Josh
-
No idea on that one - it's still pretty new. The developers actually chimed in on the post, so you could ask them in the comments.
-
Thanks Dr. Pete and Marcus.
I just finished reading the post. I have looked at Screaming Frog before but was hoping to be able to find a way to do it myself. Just didn't want to plop money down on something that seemed like it should be able to be done using tools I already had. But the software does look good. Any thought on if they will come out with a one time purchase instead of a yearly subscription?
Cheers!
Josh
-
Hey Dr. Pete, Joshua
I was just coming here to say that I had read the Dr. Pete post and this may do the job. It's a paid bit of a software but I will be picking it up later. I have my guys knocking up a canonical checker that will be free for all but that may take a day or so to get perfect.
Let me know if you have a play with Screaming Frog!
Marcus
-
I'm pretty sure that Screaming Frog SEO Spider will do it, but you need the paid version to custom-filter on the canonical tag. I've got a post going up about it tomorrow.
-
Great, really appreciate it! Many thumbs up
-
Hey Josh,
Right, cool. I have got a few jobs to sort out but I am going to have a bash at knocking this up this afternoon. Should be easy enough (he said, damning himself to hours of problems).
Leave it with me for 24 hours.
Marcus
-
Hey Marcus,
thanks for the quick response. That is exactly what I would be looking for. I do have a list of url's and that is also simple enough to get from something like xenu. Would love to work with you on this.
Thanks.
Josh
-
Hey, I am not aware of any such tool, but it should not be too hard to put one together, maybe a useful little tool as well.
If you have all of your pages in spreadsheet or database, it should be easy enough to write a little script that cycles through them.
Start Loop
-
request page
-
parse code to get canonical URL
-
compare page to canonical
-
output problem URLs
End Loop
Slightly over simplified and requires a list of all your URLs but would be willing to help put something like this together, could be useful for all of us, especially for those (like me) that work with a lot of CMS sites.
Cheers
Marcus
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate Content in WordPress Taxonomies & Noindex, Follow
Hello Moz Community, We are seeing duplicate content issues in our Moz report for our WordPress site’s Tag pages. After a bit of research, it appears one of the best solutions is to set Tag pages to “no index, follow” within Yoast. That makes sense, but we have a few questions: In doing this, how are we affecting our opportunity to show up in search results? Are there any other repercussions to making this change? What would it take to make the content on these pages be seen as unique?
Moz Pro | | CoreyHicks1 -
Links being reported in Webmaster Tools
Hi Are the Total Links To Your Site, as reported in GWT, purely external inbound links ? Since these links are usually, as far as i can tell, much higher in number than any other link reporting tool and hence, i presume, more accurate, why don't services such as Moz etc include this in reporting ? I know its just a total number and link quality is whats important not quantity, but i would have thought interesting to show in reporting in conjunction with link quality info such as is already reported. Since most backlink reporting tools do show a total but always much much lower than that reported in gwt (i think) All Best Dan
Moz Pro | | Dan-Lawrence0 -
Where is the link prospecting tool?
I feel like last year when Moz was still SEOMoz I used a beta tool that would help you build a bunch of queries for link prospecting. You would tell it what your sites about and a few other pieces of information, and it would give you a list of google queries to get you started (ie inurl:directory <industryname> +"add site"</industryname>). However, I'm having a hard time finding this tool, or any reference to it on the web. It's also possible that it wasn't a moz tool but I'm misremembering where I found it. Does anybody have any idea what I'm talking about?
Moz Pro | | brettgus1 -
How does SEOmoz pull its duplicate page title and content information?
I ask because I am getting errors based on URLs that do not even exist on our site. For example: http://www.robots.com/applications/abb/panasonic/robots this URL does not even exist for our site, but somehow it is listed in the error section of page title duplication tool. http://www.robots.com/applications/ exists, but there is no place to get to an ABB or a Panasonic robot from this page, not to mention an ABB/Panasonic (which for sure does not exist). ?? We have quite a few of these out there and just wondering how to find out where the link is coming from. When we checked our URLs through Integrity, links like the one listed above (which we had 29 of them listed) that do not show up. Thoughts? Thanks! Janelle
Moz Pro | | jwanner0 -
Keyword rankings tool not fully populated today?
I logged in today and my rankings are normally crawled on Thurs, and of my 66 keywords only 6 of them are showing any data - the rest are all blank. It's been going for weeks so it's not like it doesn't have it's initial data yet. I presume this is some issue or artifact that the results haven't been input yet. Can someone help answer this?
Moz Pro | | mlm12
Thanks,
Michelle0 -
Upper and lower case spelling = dupe content?
Hi All, I've looking at my Crawl Diagnostics Summary and working on getting my site errors down as low as possible. One thing I'm noticing is that in the "Other URLs" column I'm seeing a lot of 1s. When I click on the number, it is showing me the exact URL with an upper case category title. For example, it appears like it's telling me that these two URLs are considered duplicate content: http://mysite.com/Category http://mysite.com/category Is that right? Does google care about upper and lower case spelling?
Moz Pro | | shawn810 -
Broken Links and Duplicate Content Errors?
Hello everybody, I’m new to SEOmoz and I have a few quick questions regarding my error reports: In the past, I have used IIS as a tool to uncover broken links and it has revealed a large amount of varying types of "broken links" on our sites. For example, some of them were links on my site that went to external sites that were no longer available, others were missing images in my CSS and JS files. According to my campaign in SEOmoz, however, my site has zero broken links (4XX). Can anyone tell me why the IIS errors don’t show up in my SEOmoz report, and which of these two reports I should really be concerned about (for SEO purposes)? 2. Also in the "errors" section, I have many duplicate page titles and duplicate page content errors. Many of these "duplicate" content reports are actually showing the same page more than once. For example, the report says that "http://www.cylc.org/" has the same content as "http://www.cylc.org/index.cfm" and that, of course, is because they are the same page. What is the best practice for handling these duplicate errors--can anyone recommend an easy fix for this?
Moz Pro | | EnvisionEMI0 -
Clearing the Tool Reports You've Run
Is there anyway of clearing the 'Tool Reports You've Run' list on the Pro Dashboard, or setting certain reports not to be saved here?
Moz Pro | | BigMiniMan0