Tool for scanning the content of the canonical tag
-
Hey All,
Question for you: what is your favorite tool or method for scanning a website for specific tags? Specifically (as my situation dictates now), for canonical tags?
I am looking for a tool that is flexible, hopefully free, and highly customizable (for instance, you can specify the tag to look for). I like the concept of using Google Docs with the ImportXML feature, but as you can only use 50 of those commands at a time, it is very limiting (http://www.distilled.co.uk/blog/seo/how-to-build-agile-seo-tools-using-google-docs/).
I do have a campaign set up using the tools, which is great, but I need something that returns a response faster and can get data from more than 10,000 links. Our CMS unfortunately puts out some odd canonical tags depending on how a page is rendered, and I am trying to catch them quickly before they get indexed and cause problems. Eventually I would also like to be able to scan for other specific tags, hence the customizability concern. If we have to write a VB script to get it into Excel, I suppose we can do that.
Cheers,
Josh
-
No idea on that one - it's still pretty new. The developers actually chimed in on the post, so you could ask them in the comments.
-
Thanks Dr. Pete and Marcus.
I just finished reading the post. I have looked at Screaming Frog before but was hoping to find a way to do it myself. I just didn't want to plop money down on something that seemed like it should be doable with tools I already had. But the software does look good. Any thoughts on whether they will come out with a one-time purchase instead of a yearly subscription?
Cheers!
Josh
-
Hey Dr. Pete, Joshua
I was just coming here to say that I had read the Dr. Pete post and this may do the job. It's a paid bit of software, but I will be picking it up later. I have my guys knocking up a canonical checker that will be free for all, but that may take a day or so to get perfect.
Let me know if you have a play with Screaming Frog!
Marcus
-
I'm pretty sure that Screaming Frog SEO Spider will do it, but you need the paid version to custom-filter on the canonical tag. I've got a post going up about it tomorrow.
-
Great, really appreciate it! Many thumbs up
-
Hey Josh,
Right, cool. I have got a few jobs to sort out but I am going to have a bash at knocking this up this afternoon. Should be easy enough (he said, damning himself to hours of problems).
Leave it with me for 24 hours.
Marcus
-
Hey Marcus,
Thanks for the quick response. That is exactly what I am looking for. I do have a list of URLs, and that is also simple enough to get from something like Xenu. Would love to work with you on this.
Thanks.
Josh
-
Hey, I am not aware of any such tool, but it should not be too hard to put one together, and it could be a useful little tool as well.
If you have all of your pages in a spreadsheet or database, it should be easy enough to write a little script that cycles through them.
Start Loop
- request page
- parse code to get canonical URL
- compare page to canonical
- output problem URLs
End Loop
That is slightly oversimplified and requires a list of all your URLs, but I would be willing to help put something like this together. It could be useful for all of us, especially for those (like me) who work with a lot of CMS sites.
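As a rough illustration only, the loop could be sketched in Python along these lines, using just the standard library. The names CanonicalParser, find_canonicals, fetch and scan are made up for this sketch, and it assumes the canonical tag is present in the raw HTML (not injected by JavaScript) and that pages decode as UTF-8.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class CanonicalParser(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel can hold several space-separated values, so split before matching.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html, page_url):
    """Return the canonical URLs declared in the HTML, made absolute."""
    parser = CanonicalParser()
    parser.feed(html)
    return [urljoin(page_url, href) for href in parser.canonicals]


def fetch(url):
    """Download a page as text (hypothetical helper; no retries or throttling)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def scan(urls, fetch_page=fetch):
    """Yield (url, problem) for each page whose canonical tag looks suspicious."""
    for url in urls:
        canonicals = find_canonicals(fetch_page(url), url)
        if not canonicals:
            yield url, "no canonical tag"
        elif len(canonicals) > 1:
            yield url, "multiple canonical tags: " + ", ".join(canonicals)
        elif canonicals[0] != url:
            yield url, "canonical points to " + canonicals[0]
```

With a list of URLs exported from a crawler, something like `for url, problem in scan(url_list): print(url, problem)` would print the pages to look at. A canonical that differs from the page URL is not always a mistake (parameterised pages often point elsewhere on purpose), so the output is a list to review, not a list of confirmed errors.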
Cheers
Marcus