Scanning For Duplicate Canonical Tags
-
I'm looking for a solution for identifying pages on a site that have either empty/undefined canonical tags, or duplicate canonical tags (meaning the tag occurs twice within the same page).
I've used Screaming Frog to view sitewide canonical values, but the tool cannot identify when pages use the tag twice, nor can it differentiate between pages that have an empty canonical tag and pages that have no canonical tag at all.
Any help finding a tool of some sort that can assist me in doing this would be much appreciated, as I'm working with tens of thousands of pages and can't do this manually.
-
Paul,
Thanks for your reply! I have used the paid version of Screaming Frog with regex to exclude pages with certain parameters, but I have not tried the custom queries.
Could you give me an example of a custom query that would find empty canonical tags? That would be extremely helpful.
-
I think Screaming Frog is still the solution you want, John, but it's not configured to do what you need "out of the box". You're going to need to write a custom query for Screaming Frog to run while it's crawling your site.
This capability is only available in the paid version of the tool, but you'll need the paid version anyway to crawl a site with tens of thousands of pages, as the free tool cuts out at 500 URLs.
You'll find the Custom settings link under the Configuration tab in the top navigation bar of the tool. Essentially what you're doing is writing custom filters.
You'll need to write a regex (regular expression) that is capable of finding pages with no canonical tag at all, and another which is capable of finding empty canonical tags. If your regex-fu is really strong, you may be able to write a single expression to capture both these states.
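To make those states concrete, here is a rough Python sketch of the kind of patterns involved. It is a standalone script that runs on HTML you have already downloaded, not something that plugs into Screaming Frog, and the names `canonical_hrefs` and `classify` are made up for illustration. It also treats a canonical tag with no `href` attribute the same as one with an empty `href`, which is an assumption about what "empty" should mean here.

```python
import re

# Any <link ...> tag, regardless of attribute order.
LINK_TAG = re.compile(r"<link\b[^>]*>", re.IGNORECASE)
# rel=canonical, quoted or unquoted (does not handle multi-value rel attributes).
REL_CANONICAL = re.compile(r"""(?<![\w-])rel\s*=\s*["']?canonical\b""", re.IGNORECASE)
# href value in double quotes, single quotes, or unquoted.
HREF = re.compile(
    r"""(?<![\w-])href\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))""",
    re.IGNORECASE,
)


def canonical_hrefs(html):
    """Return the href value of every canonical link tag found in the HTML."""
    hrefs = []
    for tag in LINK_TAG.findall(html):
        if not REL_CANONICAL.search(tag):
            continue
        match = HREF.search(tag)
        # A missing href attribute is recorded as an empty string.
        value = ""
        if match:
            value = next(g for g in match.groups() if g is not None)
        hrefs.append(value.strip())
    return hrefs


def classify(html):
    """Label a page as 'missing', 'empty', 'duplicate', or 'ok'."""
    hrefs = canonical_hrefs(html)
    if not hrefs:
        return "missing"
    if len(hrefs) > 1:
        return "duplicate"
    return "empty" if hrefs[0] == "" else "ok"


if __name__ == "__main__":
    page = '<head><link rel="canonical" href=""></head>'
    print(classify(page))
```

Because this is plain regex matching rather than real HTML parsing, it will also count canonical tags that sit inside HTML comments, so treat the results as a list of pages to check rather than a final verdict.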
Had you already tried the custom queries with Screaming Frog?
Paul