Sitemap generator which only includes canonical URLs
-
Does anyone know of a third-party sitemap generator that will include only canonical URLs? Generating a sitemap that also contains geo- and sorting-based parameter URLs is far from ideal. Please let me know if anyone has any ideas. Mind you, we have hundreds of thousands of indexed URLs, so this can't be done with a simple text editor.
-
You can use Screaming Frog for this (and much more). It's not free, but it's also a great tool for checking the SEO health of your site.
The trial is free (up to 500 URLs).
To generate the sitemap, crawl your site with the following settings:
Configuration > Spider > Advanced tab, select:
- always follow redirects
- respect noindex
- respect canonical
After the crawl, you can create the XML and image sitemaps under Sitemaps.
Dirk
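For anyone who would rather script this than use a crawler, the filtering logic behind the "respect noindex" and "respect canonical" settings can be sketched in a few lines of Python. This is a minimal illustration, not how Screaming Frog works internally: the names `HeadParser`, `is_canonical`, and `build_sitemap` are hypothetical, the `pages` dict (URL mapped to its HTML) stands in for whatever your crawl produces, and pages without a canonical tag are assumed to be canonical. The URL comparison is a plain string match, so trailing slashes and letter case are not normalized.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from xml.etree import ElementTree as ET


class HeadParser(HTMLParser):
    """Collects the rel="canonical" href and any robots noindex from a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            if "noindex" in (a.get("content") or "").lower():
                self.noindex = True


def is_canonical(url, html):
    """True if the page is indexable and its canonical points to itself."""
    parser = HeadParser()
    parser.feed(html)
    if parser.noindex:
        return False
    if parser.canonical is None:
        # Assumption: a page with no canonical tag counts as canonical.
        return True
    # Resolve relative hrefs against the page URL before comparing.
    return urljoin(url, parser.canonical) == url


def build_sitemap(pages):
    """pages: dict of URL -> HTML. Returns sitemap XML as a string."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for url in sorted(pages):
        if is_canonical(url, pages[url]):
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")
```

Keep in mind that the sitemap protocol caps each file at 50,000 URLs, so a site with hundreds of thousands of URLs would need to split the output across several files and list them in a sitemap index.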
Related Questions
-
Editing A Sitemap
Would there be any positive effect from editing a site map down to a more curated list of pages that perform, or that we hope they begin to perform, in organic search? A site I work with has a sitemap with about 20,000 pages that is automatically created out of a Drupal plugin. Of those pages, only about 10% really produce out of search. There are old sections of the site that are thin, obsolete, discontinued and/or noindexed that are still on the sitemap. For instance, would it focus Google's crawl budget more efficiently or have some other effect? Your thoughts? Thanks! Best... Darcy
Intermediate & Advanced SEO | | 945010 -
Intra-linking to pages with a different Canonical url ?
Hello Moz Community! I'm hoping to get some advice around intra-linking practices and the benefits when a page that is being linked to has a canonical tag that differs from its own URL. Confused? Allow me to elaborate. Scenario: Background: Ecommerce Company is trying to increase its organic ranking for key, broad terms in the cycling industry. Ecommerce company is trying to rank its category pages for a main term. To help this, the company is focusing on increasing the quality of its intra-linking structure (the links and anchor texts that link to another page within the site). Example goal: to have its Road Cassettes category page rank for 'Road Cassettes'. Company's 'cassettes' main category page is here: /Components/Drivetrain/Cassettes/ And the company uses filtered navigation logic to drill down into 'road cassettes' specifically: /Components/Drivetrain/Cassettes/?page_no=1&fq=ATR_RoadBiking:True SEOs are instructed to include occasional links back to this page, with SEO-friendly anchor text, to help strengthen its authority for the main term. The Issue / Question: Main category URL: /Components/Drivetrain/Cassettes/ Road Cassettes category URL: /Components/Drivetrain/Cassettes/?page_no=1&fq=ATR_RoadBiking:True Road Cassettes Canonical URL: /Components/Drivetrain/Cassettes/ The canonical URL of the filtered Road Cassettes category is its main category URL. Will Company be able to effectively rank its Road Cassettes category URL for 'Road Cassettes' if the canonical URL is the main category? Should the canonical URL not be the main category? OR Will increasing the intra-linking to the Road Cassettes URL help the main category URL rank for 'Road Cassettes' by passing all its authority?
Intermediate & Advanced SEO | | Ray-pp0 -
Canonical Issue with urls
The crawl reports show duplicate page content and duplicate page title issues for some URLs on my site, so I have set a canonical URL for every URL that has duplicate content or a duplicate page title. But the SeoMoz crawl test is still showing the issue. Here is one URL with the issue, along with the URLs it is reported as duplicating.
Checked URL: http://www.cyrusrugs.com/bridge-traditional-area-rug-item-7635
Duplicate page content:
- http://www.cyrusrugs.com/bridge-traditional-area-rug-item-7622&category_id=270&colors=Black_Tones&click=colors&ci=1
- http://www.cyrusrugs.com/bridge-traditional-area-rug-item-7622
Duplicate page title:
- http://www.cyrusrugs.com/bridge-traditional-area-rug-item-7636&category_id=270&sizes=12x15,12x18&click=sizes
- http://www.cyrusrugs.com/bridge-traditional-area-rug-item-7636
- http://www.cyrusrugs.com/bridge-traditional-area-rug-item-7622&category_id=270&colors=Black_Tones&click=colors&ci=1
- http://www.cyrusrugs.com/bridge-traditional-area-rug-item-7622
But I have already set the canonical URL for all of these URLs, namely http://www.cyrusrugs.com/bridge-traditional-area-rug-item-7622. This should actually solve the problem, right? Search engines should identify the canonical URL as the original URL and consider only that one. Thanks
Intermediate & Advanced SEO | | trixmediainc0 -
Video Sitemap Creation Question
I have created a sitemap file as per the Google Webmaster Tools instructions. I have it saved as a .txt file. Am I right in thinking that this needs to be uploaded as an .xml file? If so, how do I convert it to XML? I have tried, but the file seems to get corrupted. There must be a simple way to do this?!
Intermediate & Advanced SEO | | DHS_SH0 -
Need Perfect URLs
I'm redesigning a site's structure from the ground up, and am having issues with the URLs. I'd love to have them be perfect, but keep finding conflicting advice online.
1. For my services blog, is it best to have it set up like www.example.com/services/keyword or www.example.com/keyword? There seems to be conflicting advice: keep it short and keep the keyword as far to the left as possible, but also that including the word "services" would help with long-tail phrases and site organization.
2. For my blog section, is it best to have it set up like www.example.com/blog/keyword, www.example.com/keyword, or www.example.com/blog-post-title-with-keyword-in-it? It's similar to the first question, but also adds the question of including the entire post title in the URL or just the keyword.
Your help would be greatly appreciated!
Intermediate & Advanced SEO | | Stryde1 -
Video XML Sitemap
I've recently been informed by our dev team that we are not legally allowed to make our raw video files available in a video XML sitemap, yet this is one of the required tags. Has anyone run into a similar situation and figured out a way around it? Any ideas would be greatly appreciated. Thanks! Margarita
Intermediate & Advanced SEO | | MargaritaS0 -
URL blocked
Hi there, I have recently noticed that we have a link from an authoritative website. However, when I looked at the code, it looked like this: <a href="http://www.mydomain.com/" title="blocked::http://www.mydomain.com/">keyword</a> You will notice that the code contains 'blocked::'. What is this? Does it have the same effect as a nofollow tag? Thanks for any help
Intermediate & Advanced SEO | | Paul780 -
Rel canonical element for different URLs
Hello, We have a new client that has several sites with the exact same content. They do this for tracking purposes. We are facing political objections to combining the sites and tracking differently. Basically, we have no choice but to deal with the situation as given. We want to avoid duplicate content issues, and want to SEO only one of the sites. The other sites don't really matter for SEO (they have off-line campaigns pointing to them); we just want one of the sites to get all the credit for the content. My questions: 1. Can we use the rel canonical element on the irrelevant pages/URLs to point to the site we care about? I think I remember Matt Cutts saying this can't be done across URLs. Am I right or wrong? 2. If we can't, what options do I have (without making the client change their entire tracking strategy) to make the site we are SEO'ing the relevant content? Thanks a million! Todd
Intermediate & Advanced SEO | | GravitateOnline0