Is this a good sitemap hierarchy for a big eCommerce site (50k+ pages)?
-
Hi guys, hope you're all good.
I am currently in the process of designing a new sitemap hierarchy to ensure that every page on the site gets indexed and is accessible via Google. It's important that our sitemap file is well structured, divided and organised into relevant sub-categories to improve indexing.
I just wanted to make sure that it's all good before forwarding it on to the development team for them to consider. At the moment the site has everything thrown into /sitemap.xml and it exceeds the limit of 50,000 URLs per sitemap file. Here is what I have come up with:
A primary sitemap.xml that references the other sitemap files. Each of the following areas will have its own sitemap, which is referenced by /sitemap.xml. As an example, sitemap.xml will contain 6 links, all of which point to other sitemaps:
- Product pages;
- Blog posts;
- Categories and sub categories;
- Forum posts, pages etc;
- TV specific pages (we have a TV show);
- Other pages.
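To make the idea concrete, here is a rough Python sketch of how an index file like that could be generated. The domain and the six child file names are placeholders made up for illustration, not anything the site uses yet.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical child sitemap names, one per area in the list above.
CHILD_SITEMAPS = [
    "sitemap-products.xml",
    "sitemap-blog.xml",
    "sitemap-categories.xml",
    "sitemap-forum.xml",
    "sitemap-tv.xml",
    "sitemap-other.xml",
]


def build_sitemap_index(base_url, children):
    """Return the XML text of a sitemap index pointing at each child sitemap."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for name in children:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url.rstrip('/')}/{name}"
    return ET.tostring(root, encoding="unicode")


# example.com stands in for the real domain.
index_xml = build_sitemap_index("https://www.example.com", CHILD_SITEMAPS)
print(index_xml)
```

The resulting file would be served as /sitemap.xml, with each child file listing the URLs for its own area.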
Is this format correct? Once it has been implemented I can then go ahead and submit all 6 separate sitemaps to webmaster tools + add a sitemap link to the footer of the site.
All comments are greatly appreciated - if you know of a site which has a good sitemap architecture, please send the link my way!
Brett
-
Have a read of what Google say about them here.
And yes, image search is huge. As for the way it's used, I can't comment on what everyone else does.
-Andy
-
Interesting, I haven't ever come across someone who said that I should put image URLs in a sitemap. Do users really search via Google Images though? If they do, aren't they just looking to copy an image and/or download it?
I can't see the site generating qualified leads through image based searches.
-
Duplicate content is when two or more URLs show the same content.
I was referring to the fact that sometimes categories, tags or subcategories show the same content. By that, I mean the same posts. Just to clarify, imagine that you have a category, Dogs, and a subcategory, Puppies, and the last 5 articles/posts are assigned to both the category and the subcategory.
When you visit the main page of each (category and subcategory), both will show the same content: the same 5 posts/articles. Did I make myself clear?
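If it helps, here is a small Python sketch of that situation. The URLs and post IDs are invented for the example; the idea is just to group listing pages by the set of posts they show and flag any group with more than one URL.

```python
from collections import defaultdict

# Hypothetical listing pages mapped to the IDs of the posts they display.
listings = {
    "/category/dogs/": [101, 102, 103, 104, 105],
    "/category/dogs/puppies/": [101, 102, 103, 104, 105],
    "/category/cats/": [201, 202, 203],
}


def find_duplicate_listings(pages):
    """Group URLs whose listings show exactly the same set of posts."""
    by_content = defaultdict(list)
    for url, post_ids in pages.items():
        by_content[frozenset(post_ids)].append(url)
    return [sorted(urls) for urls in by_content.values() if len(urls) > 1]


duplicates = find_duplicate_listings(listings)
print(duplicates)
```

Here the Dogs and Puppies pages end up in the same group, which is the kind of pair you may not want to submit twice.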
-
Thanks for getting back to me so quickly Gaston, I appreciate it.
You mentioned duplicate content - what do you mean? If the page has already been indexed, Google will skip/re-crawl the page. Not too sure what you mean by that?
Brett
-
Hi Brett,
Don't forget to add an images sitemap, as Google is pretty hot on those, and make sure you do some good image marketing as well.
But what you suggest is absolutely fine. From the main Sitemap, Google will find all of the others as well.
Just as a note, do make sure you flag which pages need more crawling by using the last modified date. This will help Google know which pages it should be recrawling more often.
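As a rough illustration of both points (staying under 50,000 URLs per file and including a last modified date), something like the Python sketch below could work. The product URLs and the date are made-up sample data, and the function names are just for this example.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file URL limit mentioned in the question


def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_urlset(pages):
    """Build one <urlset> document from (url, last_modified_date) pairs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, modified in pages:
        entry = ET.SubElement(root, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod is written as an ISO date, e.g. 2024-01-31
        ET.SubElement(entry, "lastmod").text = modified.isoformat()
    return ET.tostring(root, encoding="unicode")


# Made-up data: 120,000 product URLs, all modified on the same day.
pages = [(f"https://www.example.com/product/{n}", date(2024, 1, 31))
         for n in range(120_000)]
files = [build_urlset(part) for part in chunk(pages, MAX_URLS_PER_FILE)]
print(len(files))
```

Each string in `files` would be written out as its own sitemap file and listed in the main index.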
-Andy
-
Hi Brett,
Yep, the hierarchy is OK. Just a reminder: keep in mind that you should only submit for indexing the pages that are of interest to you and that don't generate duplicate content.
Then, just submit every sitemap to search console.
Hope it helps.
GR.
Related Questions
-
Archive pages structure using a unique hierarchical taxonomy, could be good for SEO?
Intermediate & Advanced SEO | danielecelsa
Hi, Preamble:
We are creating a website where people look for professionals for some home working. We want to create a homepage with a search bar where people write the profession/category (actually it is a custom taxonomy) that they need, like ‘plumbers’, and a dropdown/checkbox filter where they can choose the city where they need the plumber.
The results page is a list of plumber agencies in the chosen city. Each agency is a Custom Post Type for us. Furthermore, we are working hard to make our SEO ranking as high as possible.
So, for example, we know that it is important to have a well-done Archive Page for each Taxonomy term, besides a well-done Results Page.
Also, we know it is bad for SEO to have duplicated pages or (maybe) similar pages, ranking for the same (or maybe also similar) keywords.
Proposed Structure:
So, what we are thinking is to have this structure:
A unique hierarchical taxonomy that INCLUDES the City AND the profession! That means that our taxonomy ‘taxonomy_unique’ has terms like: ‘Rome’, ‘Paris’, ‘Dublin’ as father and also terms like ‘Plumbers’, ‘Gardeners’, ‘Electricians’ which are sons of some City father! So we will have the term 'Plumbers' son of 'Rome' and we will have also the term 'Plumbers' son of 'Paris'. Each of these two taxonomy terms (Rome/Plumbers and Paris/Plumbers) will have an archive page that we want to make ranking for the keywords ‘Plumbers in Rome’ and ‘Plumbers in Paris’ respectively. It is easier to think of it imagining the breadcrumbs. They will be:
Home > Rome > Plumbers
and
Home > Paris > Plumbers
Both will have:
- static content (important for SEO), where we describe the plumber profession with a focus on the city, like ‘Find the best Plumbers in Rome’ vs ‘Find the best Plumbers in Paris’;
- 'dynamic' content below it, which is a list of the Custom Post Types that have that taxonomy term associated.
Furthermore, 'Rome' and 'Paris' are also taxonomy terms that have their own archive page. On those pages, we are thinking of showing either the Custom Post Types (agencies) associated with that taxonomy term as a father, or maybe just a list of the 'sons' of that father (that is, links to the archive pages of the 'sons').
In both cases, there should also be static content, maybe talking about the city and the professionals it offers in general.
Questions:
So what we would like to understand is: Is it bad from an SEO perspective to have 2 URLs that look like this:
www.mysite.com/Rome/Plumbers
and
www.mysite.com/Naples/Plumbers
where the static content is really similar and it is something like that:
“Are you looking for the best plumbers in the city of Rome”
and
“Are you looking for the best plumbers in the city of Naples”? Also, there will be many more than two of these pages: one for each city.
We are doing that because we want the two different pages to rank high in two different cities, but we are not sure if Google likes that. On the other hand, each City will have one page for each kind of job, so:
www.mysite.com/Rome/Plumbers
www.mysite.com/Rome/Gardeners
www.mysite.com/Rome/Electricians
So the same question: does Google like this or not?
About the 'Rome' and 'Paris' archive pages, does Google prefer a list of Custom Post Types that have that father term associated as taxonomy, or a list of the 'sons' archive pages, with links to those pages?
What do you think about this approach? Do you think this structure could be good from an SEO perspective, or might there be a better alternative? Hoping everything is clear, we really appreciate anyone dedicating their time and leaving feedback.
Daniele
-
E-Commerce Site Collection Pages Not Being Indexed
Intermediate & Advanced SEO | Ben-R
Hello Everyone, So this is not really my strong suit but I’m going to do my best to explain the full scope of the issue and really hope someone has some insight. We have an e-commerce client (can't really share the domain) that uses Shopify; they have a large number of products categorized by Collections. The issue is that when we do a site: search of our Collection Pages (site:Domain.com/Collections/) they don’t seem to be indexed. Also, not sure if it’s relevant, but we recently did an overhaul of our design. Because we haven’t been able to identify the issue, here’s everything we know/have done so far:
- Ran a Moz Crawl Check, and the Collection Pages came up;
- Checked Organic Landing Page Analytics (source/medium: Google), and the pages are getting traffic;
- Submitted the pages to Google Search Console;
- The URLs are listed in the sitemap.xml, but when we tried to submit the Collections sitemap.xml to Google Search Console, 99 were submitted but nothing came back as being indexed (like our other pages and products);
- Tested the URL in GSC’s robots.txt tester, and it came up as being “allowed”.
Just in case, below is the language used in our robots.txt:
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkout
Disallow: /9545580/checkouts
Disallow: /carts
Disallow: /account
Disallow: /collections/+
Disallow: /collections/%2B
Disallow: /collections/%2b
Disallow: /blogs/+
Disallow: /blogs/%2B
Disallow: /blogs/%2b
Disallow: /design_theme_id
Disallow: /preview_theme_id
Disallow: /preview_script_id
Disallow: /apple-app-site-association
Sitemap: https://domain.com/sitemap.xml
A Google cache: search currently shows a collections/all page we have up that lists all of our products. Please let us know if there are any other details we could provide that might help. Any insight or suggestions would be very much appreciated. Looking forward to hearing all of your thoughts! Thank you in advance. Best,
-
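One way to double-check rules like these outside of Search Console is Python's standard-library robots.txt parser. This is only a sketch: it uses a subset of the rules quoted above, the collection paths tested are made up, and Python's parser does plain prefix matching, so it may not agree with Googlebot on every pattern.

```python
from urllib.robotparser import RobotFileParser

# A subset of the rules quoted above; domain.com is the placeholder from the post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /checkout
Disallow: /account
Disallow: /collections/+
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path):
    """Report whether a generic crawler may fetch the given path."""
    return parser.can_fetch("*", "https://domain.com" + path)


# Hypothetical paths: two collection pages and two blocked areas.
for path in ["/collections/all", "/collections/shoes", "/cart", "/admin"]:
    print(path, is_allowed(path))
```

If ordinary collection URLs come back as allowed here too, robots.txt is probably not what is keeping them out of the index.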
How to get an image on a third-party site to rank high on Google Images?
Hello, articles are sometimes written about my product online. Is there anything else I can do besides giving the image a good file name? Perhaps I can ask the site owner to modify something in the article to make it rank higher? Also, on some small websites I can see images ranking very high for a specific search term that is difficult to rank for in images. If I were to contact such a site with a sponsored post request, what should I make sure the site adds to that sponsored post besides the file name? I think there are also some other methods, such as Reddit, to make images rank high on a third-party page; I just need to find out how. Thanks a lot
Intermediate & Advanced SEO | bidilover
-
Date of page first indexed or age of a page?
Hi, does anyone know any ways or tools to find out when a page was first indexed/cached by Google? I remember a while back, around 2009, I had a Firefox plugin which could check this and gave you an exact date. Maybe this has changed since. I don't remember the plugin. Or any recommendations on finding the age of a page (not domain) for a website? This is for competitor research, not my own website. Cheers, Paul
Intermediate & Advanced SEO | MBASydney
-
How do I know which pages of my site are not indexed by Google?
Hi, in my Google Webmaster Tools, under Crawl -> Sitemaps, it shows 1117 pages submitted but only 619 indexed. Is there any way I can find which pages are not indexed and why? It has been like this for a while. I also have a manual action (partial) message: "Unnatural links to your site--impacts links", and under "affects" it says "Some incoming links". Is that the reason Google does not index some of my pages? Thank you, Sina
Intermediate & Advanced SEO | SinaKashani
-
Are prices shown in search results good for e-commerce sites?
Hello here. I own an e-commerce website (virtualsheetmusic.com), and because we have implemented structured data for our product pages, our search results on Google now appear with pricing information, whereas most of our competitors don't have that information displayed (yet). I am wondering: do you think that is good? What side effects could that cause? Lower CTR? Lower bounce rate? Less traffic? Any thoughts on this issue are very welcome. Thanks!
Intermediate & Advanced SEO | fablau
-
Consolidating MANY separate domains into a much better, single URL: Should I point a landing page or redirect to the new site?
I am consolidating a site for a client who previously, and very foolishly, broke up their domains like so: companyparis.com companyflorence.com companyrome.com etc... I am now done with the new site, which will be at: company.eu with pages as appropriate: company.eu/paris company.eu/florence company.eu/rome This domain, although not entirely new, does not have much authority or rank. In terms of SEO and link-building, is it better to redirect the old domain to the specific page on the new domain: companyparis.com --> company.eu/paris or... is it better to put a landing page at the old domain LINKING to the page on the new domain: companyparis.com --> landing page linking to --> company.eu/paris
Intermediate & Advanced SEO | thongly
-
How do Google Site Search pages rank
We have started using Google Site Search (via an XML feed from Google) to power our search engines. So we have a whole load of pages we could link to of the format /search?q=keyword, and we are considering doing away with our more traditional category listing pages (e.g. /biology - not powered by GSS) which account for much of our current natural search landing pages. My question is would the GoogleBot treat these search pages any differently? My fear is it would somehow see them as duplicate search results and downgrade their links. However, since we are coding the XML from GSS into our own HTML format, it may not even be able to tell.
Intermediate & Advanced SEO | EdwardUpton61