Should a sitemap include all site links or just the ones we want indexed?
-
Got a quick sitemap question. We have a client's site built in OpenCart and are getting ready to submit the sitemap. The default sitemap setting generates URLs right off the root, for example site.com/product. These URLs are also accessible through the site itself. We prefer to give the site some depth and have structured the products so the URLs are site.com/category/product. All of the product pages have canonicals that include the category, so we should not have to worry about duplicate content between the /product page and the /category/product page. My question: both types of product URLs are included in the sitemap at the moment. Since we don't want Google to index the /product URLs, should we leave them off the sitemap even though they are readily accessible from the frontend (though not linked)? Or should we just leave them in and let the canonical tag direct Google as to which URLs to index? Thanks in advance.
-
Hi again JS,
I think it's great that you continue to evaluate your platform from all perspectives and weigh its strengths and weaknesses. Many times, a platform can do a lot of the basics well but fall short on the details that differentiate us from our competition. For example, OpenCart may handle the basic SEO requirements well but not include ecommerce microdata (schema.org), which can have a high impact on our search listings.
You can do a lot of harm or good with the robots.txt file, like deindexing an entire website (probably not a good thing) or blocking certain directories (your /product issue). I would build some deeper knowledge of what you can do with the robots.txt file and how you need it to perform for your business.
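As a rough illustration of the directory-blocking case, here is a minimal Python sketch that uses the standard library's `urllib.robotparser` to check what a set of rules would allow. The `/checkout/` rule and the example URLs are made up for illustration and are not taken from your site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: block one directory for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://site.com/checkout/cart",
        "https://site.com/widgets/blue-widget",
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Testing rules this way before deploying them is a cheap guard against accidentally blocking more than you intended.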
-
Hey Raymond,
Thanks for the response. I feel like I'm overthinking this a bit, as usually we just leave our OpenCart setups as is, other than a few minor tweaks. Lately I've really been scrutinizing OpenCart's SEO setup and how to improve it, since it seems there are a lot of gaps in the way it handles this.
I thought the robots.txt would have been a good way to block the pages, but the issue is that I would need to block every single product page, as OpenCart automatically creates a page for every product at site.com/product, and since we are adding lots of products there should be a better way to handle this. After I posted, I came across this tidbit from a six-year-old Google Webmaster Central blog post. Basically, it states: 'While we can't guarantee that our algorithms will display that particular URL in search results, it's still helpful for you to indicate your preference by including that URL in your Sitemap.' I think going this route along with the canonical should do the trick.
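For what it's worth, the "only list the preferred URL" approach could be sketched roughly like this in Python. It assumes that the unwanted product URLs are exactly the ones with a single path segment (site.com/product) and the preferred ones have two or more (site.com/category/product). That rule would also drop legitimate root-level pages such as an about page, so treat it as a starting point only. The function names and example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def is_preferred(url: str) -> bool:
    """Assumption: preferred URLs have at least two path segments."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return len(segments) >= 2


def build_sitemap(urls) -> str:
    """Build a sitemap XML string containing only the preferred URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if is_preferred(url):
            loc = ET.SubElement(ET.SubElement(urlset, "url"), "loc")
            loc.text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    candidates = [
        "https://site.com/blue-widget",          # root-level duplicate
        "https://site.com/widgets/blue-widget",  # preferred, matches canonical
    ]
    print(build_sitemap(candidates))
```

The idea is simply that the sitemap and the canonical tags end up pointing at the same URL for each product.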
-
Hi JStrong,
Great question to be asking and an important topic to be doing your due diligence on, especially when dealing with an eCommerce-related website.
Google uses a sitemap as a guideline for crawling your site. So, just because you put a URL in your sitemap doesn't mean that the URL will actually be indexed. You can see those stats in your Google Webmaster Tools account, under the Sitemap area. It will display how many URLs are in the sitemap and how many of those URLs are indexed.
If you do not want certain pages to be indexed by Google, then you would need to adjust your robots.txt file to give Google those instructions.
As long as you have the correct canonical configuration, you should avoid any duplicate content issues from the URLs you've described above.
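If you want to spot-check the canonical setup, a minimal sketch along these lines could help. It uses Python's built-in `html.parser` to pull the `rel="canonical"` link out of a page's HTML. The sample markup and URL are invented for illustration, and fetching the real pages is left out on purpose.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_of(html: str):
    """Return the canonical URL declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for a root-level product page.
SAMPLE = (
    "<html><head>"
    '<link rel="canonical" href="https://site.com/widgets/blue-widget">'
    "</head><body></body></html>"
)

if __name__ == "__main__":
    print(canonical_of(SAMPLE))
```

Running a check like this against both the /product and /category/product versions of a page should return the same /category/product URL if the canonicals are set as described.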
Good luck!