Should a sitemap include all site links or just the ones we want indexed?
-
Got a quick sitemap question. We have a client's site built in OpenCart and are getting ready to submit the sitemap. The default sitemap setting generates URLs right off the root, for example site.com/product. These URLs are also accessible through the site itself. We prefer to give the site some depth and have structured the products so the URLs are site.com/category/product. All of the product pages have canonicals that include the category, so we should not have to worry about duplicate content on the /product page vs. the /category/product page. My question: both types of product pages are included in the sitemap at the moment. Since we don't want Google to index the /product URLs, should we leave them off the sitemap even though they are readily accessible from the frontend (though not linked)? Or just leave them in and let the canonical tag direct Google as to which URLs to index? Thanks in advance.
-
Hi again JS,
I think it's great that you continue to evaluate your platform from all perspectives and weigh its strengths and weaknesses. Many times, a platform can do a lot of the basics well but fall short on the details that differentiate us from our competition. For example, OpenCart may do the basic SEO requirements well but not include ecommerce microdata (schema.org), which has a high impact on our search listings.
You can do a lot of harm or good with the robots.txt file, like deindexing an entire website (probably not a good thing) or blocking certain directories (your /product issue). I would gain some deeper knowledge about what you can do with the robots.txt file and how you need it to perform for your business.
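To get a feel for how directory rules behave, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules and the `/checkout/` directory are made up for illustration and are not OpenCart defaults; `is_crawlable` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block one directory, leave everything else crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A URL inside the blocked directory, then the two product URL styles
# described in the question.
for url in (
    "https://site.com/checkout/cart",
    "https://site.com/category/product",
    "https://site.com/product",
):
    print(url, is_crawlable(ROBOTS_TXT, url))
```

A prefix rule like this only catches URLs that share a directory, so it would not single out product pages that sit directly off the root. Swapping the rule for `Disallow: /` blocks crawling of the whole site, which is the "probably not a good thing" case.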
-
Hey Raymond,
Thanks for the response. I feel like I'm overthinking this a bit, as usually we just leave our OpenCart setups as is, other than a few minor tweaks. Lately I've really been scrutinizing OpenCart's SEO setup and how to improve it, since it seems there are a lot of gaps in the way it handles this.
I thought the robots.txt would have been a good way to block the pages, but the issue is I would need to block every single product page, as OpenCart automatically creates a page for every product at site.com/product, and since we are adding lots of products there should be a better way to handle this. After I posted I came across this tidbit from a 6-year-old Google Webmaster Central blog post. Basically it states that 'While we can't guarantee that our algorithms will display that particular URL in search results, it's still helpful for you to indicate your preference by including that URL in your Sitemap.' I think going this route along with the canonical should do the trick.
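For what it's worth, a sitemap that lists only the preferred URLs could be sketched roughly like this. This is not how OpenCart's built-in sitemap feed works; the category and product slugs, the `build_sitemap` helper, and the `https://site.com` base URL are all placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(products, base_url="https://site.com"):
    """Build sitemap XML listing only the /category/product form of each URL.

    `products` is an iterable of (category_slug, product_slug) pairs.
    The root-level /product variants are deliberately left out.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for category, product in products:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}/{category}/{product}"
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical catalogue entries.
print(build_sitemap([("widgets", "blue-widget"), ("gadgets", "red-gadget")]))
```

The root-level pages stay reachable on the site; the sitemap just stops advertising them, and the canonical tag on each page states the same preference.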
-
Hi JStrong,
Great question to be asking and an important topic to be doing your due diligence on, especially when dealing with an eCommerce-related website.
Google uses a sitemap as a guideline for crawling your site. So just because you put a URL in your sitemap doesn't mean that the URL will actually be indexed. You can see those stats in your Google Webmaster Tools account, under the Sitemap area. It will display how many URLs are in the sitemap and how many of those URLs are indexed.
If you do not want certain pages to be indexed by Google, then you would need to adjust your robots.txt file to give Google those instructions.
As long as you have the correct Canonical configurations, you should avoid any duplicate content issues from the URLs you've described above.
Good luck!
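One way to double-check the canonical setup is to pull the canonical URL out of each page's HTML and confirm that both URL variants report the same one. Here is a minimal sketch with Python's standard `html.parser`; the sample markup and the `CanonicalFinder` / `find_canonical` names are invented for illustration, and fetching the live pages is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup, as it might be served at both /product and /category/product.
PAGE = (
    '<html><head>'
    '<link rel="canonical" href="https://site.com/category/product">'
    '</head><body></body></html>'
)

print(find_canonical(PAGE))
```

If the HTML served at /product and at /category/product both yield the /category/product URL, the canonical configuration matches what the question describes.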