Should a sitemap include all site links or just the ones we want indexed?
-
Got a quick sitemap question. We have a client's site built in OpenCart and are getting ready to submit the sitemap. The default sitemap setting generates URLs right off of the root, for example site.com/product. These URLs are also accessible through the site itself. We prefer to give the site some depth and have structured the products so the URLs are site.com/category/product. All of the product pages have canonicals that include the category, so we should not have to worry about duplicate content on the /product page vs. the /category/product page. My question: both types of product pages are included in the sitemap at the moment. Since we don't want Google to index the /product URLs, should we leave them off the sitemap even though they are readily accessible from the frontend (though not linked)? Or should we just leave them in and let the canonical tag direct Google as to which URLs to index? Thanks in advance.
-
Hi again JS,
I think it's great that you continue to evaluate your platform from all perspectives and weigh its strengths and weaknesses. Many times, a platform can do a lot of the basics well but fall short on the details that differentiate us from our competition. For example, OpenCart may do the basic SEO requirements well but not include ecommerce microdata (schema.org), which has a high impact on our search listings.
You can do a lot of harm/good with the robots.txt file - like deindexing an entire website (probably not a good thing) or blocking certain directories (your /product issue). I would gain some deeper knowledge about what you can do with the robots.txt file and how you need it to perform for your business.
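To get a feel for what a robots.txt rule actually blocks, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt contents, the `is_crawlable` helper, and the `blue-widget` / `widgets` names are made up for illustration; they are not taken from the site in question.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one directory rule and one rule for a
# single root-level product URL (names are illustrative only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /blue-widget
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://site.com/blue-widget",          # root-level product URL
        "https://site.com/widgets/blue-widget",  # category/product URL
        "https://site.com/red-widget",           # another root-level product
    ):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

Rules match by path prefix, so a root-level product URL with no shared directory needs its own rule. Keep in mind that a Disallow rule stops crawling, so Google would also not see the canonical tag on a blocked page.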
-
Hey Raymond,
Thanks for the response. I feel like I'm overthinking this a bit, as usually we just leave our OpenCart setups as is, other than a few minor tweaks. Lately I've really been scrutinizing OpenCart's SEO setup and how to improve it, since it seems there are a lot of gaps in the way it handles this.
I thought the robots.txt would have been a good way to block the pages, but the issue is I would need to block every single product page, as OpenCart automatically creates a site.com/product page for every product. Since we are adding lots of products, there should be a better way to handle this. After I posted, I came across this tidbit from a 6-year-old Google Webmaster Central blog post. Basically it states: 'While we can't guarantee that our algorithms will display that particular URL in search results, it's still helpful for you to indicate your preference by including that URL in your Sitemap.' I think going this route, listing only the preferred URLs in the sitemap, along with the canonical should do the trick.
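As a rough sketch of that route, the snippet below builds a sitemap that lists only URLs that are their own canonical. It assumes you already have a mapping from each URL to its canonical URL; the `canonical_of` dictionary, the `build_sitemap` helper, and the example URLs are hypothetical, and this is not how OpenCart generates its sitemap.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_of: dict[str, str]) -> str:
    """Build sitemap XML containing only self-canonical URLs.

    `canonical_of` maps every known URL to the canonical URL declared
    on that page (a hypothetical input for this sketch).
    """
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    preferred = sorted(u for u, c in canonical_of.items() if u == c)
    for url in preferred:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Illustrative data: the root-level URL points at the nested one.
EXAMPLE = {
    "https://site.com/blue-widget": "https://site.com/widgets/blue-widget",
    "https://site.com/widgets/blue-widget": "https://site.com/widgets/blue-widget",
    "https://site.com/widgets": "https://site.com/widgets",
}

if __name__ == "__main__":
    print(build_sitemap(EXAMPLE))
```

The root-level /product URL is left out because its canonical points elsewhere, while the category page and the /category/product page stay in.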
-
Hi JStrong,
Great question to be asking and an important topic to be doing your due diligence on, especially when dealing with an eCommerce related website.
Google uses a sitemap as a guideline for crawling your site. So, just because you put a URL in your sitemap doesn't mean that the URL will actually be indexed. You can see those stats in your Google Webmaster Tools account, under the Sitemap area. It will display how many URLs are in the sitemap and how many of those URLs are indexed.
If you do not want certain pages to be indexed by Google, then you would need to adjust your robots.txt file to give Google those instructions.
As long as you have the correct Canonical configurations, you should avoid any duplicate content issues from the URLs you've described above.
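If you want to spot-check the canonical configuration, a small sketch like the following pulls the canonical URL out of a page's HTML using Python's standard `html.parser`. The `CanonicalFinder` class, the `find_canonical` helper, and the sample markup are illustrative assumptions, not part of any OpenCart or Google tooling.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag seen."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup for the root-level product page.
SAMPLE_PAGE = (
    "<html><head>"
    '<link rel="canonical" href="https://site.com/widgets/blue-widget" />'
    "</head><body>Blue widget</body></html>"
)

if __name__ == "__main__":
    print(find_canonical(SAMPLE_PAGE))
```

Running this against both the /product and the /category/product version of a page should return the same /category/product URL if the canonicals are set up as described.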
Good luck!