Should a sitemap include all site links or just the ones we want indexed?
-
Got a quick sitemap question. We have a client's site built in OpenCart and are getting ready to submit the sitemap. The default sitemap setting generates URLs right off the root, for example site.com/product. These URLs are also accessible through the site itself. We prefer to give the site some depth and have structured the products so the URLs are site.com/category/product. All of the product pages have canonicals that include the category, so we should not have to worry about duplicate content on the /product page vs. the /category/product page. My question: both types of product URLs are included in the sitemap at the moment. Since we don't want Google to index the /product URLs, should we leave them out of the sitemap even though they are readily accessible from the frontend (though not linked)? Or should we leave them in and let the canonical tag direct Google as to which URLs to index? Thanks in advance.
-
Hi again JS,
I think it's great that you continue to evaluate your platform from all perspectives and weigh its strengths and weaknesses. Many times, a platform can do a lot of the basics well but fall short on the details that differentiate us from our competition. For example, OpenCart may handle the basic SEO requirements well but not include ecommerce microdata (schema.org), which has a high impact on our search listings.
You can do a lot of harm or good with the robots.txt file, like blocking crawlers from your entire website (probably not a good thing) or blocking certain directories (your /product issue). I would gain some deeper knowledge of what you can do with the robots.txt file and how you need it to perform for your business.
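If it helps to see how a robots.txt rule behaves before deploying it, here is a minimal sketch using Python's standard `urllib.robotparser`. The product slugs (`blue-widget`, `red-widget`), the `widgets` category, and the `is_crawlable` helper are made up for illustration and are not from this thread. Also note this parser does plain prefix matching, which may not match every detail of how Google interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: "blue-widget" is a made-up product slug.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blue-widget
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical URLs following the two patterns described in the question.
root_url = "https://site.com/blue-widget"
category_url = "https://site.com/widgets/blue-widget"
other_root_url = "https://site.com/red-widget"

if __name__ == "__main__":
    for url in (root_url, category_url, other_root_url):
        print(url, is_crawlable(ROBOTS_TXT, url))
```

In this sketch the rule only covers the one root-level URL it names. A second root-level product is unaffected, because root-level product URLs share no common path prefix to target with a single rule.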
-
Hey Raymond,
Thanks for the response. I feel like I'm overthinking this a bit, as usually we just leave our OpenCart setups as is, other than a few minor tweaks. Lately I've really been scrutinizing OpenCart's SEO setup and how to improve it, since there seem to be a lot of gaps in the way it handles this.
I thought robots.txt would have been a good way to block the pages, but the issue is that I would need to block every single product page, since OpenCart automatically creates a site.com/product page for every product. Since we are adding lots of products, there should be a better way to handle this. After I posted, I came across this tidbit from a six-year-old Google Webmaster Central blog post. It states: 'While we can't guarantee that our algorithms will display that particular URL in search results, it's still helpful for you to indicate your preference by including that URL in your Sitemap.' I think going this route along with the canonical should do the trick.
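For anyone wanting to try the "only list the preferred URL" approach, here is a rough sketch of building a sitemap that contains just the site.com/category/product form. It is not OpenCart's own sitemap code. The `PRODUCTS` list, the slugs, and the `build_sitemap` helper are assumptions for illustration; in practice the pairs would come from the store's catalogue.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

# Hypothetical catalogue: (category slug, product slug) pairs.
PRODUCTS = [("widgets", "blue-widget"), ("widgets", "red-widget")]


def build_sitemap(base_url: str, products) -> str:
    """Build a sitemap listing only the /category/product form of each URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for category, product in products:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url}/{category}/{product}"
    return ET.tostring(urlset, encoding="unicode")


sitemap_xml = build_sitemap("https://site.com", PRODUCTS)

if __name__ == "__main__":
    print(sitemap_xml)
```

The root-level /product URLs are simply never written out, so the sitemap states the same preference as the canonical tags.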
-
Hi JStrong,
Great question to be asking and an important topic to be doing your due diligence on, especially when dealing with an eCommerce related website.
Google uses a sitemap as a guideline for crawling your site. So just because you put a URL in your sitemap doesn't mean that the URL will actually be indexed. You can see those stats in your Google Webmaster Tools account, under the Sitemap area. It will display how many URLs are in the sitemap and how many of those URLs are indexed.
If you do not want certain pages to be indexed by Google, then you would need to adjust your robots.txt file to give Google those instructions.
As long as you have the correct Canonical configurations, you should avoid any duplicate content issues from the URLs you've described above.
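A quick way to spot-check the canonical configuration is to pull the `rel="canonical"` link out of a page's HTML and compare it with the URL you expect. The sketch below uses only Python's standard `html.parser`. The sample markup, the URL, and the `CanonicalFinder` and `find_canonical` names are made up for illustration; a real check would fetch both the /product and /category/product versions of a page and confirm they report the same canonical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        # rel can hold several space-separated values, so split before testing.
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical page markup; the URL is an example, not a real page.
SAMPLE_PAGE = (
    '<html><head>'
    '<link rel="canonical" href="https://site.com/widgets/blue-widget">'
    '</head><body></body></html>'
)
canonical = find_canonical(SAMPLE_PAGE)

if __name__ == "__main__":
    print(canonical)
```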
Good luck!