Should a sitemap include all site links or just the ones we want indexed?
-
Got a quick sitemap question. We have a client's site built in OpenCart and are getting ready to submit the sitemap. The default sitemap setting generates URLs right off the root, for example site.com/product. These URLs are also accessible through the site itself. We prefer to give the site some depth and have structured the products so the URLs are site.com/category/product. All of the product pages have canonicals that include the category, so we should not have to worry about duplicate content between the /product page and the /category/product page. My question: both types of product URLs are included in the sitemap at the moment. Since we don't want Google to index the /product URLs, should we leave them off the sitemap even though they are readily accessible from the frontend (though not linked)? Or should we just leave them in and let the canonical tag direct Google as to which URLs to index? Thanks in advance.
-
Hi again JS,
I think it's great that you continue to evaluate your platform from all perspectives and weigh its strengths and weaknesses. Many times, a platform can do a lot of the basics well but fall short on the details that differentiate us from our competition. For example, OpenCart may do the basic SEO requirements well, but not include ecommerce microdata (schema.org), which has a high impact on our search listings.
You can do a lot of harm or good with the robots.txt file, such as blocking an entire website (probably not a good thing) or blocking certain directories (your /product issue). I would gain some deeper knowledge about what you can do with the robots.txt file and how you need it to perform for your business.
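For anyone who wants to test robots.txt rules before publishing them, here is a minimal sketch using Python's standard `urllib.robotparser`. The `/checkout/` rule and the `is_crawlable` helper are made up for illustration and are not part of any OpenCart default. It shows that a directory rule only catches URLs under that directory, so product URLs sitting at the root are not covered by it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration: block one directory, allow the rest.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
"""


def is_crawlable(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# A URL inside the blocked directory.
checkout_ok = is_crawlable(ROBOTS_TXT, "https://site.com/checkout/cart")
# The preferred, category-based product URL.
category_ok = is_crawlable(ROBOTS_TXT, "https://site.com/category/product")
# The root-level product URL, which no directory rule covers.
root_ok = is_crawlable(ROBOTS_TXT, "https://site.com/product")
```

Running the same check against your real robots.txt text and a handful of real URLs is a cheap way to catch a rule that blocks more (or less) than intended.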
-
Hey Raymond,
Thanks for the response. I feel like I'm overthinking this a bit, as usually we just leave our OpenCart setups as is, other than a few minor tweaks. Lately I've really been scrutinizing OpenCart's SEO setup and how to improve it, since it seems there are a lot of gaps in the way it handles this.
I thought the robots.txt would have been a good way to block the pages, but the issue is that I would need to block every single product page, because OpenCart automatically creates a page at site.com/product for every product. Since we are adding lots of products, there should be a better way to handle this. After I posted, I came across this tidbit from a six-year-old Google Webmaster Central blog post. Basically it states: "While we can't guarantee that our algorithms will display that particular URL in search results, it's still helpful for you to indicate your preference by including that URL in your Sitemap." I think going this route along with the canonical should do the trick.
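As a rough sketch of the "preferred URLs only" approach, the snippet below builds a sitemap that lists a URL only when it is its own canonical. The `build_sitemap` helper and the sample URLs are hypothetical; how you obtain the URL-to-canonical mapping from OpenCart is left out and would depend on your setup.

```python
import xml.etree.ElementTree as ET
from typing import Dict

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_by_url: Dict[str, str]) -> str:
    """Return sitemap XML listing only URLs that are their own canonical."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, canonical in sorted(canonical_by_url.items()):
        if url == canonical:
            # Keep the preferred URL; skip duplicates that point elsewhere.
            ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical data: both URL forms of one product, same canonical target.
pages = {
    "https://site.com/widget": "https://site.com/gadgets/widget",
    "https://site.com/gadgets/widget": "https://site.com/gadgets/widget",
}
sitemap_xml = build_sitemap(pages)
```

With this sample data only the /gadgets/widget URL ends up in the sitemap, which matches the preference already stated by the canonical tag.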
-
Hi JStrong,
Great question to be asking and an important topic to be doing your due diligence on, especially when dealing with an eCommerce related website.
Google uses a sitemap as a guideline for crawling your site. So, just because you put a URL in your sitemap doesn't mean that the URL will actually be indexed. You can see those stats in your Google Webmaster Tools account, under the Sitemap area. It will display how many URLs are in the sitemap and how many of those URLs are indexed.
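If you do not want Google to crawl certain pages, then you would need to adjust your robots.txt file to give Google those instructions. Keep in mind that robots.txt controls crawling rather than indexing, so a blocked URL can still appear in the index if other pages link to it.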
As long as you have the correct Canonical configurations, you should avoid any duplicate content issues from the URLs you've described above.
Good luck!
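To spot-check the canonical configuration mentioned above, a small script can pull the canonical URL out of a page's HTML. This is only a sketch using Python's standard `html.parser`; the `CanonicalFinder` class, the `find_canonical` helper, and the sample markup are all made up for illustration, and fetching the live pages is left out.

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in `html`, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Hypothetical markup standing in for a root-level product page.
SAMPLE = (
    '<html><head>'
    '<link rel="canonical" href="https://site.com/gadgets/widget">'
    '</head><body></body></html>'
)
canonical = find_canonical(SAMPLE)
```

If both the /product page and the /category/product page return the same category-based URL here, the canonicals are pointing where you want them.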