Crawl Budget vs Canonical
-
Got a debate raging here and I figured I'd ask for opinions. We have our websites structured as
site/category/product
This is fine for URL keywords, etc. We also use this for breadcrumbs. The problem is that we have multiple categories into which a product fits. So "product" could also live at
site/cat1/product
site/cat2/product
site/cat3/product
Obviously this produces duplicate content. There's no reason why it couldn't live under one URL, but it would take some time and effort to do so (time we don't necessarily have). As such, we're applying the canonical band-aid and calling it good. My problem is that I think this will still kill our crawl budget (this is not an insignificant number of pages we're talking about). In some cases the duplicate pages are bloating a site by 500%.
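For reference, the band-aid being applied here is a single link element in the head of each duplicate URL. The paths below follow the example above, with example.com standing in for the real domain:

```html
<!-- On site/cat1/product, site/cat2/product, and site/cat3/product -->
<link rel="canonical" href="https://example.com/category/product">
```

Every duplicate points at the one URL you'd want indexed; the duplicates themselves still get crawled, which is exactly the crawl budget concern.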
So what say you all? Do we simply apply the canonical and call it good, or do we need to take crawl budget into account and actually remove the duplicate pages? Or am I totally off base and canonical solves the crawl budget issue as well?
-
Agreed! We ran into the same problem with content (articles, etc.). If you think of it in the same way as blog posts, they each have a unique URL, but with tags (i.e. categories) you are able to get them posted to the appropriate category landing pages.
I have a somewhat related issue that I posted here.
-
Another great way to go is to not put the category in the product URL. That was usually the best solution when I worked on e-commerce sites.
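A rough sketch of what that looks like in practice: every product gets one category-free URL, and the old category-scoped URLs 301 to it. This is illustrative Python (the paths and slugs are made up for the example, not from the site in question):

```python
# Sketch: map old category-scoped product URLs to a single
# category-free URL, to be served as 301 redirects.

def canonical_product_url(slug: str) -> str:
    """All products live at /products/<slug>, regardless of category."""
    return f"/products/{slug}"

def redirect_map(categories: list[str], slug: str) -> dict[str, str]:
    """Old URL -> new URL pairs for a 301 redirect table."""
    target = canonical_product_url(slug)
    return {f"/{cat}/{slug}": target for cat in categories}

mapping = redirect_map(["cat1", "cat2", "cat3"], "blue-widget")
for old, new in mapping.items():
    print(old, "->", new)
```

The redirect table can then be loaded into whatever server or CMS layer handles routing; the point is that only one URL per product returns a 200.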
-
Hi Highland,
I would definitely work on making sure that your product only lives in one category. The canonical tag is a nice little band-aid, but it still doesn't fix the root of the problem. I would suggest you can have it listed in many different categories, but it only lives in one category at the product level. So for instance:
It's displayed here
site/cat1
site/cat2
site/cat3
But it only displays product details at a URL like this:
site/category/product
I'm not a huge fan of having Google crawl 4 or 5 extra pages per product just to find a canonical tag when you could just spend the extra programming time to make it work correctly.
Casey
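The crawl overhead Casey describes is easy to put numbers on. A back-of-envelope sketch (the figures below are illustrative, not from the thread):

```python
# Back-of-envelope crawl bloat: every extra category a product appears
# under adds one more URL for Googlebot to fetch just to find the
# canonical tag.

def crawlable_urls(products: int, categories_per_product: int) -> int:
    """Total product URLs when each product is reachable under N categories."""
    return products * categories_per_product

unique = 10_000
bloated = crawlable_urls(unique, 5)  # each product listed in 5 categories
print(f"{bloated:,} URLs crawled for {unique:,} real products "
      f"({bloated / unique:.0%} of the unique count)")
```

Five categories per product is the 500% bloat the original poster mentions: four of every five fetches are spent discovering a canonical tag rather than new content.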
Related Questions
-
I am using <noscript> in all webpages and Google does not crawl my site automatically; any solution? Also, please tell me whether it affects SEO or not.
Web Design | ahtisham2018
-
Regarding rel=canonical on duplicate pages on a shopping site... some direction, please.
Good morning, Moz community: My name is David, and I'm currently doing internet marketing for an online retailer of marine accessories. While many product pages and descriptions are unique, there are some that have the descriptions duplicated across many products. The advice commonly given is to leave one page as is / crawlable (probably best for one that is already ranking/indexed), and use rel=canonical on all duplicates. Any idea for direction on this? Do you think it is necessary? It will be a massive task. (also, one of the products that we rank highest for, we have tons of duplicate descriptions.... so... that is sort of like evidence against the idea?) Thanks!
Web Design | DavidCiti
-
Hi, I have a doubt. If we want to hide unwanted text in a web page, it's possible with the "" tag. My question: does a search engine crawl that text? Help me.
I want to hide a lot of text behind my site page. I know it's possible with that tag. But in what way does a search engine look at that text? Is it hidden, or is it crawled and indexed?
Web Design | FhyzicsBCPL
-
Do you know any tool(s) to check if Google can crawl a URL?
Our site is currently blocking search bots, which is why I can't use Google Webmaster Tools' URL fetch tool. In Screaming Frog, there are dynamic pages that can't be found if I crawl the homepage. Thanks in advance!
Web Design | esiow2013
-
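On the robots side, one lightweight check that needs no Webmaster Tools access is Python's built-in robots.txt parser. Note this only evaluates robots.txt rules; it cannot detect the server-side bot blocking the poster describes:

```python
# Check whether a URL is allowed for a given user agent according to
# robots.txt. The rules below are an inline example; in practice you
# would use rp.set_url("https://example.com/robots.txt") and rp.read().
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("Googlebot", "https://example.com/products/widget"))  # True
print(rp.can_fetch("Googlebot", "https://example.com/private/page"))     # False
```

For the dynamic pages Screaming Frog can't discover, the robots check is only half the story; the other half is whether any crawlable link path reaches them at all.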
Crawl Diagnostics Summary - Duplicate Content
Hello SEO Experts, I am a developer at www.bowanddrape.com and we are working on improving the SEO of the website. The SEOMoz Crawl Diagnostics Summary shows that following 2 URL have duplicate content. http://www.bowanddrape.com/clothing/Tan+Accessories+Calfskin+Belt/50_5142 http://www.bowanddrape.com/clothing/Black+Accessories+Calfskin+Belt/50_5143 Can you please suggest me ways to fix this problem? Is the duplicate content error because of same "The Details", "Size Chart" and "The Silhouette" and "You may also like" ? Thanks, Chirag
Web Design | ChiragNirmal
-
Pagination - Crawl Issue
Hi,
We have a site with a large number of products (6,000+) under each category, so we have made a page under each category to list out all products (a "view all" page), which lists products in a pagination setup built on Ajax. The problem is that only our first page is crawlable; all pages beyond the first remain hidden. We need to make all our pagination URLs crawlable, but our requirement is that the URL never changes as the user goes to the next page; we want to show the user the same URL for all the pagination numbers. Is there a perfect solution?
Web Design | semvibe
-
Question on Breadcrumb and Canonical
Hi SEOmozers, I have another question. =] Thanks in advance. First question: How important is the breadcrumb for SEO? I know that breadcrumbs make for better UX because they show how the visitor landed on this page, and the breadcrumb may show up in the search engine. But other than that, how important is it? Second question: If I have a page that can be found via two locations, how should I handle this in regards to breadcrumbs? For example, I have Page A. You can access Page A via Category A and Category B. What I did was list Page A under Category A, so when someone visits Category B and clicks on Page A, it redirects to the Page A that was found via Category A. The problem is that on Page A, the breadcrumb is Home > Category A > Page A. So if someone visits Category B and clicks on Page A, it redirects and the breadcrumb shows Home > Category A > Page A. What should I do with the breadcrumb for Category B > Page A? Should I create another Page A and just use canonical on it? Should I create another Page A but not index it? Or leave it as is? One Page A can be accessed via two categories. Please advise. Thank you!
Web Design | TommyTan
-
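If you do settle on a single trail per page (always Home > Category A > Page A, as the question describes), one common way to expose that trail to search engines is schema.org BreadcrumbList markup. A minimal sketch, with placeholder names and URLs:

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com/" },
    { "@type": "ListItem", "position": 2, "name": "Category A", "item": "https://example.com/category-a" },
    { "@type": "ListItem", "position": 3, "name": "Page A", "item": "https://example.com/category-a/page-a" }
  ]
}
```

Because there is exactly one markup block per page, this also nudges you toward the one-canonical-trail answer rather than duplicating Page A.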
Infinite Scrolling vs. Pagination on an eCommerce Site
My company is looking at replacing our ecommerce site's paginated browsing with a Javascript infinite scroll function for when customers view internal search results--and possibly when they browse product categories also. Because our internal linking structure isn't very robust, I'm concerned that removing the pagination will make it harder to get the individual product pages to rank in the SERPs. We have over 5,000 products, and most of them are internally linked to from the browsing results pages in the category structure: e.g. Blue Widgets, Widgets Under $250, etc. I'm not too worried about removing pagination from the internal search results pages, but I'm concerned that doing the same for these category pages will result in de-linking the thousands of product pages that show up later in the browsing results and therefore won't be crawlable as internal links by the Googlebot. Does anyone have any ideas on what to do here? I'm already arguing against the infinite scroll, but we're a fairly design-driven company and any ammunition or alternatives would really help. For example, would serving a different page to the Googlebot in this case be a dangerous form of cloaking? (If the only difference is the presence of the pagination links.) Or is there any way to make rel=next and rel=prev tags work with infinite scrolling?
Web Design | DownPour
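On the rel=next/prev idea at the end of the question above: that markup is just link elements in the head of each paginated URL, which presumes each page keeps its own URL, which is exactly what infinite scroll removes unless you also push per-page URLs with the History API. A placeholder sketch:

```html
<!-- In the <head> of page 2 of a paginated series (URLs are placeholders) -->
<link rel="prev" href="https://example.com/widgets?page=1">
<link rel="next" href="https://example.com/widgets?page=3">
```

So the two approaches can coexist only if the infinite scroll updates the address bar to these paginated URLs as the user scrolls; otherwise there is nothing for the link elements to live on.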