Crawl Budget vs Canonical
-
Got a debate raging here and I figured I'd ask for opinions. We have our websites structured as
site/category/product
This is fine for URL keywords, etc. We also use this for breadcrumbs. The problem is that a product can fit into multiple categories. So "product" could also be at
site/cat1/product
site/cat2/product
site/cat3/product
Obviously this produces duplicate content. There's no reason why it couldn't live under 1 URL, but it would take some time and effort to do so (time we don't necessarily have). As such, we're applying the canonical band-aid and calling it good. My problem is that I think this will still kill our crawl budget (this is not an insignificant number of pages we're talking about). In some cases the duplicate pages are bloating a site by 500%.
So what say you all? Do we simply use canonical and call it good, or do we need to take the crawl budget into account and actually remove the duplicate pages? Or am I totally off base and canonical solves the crawl budget issue as well?
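To put a number on the bloat described above, here is a minimal sketch that groups product URLs by their final path segment and reports how many extra URLs exist per unique product. It assumes the input contains only product URLs of the form site/category/product and that the last segment identifies the product; the function names and the example.com URLs are hypothetical, not anything from the actual sites.

```python
from collections import defaultdict
from urllib.parse import urlparse


def group_by_product(urls):
    """Group product URLs by their last path segment (assumed to be the product slug)."""
    groups = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        if not path:
            continue  # skip the bare site root
        slug = path.split("/")[-1]
        groups[slug].append(url)
    return dict(groups)


def bloat_percent(urls):
    """Return extra (duplicate) URLs as a percentage of the unique product count."""
    groups = group_by_product(urls)
    unique = len(groups)
    if unique == 0:
        return 0.0
    total = sum(len(found) for found in groups.values())
    return 100.0 * (total - unique) / unique


if __name__ == "__main__":
    # Hypothetical crawl export: one product reachable under three categories.
    crawl = [
        "https://example.com/cat1/widget",
        "https://example.com/cat2/widget",
        "https://example.com/cat3/widget",
        "https://example.com/cat1/gadget",
    ]
    print(group_by_product(crawl))
    print(bloat_percent(crawl))
```

Running something like this over a crawl export would show which products account for most of the duplicate URLs, which may help decide where consolidation is worth the effort first.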
-
Agreed! We ran into the same problem with content (articles, etc.). If you think of it in the same way as blog posts, they each have a unique URL, but with tags (i.e. categories) you are able to get them posted to the appropriate category landing pages.
I have a somewhat related issue that I posted here
-
Another great way to go is to not put the category in the product URL. That was usually the best solution when I worked on e-commerce sites.
-
Hi Highland,
I would definitely work on making sure that your product only lives in one category. The canonical tag is a nice little band-aid, but it still doesn't fix the root of the problem. I would suggest that you can have it listed in many different categories, but it only lives in one category at the product level. So for instance:
It's displayed here
site/cat1
site/cat2
site/cat3
But it only displays product details at a URL like this
site/category/product
I'm not a huge fan of having Google crawl 4 or 5 extra pages per product just to find a canonical tag when you could just spend the extra programming time to make it work correctly.
Casey
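The "listed in many categories, lives at one URL" approach can be sketched as a lookup from each product to a single primary category, plus a permanent (301) redirect from any other category path. This is only an illustration under assumptions: the `primary_category` mapping, the function names, and the two-segment /category/product path shape are hypothetical, and the actual redirect would be issued by whatever framework or server the site uses.

```python
def canonical_path(product_slug, primary_category):
    """Build the single URL path where the product detail page lives."""
    return f"/{primary_category[product_slug]}/{product_slug}"


def redirect_target(request_path, primary_category):
    """Return the path to 301-redirect to, or None if no redirect is needed.

    Only two-segment paths (/category/product) for known products are handled;
    category landing pages and unknown products are left alone.
    """
    parts = request_path.strip("/").split("/")
    if len(parts) != 2:
        return None
    category, slug = parts
    if slug not in primary_category:
        return None
    target = canonical_path(slug, primary_category)
    if target == f"/{category}/{slug}":
        return None  # already on the one true URL
    return target


if __name__ == "__main__":
    # Hypothetical mapping: each product is assigned exactly one home category.
    primary = {"widget": "cat1"}
    print(redirect_target("/cat2/widget", primary))  # duplicate path
    print(redirect_target("/cat1/widget", primary))  # canonical path
    print(redirect_target("/cat2", primary))         # category landing page
```

Category pages at site/cat1, site/cat2 and site/cat3 can all link straight to the one product URL, so the redirect mainly catches old links and bookmarks to the duplicate paths.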