Index Bloat: Canonicalize, Redirect or Delete URLs?
-
I was doing some simple on-page recommendations for a client and realized that they have a bit of a website bloat problem. They are an ecommerce shoe store and for one product, there could be 10+ URLs. For example, this is what ONE product looks like:
example.com/products/shoename-color1
example.com/products/shoename-color2
example.com/collections/style/products/shoename-color1
example.com/collections/style/products/shoename-color2
example.com/collections/adifferentstyle/products/shoename-color1
example.com/collections/adifferentstyle/products/shoename-color2
example.com/collections/shop-latest-styles/products/shoename-color1
example.com/collections/shop-latest-styles/products/shoename-color2
example.com/collections/all/products/shoename-color1
example.com/collections/all/products/shoename-color2
...and so on... all for the same shoe. They have about 20-30 shoes altogether, and some come in 4-5 colors. This has caused some major bloat on their site and I assume some confusion for the search engine. That said, I'm trying to figure out what the best way to tackle this is from an SEO perspective.
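One way to size the problem before choosing a fix is to group a crawl export by the product handle at the end of each URL. The sketch below is only an illustration: it assumes every product URL ends in `/products/<handle>`, as in the examples above, and `group_by_product` is a hypothetical helper name, not part of any tool.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_product(urls):
    """Group URLs by the handle that follows the last /products/ segment."""
    groups = defaultdict(list)
    for url in urls:
        # urlsplit only separates host from path when a scheme is present,
        # so add one for bare "example.com/..." entries (assumed https).
        path = urlsplit(url if "://" in url else "https://" + url).path
        _, sep, handle = path.rpartition("/products/")
        handle = handle.strip("/")
        if sep and handle:
            groups[handle].append(url)
    return dict(groups)


urls = [
    "example.com/products/shoename-color1",
    "example.com/collections/style/products/shoename-color1",
    "example.com/collections/all/products/shoename-color1",
    "example.com/collections/all/products/shoename-color2",
]
for handle, members in group_by_product(urls).items():
    print(handle, len(members))
```

Each color stays its own group here; whether colors should also be consolidated into one URL is a separate decision.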
Here's where I've gotten to so far:
Is it better to canonicalize all URLs, referencing back to one "main" one; delete all bloat pages and re-link everything to the main one(s); or 301 redirect the bloat URLs back to the "main" one(s)?
Or is there another option that I haven't considered?
Thanks!
-
Hi there,
This is exactly the case where Google recommends using a canonical tag, as described on this resource page: Consolidate duplicate URLs - Google Search Console Help.
Keep in mind that canonicals are efficient when different URLs have the same content.
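As a rough sketch of what that looks like for the URLs in the question, the function below builds the canonical link element that every duplicate of a product page could carry in its head. It assumes the short `/products/<handle>` URL is the one chosen as "main" and that the site is served from `https://example.com`; both are assumptions, and `canonical_link_tag` is a hypothetical name.

```python
from html import escape
from urllib.parse import urlsplit


def canonical_link_tag(url, origin="https://example.com"):
    """Return a <link rel="canonical"> element pointing at /products/<handle>.

    Returns None when the URL has no /products/<handle> part.
    """
    path = urlsplit(url if "://" in url else "https://" + url).path
    _, sep, handle = path.rpartition("/products/")
    handle = handle.strip("/")
    if not sep or not handle:
        return None
    # Assumption: the short /products/ URL is the preferred version.
    href = f"{origin}/products/{handle}"
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'


print(canonical_link_tag("example.com/collections/all/products/shoename-color1"))
```

The main URL would carry the same tag pointing at itself, so all versions of the page agree on one address.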
I'd avoid redirects because they would hurt the user experience when navigating the website, and we know that hurting UX upsets Google.
Hope it helps.
Best of luck.
Gaston
Related Questions
-
Recovering from an open redirect
From a previous company we have inherited a domain which once contained an open redirect: redirect.magnet.me. Even though this domain has been returning a 410 for every single request directed at it for the last few months, we continue to see new links popping up in Google which refer to this domain: https://www.google.com/search?safe=off&q=%22redirect.magnet.me%22 Currently we just keep disavowing these links, but they also keep appearing in our Moz.com tooling, where we are unable to disavow them. Does anybody have any successful tips for dealing with this scenario, other than disavowing new links after the fact?
Intermediate & Advanced SEO | | rogier_slag0 -
URL indexed but not submitted in sitemap, however the URL is in the sitemap
Dear Community, I have the following problem and it would be super helpful if you could help. Cheers
Symptoms:
In Search Console, Google says that some of our old URLs are indexed but not submitted in the sitemap.
However, those URLs are in the sitemap.
The sitemap has also been submitted successfully, with no error message.
Potential explanations:
We have an automatic cache-clearing process that runs once a day, and in the sitemap we use it as the last modification date. Let's imagine the URL www.example.com/hello was last modified in 2017. Because the cache is cleared daily, the sitemap will say last modified: yesterday, even if the content of the page has not changed since 2017.
We have a Z after the sitemap time. Can it be that the bot does not understand the time format?
We have only HTTP URLs in the sitemap. Our HTTPS URLs are not in the sitemap.
What do you think?
Intermediate & Advanced SEO | | ZozoMe0 -
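On the time-format point: the sitemaps protocol asks for lastmod in W3C Datetime format, where a trailing Z denotes UTC, so the Z by itself is a valid designator. A minimal sketch of producing lastmod from the content's real modification time rather than the cache-clear time is below; `sitemap_lastmod` is a hypothetical helper, and where the modification time comes from is left to the site.

```python
from datetime import datetime, timezone


def sitemap_lastmod(modified):
    """Format a timezone-aware datetime as a W3C Datetime string in UTC."""
    return modified.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Assumed example: the page content last changed in 2017.
content_changed = datetime(2017, 3, 4, 9, 30, 0, tzinfo=timezone.utc)
print(f"<lastmod>{sitemap_lastmod(content_changed)}</lastmod>")
```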
Mobile indexing and tabs
Hello, With the new mobile-first indexing, do search engines (Google) give as much value to content that sits in tabs and is not visible at first as to content that is visible on the page? Thank you,
Intermediate & Advanced SEO | | seoanalytics0 -
Redirect Issue in .htaccess
Hi, I'm stumped on this, so I'm hoping someone can help. I have a Wordpress site that I migrated to https about a year ago. Shortly after I added some code to my .htaccess file. My intention was to force https and www to all pages. I did see a moderate decline in rankings around the same time, so I feel the code may be wrong. Also, when I run the domain through Open Site Explorer all of the internal links are showing 301 redirects. The code I'm using is below. Thank you in advance for your help!
# Redirect HTTP to HTTPS
RewriteEngine On
# ensure www.
RewriteCond %{HTTP_HOST} !^www. [NC]
RewriteRule ^ https://www.%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
# ensure https
RewriteCond %{HTTP:X-Forwarded-Proto} !https
RewriteCond %{HTTPS} off
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
# USER IP BANNING
<Limit GET POST>
order allow,deny
deny from 213.238.175.29
deny from 66.249.69.54
allow from all
</Limit>
#Enable gzip compression
AddOutputFilterByType DEFLATE text/plain
AddOutputFilterByType DEFLATE text/html
AddOutputFilterByType DEFLATE text/xml
AddOutputFilterByType DEFLATE text/css
AddOutputFilterByType DEFLATE application/xml
AddOutputFilterByType DEFLATE application/xhtml+xml
AddOutputFilterByType DEFLATE application/rss+xml
AddOutputFilterByType DEFLATE application/javascript
AddOutputFilterByType DEFLATE application/x-javascript
#Setting heading expires
<IfModule mod_expires.c>
ExpiresActive on
ExpiresDefault "access plus 1 month"
ExpiresByType application/javascript "access plus 1 year"
ExpiresByType image/x-ico "access plus 1 year"
ExpiresByType image/jpg "access plus 14 days"
ExpiresByType image/jpeg "access plus 14 days"
ExpiresByType image/gif "access plus 14 days"
ExpiresByType image/png "access plus 14 days"
ExpiresByType text/css "access plus 14 days"
</IfModule>
Intermediate & Advanced SEO | | JohnWeb12
0 -
301 Redirection
Hi there guys, I have a question about redirection. My boss has just bought a new domain name and he wants it to redirect to our current site when looking for specific products. www.example.com is our current website www.productname.com is the new domain So the new domain would be redirected to example.com. Would that be considered against Google Policies? Thanks
Intermediate & Advanced SEO | | PremioOscar0 -
Redirecting index.html to the root
Hi, I was wondering if there is a safe way to consolidate link juice on a single version of a home page. I find incoming links to my site that link to both mysite.com/ and mysite.com/index.html. I've decided to go with mysite.com/ as my main and only URL for the site and now I'd like to transfer all link juice from mysite.com/index.html to mysite.com/. When I tried a 301 redirect from index.html to the root it created an infinite loop, of course. I know I can use a RewriteRule, but will it transfer the juice? Please help!
Intermediate & Advanced SEO | | romanbond
5 -
Lots of incorrect urls indexed - Googlebot found an extremely high number of URLs on your site
Hi, Any assistance would be greatly appreciated. Basically, our rankings and traffic have been dropping massively recently, and Google sent us a message stating "Googlebot found an extremely high number of URLs on your site". This first alerted us to the problem: for some reason our eCommerce site has recently generated loads (potentially thousands) of rubbish URLs, giving us duplication everywhere, which Google is obviously penalizing us for in terms of dropping rankings. Our developer is trying to find the root cause of this, but my concern is: how do we get rid of all these bogus URLs? If we use GWT to remove URLs it's going to take years. We have just amended our robots.txt file to exclude them going forward, but they have already been indexed, so I need to know: do we put a 301 redirect on them and also an HTTP 404 code to tell Google they don't exist? Do we also put a noindex on the pages, or what? What is the best solution? A couple of examples of our problems: In Google, type site:bestathire.co.uk inurl:"br" and you will see 107 results. This is one of many lots we need to get rid of. Also, site:bestathire.co.uk intitle:"All items from this hire company" shows 25,300 indexed pages we need to get rid of. Another thing to help tidy this mess up going forward is to improve our pagination. Our site uses rel=next and rel=prev but no canonical. As a belt-and-braces approach, should we also put canonical tags on our category pages where there is more than one page? I was thinking of doing it on page 1 of our most important pages, or on the view-all page, or both. What's the general consensus? Any advice on both points greatly appreciated. Thanks, Sarah.
Intermediate & Advanced SEO | | SarahCollins0 -
Can a XML sitemap index point to other sitemaps indexes?
We have a massive site that is having some issues being fully crawled due to some of our site architecture and linking. Is it possible to have an XML sitemap index point to other sitemap indexes rather than standalone XML sitemaps? Has anyone done this successfully? Based upon the description here: http://sitemaps.org/protocol.php#index it seems like it should be possible. Thanks in advance for your help!
Intermediate & Advanced SEO | | CareerBliss0