What is best practice for "Sorting" URLs to prevent indexing and for best link juice ?
-
We are now introducing 5 links in all our category pages for different sorting options of category listings.
The site has about 100.000 pages and with this change the number of URLs may go up to over 350.000 pages.
Until now google is indexing well our site but I would like to prevent the "sorting URLS" leading to less complete crawling of our core pages, especially since we are planning further huge expansion of pages soon.Apart from blocking the paramter in the search console (which did not really work well for me in the past to prevent indexing) what do you suggest to minimize indexing of these URLs also taking into consideration link juice optimization?
On a technical level the sorting is implemented in a way that the whole page is reloaded, for which may be better options as well.
-
With canonicals, I would not worry about the incoming pages. If the new content is useful and relevant, plus linked to internally, they should do fine in terms of indexation. Use the canonical for now, and once you launch the new pages, well a month after launch, if there are key pages not getting indexed, then you can reassess. The canonical is the right thing to do in this case.
As for link equity, you are right, that is a simplistic view of it. It is actually much more intricate than that, but that's a good basic understanding. However, the canonical is not going to hurt your internal link equity. Those links to the different sorting are navigational in nature and the structure will be repeated throughout the site. Google's algo is good at determining internal, editorial links versus those that are navigational in nature. The navigational links don't impact the strength nearly as much as an editorial link.
My personal belief is that you are worrying about something that isn't going to make an impact on your organic traffic. Ensure the correct canonicals are in place and launch the new content. If that new content has the same issue with sorting, use canonicals there as well and let Google figure it out. "They" have gotten pretty good at identifying what to keep and what not.
If you don't want the sorting pages in there at all, you'll need to do one of the following:
- Noindex, disallow in robots.txt - Rhea Drysdale showed me a few years back that you can do a disallow and noindex in robots. If you do both, Google gets the command to not only noindex the URLs, but also cannot crawl the content.
- Noindex, nofollow using meta robots - This would stop all link equity flow from these pages. If you want to attempt to stop flow to these pages, you'll need to nofollow any links to them. The pages can still be crawled however.
- Noindex, follow - Same as above but internal link equity would still flow. Again, if you want to attempt to cut off link equity to these sorting pages, any links to them would need to be nofollowed.
- Disallow in robots - This would stop them from crawling the content, but the URLs could technically still be indexed.
Personally, I believe trying to manage link equity using nofollow is a waste of time. You more than likely have other things that could be making larger impacts. The choice is yours however and I always recommend testing anything to see if it makes an impact.
-
Kate. The domain has 100.000 pages and will scale to over 1 million unique pages during the next couple of months. I do not want the Sorting URLs have any negative effect on the new indexing of the new 900.000 unique pages in the next months.
Regarding link equity. My simplified understanding of link equity is that if a page has 10 links then each link carries 10% of the total link juice of the page. If now 5 of the 10 links do link to a canonical version of the same page (=sorting URLs), I may be losing out on 50% of the potential link juice the page carries. This is my concern. Therefore my doubt is if I should rather try to hide these sorting URLs from google (same as was also recommended by Rand for facetted navigation pages that one does not consider important for being indexed).
-
Is your issue with crawling or indexing? Those are two separate issues. Why don't you want Google having the canonicals in the index? If you can give me some more insight, I can try to recommend the best option.
And I'm not following your last question. Can you try to ask it another way?
-
Hi Kate, thanks lot. Yes canonical is something we should definetly do and we have implemented.
Still I had the experience in the past that google also indexed lots of canonicalized URLs with near identical content. Any additional step I could do to minimize indexing of these URLs further?
Wouldn't then the basically "self referencing" URLS of sorting links (going to canonicalized versions of the same page) be lost for link equity?
-
This one would need a canonical. For one category page with 5 different sort options, you'd need one canonical URL (one without any sorting or the default sorting) and point all others to that URL using a canonical tag.
https://support.google.com/webmasters/answer/139066?hl=en
Would that work for your setup? If I understand your situation correctly, this should work. It consolidates link equity and allows Google to choose what needs to be indexed and served.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does redirecting a duplicate page NOT in Google‘s index pass link juice? (External links not showing in search console)
Hello! We have a powerful page that has been selected by Google as a duplicate page of another page on the site. The duplicate is not indexed by Google, and the referring domains pointing towards that page aren’t recognized by Google in the search console (when looking at the links report). My question is - if we 301 redirect the duplicate page towards the one that Google has selected as canonical, will the link juice be passed to the new page? Thanks!
Intermediate & Advanced SEO | | Lewald10 -
Best SEO practice for multiple languages in website
HI, We would like to include multiple languages for our global website. What's the best practice to gain from UI and SEO too. Can we have auto language choosing website as per browsing location? Or dedicated pages for important languages like www.website.com/de for German. If we go for latter, how about when users browsing beside language page as they will be usually in English
Intermediate & Advanced SEO | | vtmoz0 -
Best Way To Go About Fixing "HTML Improvements"
So I have a site and I was creating dynamic pages for a while, what happened was some of them accidentally had lots of similar meta tags and titles. I then changed up my site but left those duplicate tags for a while, not knowing what had happened. Recently I began my SEO campaign once again and noticed that these errors were there. So i did the following. Removed the pages. Removed directories that had these dynamic pages with the remove tool in google webmasters. Blocked google from scanning those pages with the robots.txt. I have verified that the robots.txt works, the pages are longer in google search...however it still shows up in in the html improvements section after a week. (It has updated a few times). So I decided to remove the robots.txt file and now add 301 redirects. Does anyone have any experience with this and am I going about this the right away? Any additional info is greatly appreciated thanks.
Intermediate & Advanced SEO | | tarafaraz0 -
Is this the "Google Dance"?
We just did a site redesign, and removed the noindex, etc. about 10 days ago. Over the last 24 hours, I've gotten some of my top keywords on the first page, but now they are gone, a few hours later. I assume this is typical?
Intermediate & Advanced SEO | | CsmBill0 -
Best way to permanently remove URLs from the Google index?
We have several subdomains we use for testing applications. Even if we block with robots.txt, these subdomains still appear to get indexed (though they show as blocked by robots.txt. I've claimed these subdomains and requested permanent removal, but it appears that after a certain time period (6 months)? Google will re-index (and mark them as blocked by robots.txt). What is the best way to permanently remove these from the index? We can't use login to block because our clients want to be able to view these applications without needing to login. What is the next best solution?
Intermediate & Advanced SEO | | nicole.healthline0 -
"Category" word in URLs of blog is it SEO Friendly URL ??
Hello respected community members, I saw many times that "Category" word comes in URL of blog. So my que is that is this negative for SEO or Positive. & if we don't wanna to come CATEGORY in URL how can we remove while URL Optimization ?
Intermediate & Advanced SEO | | sourabhrana390 -
I try to apply best duplicate content practices, but my rankings drop!
Hey, An audit of a client's site revealed that due to their shopping cart, all their product pages were being duplicated. http://www.domain.com.au/digital-inverter-generator-3300w/ and http://www.domain.com.au/shop/digital-inverter-generator-3300w/ The easiest solution was to just block all /shop/ pages in Google Webmaster Tools (redirects were not an easy option). This was about 3 months ago, and in months 1 and 2 we undertook some great marketing (soft social book marking, updating the page content, flickr profiles with product images, product manuals onto slideshare etc). Rankings went up and so did traffic. In month 3, the changes in robots.txt finally hit and rankings decreased quite steadily over the last 3 weeks. Im so tempted to take off the robots restriction on the duplicate content.... I know I shouldnt but, it was working so well without it? Ideas, suggestions?
Intermediate & Advanced SEO | | LukeyJamo0 -
Best way to consolidate link juice
I've got a conundrum I would appreciate your thoughts on. I have a main container page listing a group of products, linking out to individual product pages. The problem I have is the all the product pages target exactly the same keywords as the main product page listing all the products. Initially all my product pages were ranking much higher then the container page, as there was little individual text on the container page, and it was being hit with a duplicate content penality I believe. To get round this, on the container page, I have incorporated a chunk of text from each product listed on the page. However, that now means "most" of the content on an individual product page is also now on the container page - therefore I am worried that i will get a duplicate content penality on the product pages, as the same content (or most of it) is on the container page. Effectively I want to consolidate the link juice of the product pages back to the container page, but i am not sure how best to do this. Would it be wise to rel=canonical all the product pages back to the container page? Rel=nofollow all the links to the product pages? - or possibly some other method? Thanks
Intermediate & Advanced SEO | | James770