Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
What is best practice for "Sorting" URLs to prevent indexing and for best link juice ?
-
We are now introducing 5 links in all our category pages for different sorting options of category listings.
The site has about 100.000 pages and with this change the number of URLs may go up to over 350.000 pages.
Until now google is indexing well our site but I would like to prevent the "sorting URLS" leading to less complete crawling of our core pages, especially since we are planning further huge expansion of pages soon.Apart from blocking the paramter in the search console (which did not really work well for me in the past to prevent indexing) what do you suggest to minimize indexing of these URLs also taking into consideration link juice optimization?
On a technical level the sorting is implemented in a way that the whole page is reloaded, for which may be better options as well.
-
With canonicals, I would not worry about the incoming pages. If the new content is useful and relevant, plus linked to internally, they should do fine in terms of indexation. Use the canonical for now, and once you launch the new pages, well a month after launch, if there are key pages not getting indexed, then you can reassess. The canonical is the right thing to do in this case.
As for link equity, you are right, that is a simplistic view of it. It is actually much more intricate than that, but that's a good basic understanding. However, the canonical is not going to hurt your internal link equity. Those links to the different sorting are navigational in nature and the structure will be repeated throughout the site. Google's algo is good at determining internal, editorial links versus those that are navigational in nature. The navigational links don't impact the strength nearly as much as an editorial link.
My personal belief is that you are worrying about something that isn't going to make an impact on your organic traffic. Ensure the correct canonicals are in place and launch the new content. If that new content has the same issue with sorting, use canonicals there as well and let Google figure it out. "They" have gotten pretty good at identifying what to keep and what not.
If you don't want the sorting pages in there at all, you'll need to do one of the following:
- Noindex, disallow in robots.txt - Rhea Drysdale showed me a few years back that you can do a disallow and noindex in robots. If you do both, Google gets the command to not only noindex the URLs, but also cannot crawl the content.
- Noindex, nofollow using meta robots - This would stop all link equity flow from these pages. If you want to attempt to stop flow to these pages, you'll need to nofollow any links to them. The pages can still be crawled however.
- Noindex, follow - Same as above but internal link equity would still flow. Again, if you want to attempt to cut off link equity to these sorting pages, any links to them would need to be nofollowed.
- Disallow in robots - This would stop them from crawling the content, but the URLs could technically still be indexed.
Personally, I believe trying to manage link equity using nofollow is a waste of time. You more than likely have other things that could be making larger impacts. The choice is yours however and I always recommend testing anything to see if it makes an impact.
-
Kate. The domain has 100.000 pages and will scale to over 1 million unique pages during the next couple of months. I do not want the Sorting URLs have any negative effect on the new indexing of the new 900.000 unique pages in the next months.
Regarding link equity. My simplified understanding of link equity is that if a page has 10 links then each link carries 10% of the total link juice of the page. If now 5 of the 10 links do link to a canonical version of the same page (=sorting URLs), I may be losing out on 50% of the potential link juice the page carries. This is my concern. Therefore my doubt is if I should rather try to hide these sorting URLs from google (same as was also recommended by Rand for facetted navigation pages that one does not consider important for being indexed).
-
Is your issue with crawling or indexing? Those are two separate issues. Why don't you want Google having the canonicals in the index? If you can give me some more insight, I can try to recommend the best option.
And I'm not following your last question. Can you try to ask it another way?
-
Hi Kate, thanks lot. Yes canonical is something we should definetly do and we have implemented.
Still I had the experience in the past that google also indexed lots of canonicalized URLs with near identical content. Any additional step I could do to minimize indexing of these URLs further?
Wouldn't then the basically "self referencing" URLS of sorting links (going to canonicalized versions of the same page) be lost for link equity?
-
This one would need a canonical. For one category page with 5 different sort options, you'd need one canonical URL (one without any sorting or the default sorting) and point all others to that URL using a canonical tag.
https://support.google.com/webmasters/answer/139066?hl=en
Would that work for your setup? If I understand your situation correctly, this should work. It consolidates link equity and allows Google to choose what needs to be indexed and served.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
How to stop URLs that include query strings from being indexed by Google
Hello Mozzers Would you use rel=canonical, robots.txt, or Google Webmaster Tools to stop the search engines indexing URLs that include query strings/parameters. Or perhaps a combination? I guess it would be a good idea to stop the search engines crawling these URLs because the content they display will tend to be duplicate content and of low value to users. I would be tempted to use a combination of canonicalization and robots.txt for every page I do not want crawled or indexed, yet perhaps Google Webmaster Tools is the best way to go / just as effective??? And I suppose some use meta robots tags too. Does Google take a position on being blocked from web pages. Thanks in advance, Luke
Intermediate & Advanced SEO | | McTaggart0 -
Will disallowing URL's in the robots.txt file stop those URL's being indexed by Google
I found a lot of duplicate title tags showing in Google Webmaster Tools. When I visited the URL's that these duplicates belonged to, I found that they were just images from a gallery that we didn't particularly want Google to index. There is no benefit to the end user in these image pages being indexed in Google. Our developer has told us that these urls are created by a module and are not "real" pages in the CMS. They would like to add the following to our robots.txt file Disallow: /catalog/product/gallery/ QUESTION: If the these pages are already indexed by Google, will this adjustment to the robots.txt file help to remove the pages from the index? We don't want these pages to be found.
Intermediate & Advanced SEO | | andyheath0 -
Does Google Index URLs that are always 302 redirected
Hello community Due to the architecture of our site, we have a bunch of URLs that are 302 redirected to the same URL plus a query string appended to it. For example: www.example.com/hello.html is 302 redirected to www.example.com/hello.html?___store=abc The www.example.com/hello.html?___store=abc page also has a link canonical tag to www.example.com/hello.html In the above example, can www.example.com/hello.html every be Indexed, by google as I assume the googlebot will always be redirected to www.example.com/hello.html?___store=abc and will never see www.example.com/hello.html ? Thanks in advance for the help!
Intermediate & Advanced SEO | | EcommRulz0 -
URL Value: Menu Links vs Body Content Links
Hi All, I'm a little confused. I have read a number of articles from authority sites that give mixed signals over the importance of menu links vs body content links. It is suggested that whilst all menu links spread link juice equally, Google does not see them as favourably. Inserting a link within the body will add more link juice value to the desired page. Any thoughts would be appreciated. Thanks Mark
Intermediate & Advanced SEO | | Mark_Ch0 -
Duplicate Content www vs. non-www and best practices
I have a customer who had prior help on his website and I noticed a 301 redirect in his .htaccess Rule for duplicate content removal : www.domain.com vs domain.com RewriteCond %{HTTP_HOST} ^MY-CUSTOMER-SITE.com [NC]
Intermediate & Advanced SEO | | EnvoyWeb
RewriteRule (.*) http://www.MY-CUSTOMER-SITE.com/$1 [R=301,L,NC] The result of this rule is that i type MY-CUSTOMER-SITE.com in the browser and it redirects to www.MY-CUSTOMER-SITE.com I wonder if this is causing issues in SERPS. If I have some inbound links pointing to www.MY-CUSTOMER-SITE.com and some pointing to MY-CUSTOMER-SITE.com, I would think that this rewrite isn't necessary as it would seem that Googlebot is smart enough to know that these aren't two sites. -----Can you comment on whether this is a best practice for all domains?
-----I've run a report for backlinks. If my thought is true that there are some pointing to www.www.MY-CUSTOMER-SITE.com and some to the www.MY-CUSTOMER-SITE.com, is there any value in addressing this?0 -
Best Practice For Company/Client Logo Endorsement
Article: http://searchengineland.com/homepage-sliders-are-bad-for-seo-usability-163496 I came across the following article and somewhat agree with the authors summary.
Intermediate & Advanced SEO | | Mark_Ch
I find sliders a distraction to B2B users and overall offers no SEO benefits. Scenario
As a service provider, over time I have worked with many high profile blue chip comnpanies. As part of my site redesign, I'm looking to show users my client achievements. My initial thoughts are to carry out the following: On the home page I'm looking to incorporate some high profile company logos (similar to http://www.semrush.com) with a hyperlink "more customers" to the right of logo caption. The link will take the user to a dedicated page (www.mydomain.co.uk/customer) showing a comprehensive list of company logos. Questions
#1 Is the above practice good or bad.
#2 Is there a better way to achieve the above Any other practical advise on user experience, social engagement, website speed, etc would be much appreciated. Thanks Mark0 -
Best way to block a sub-domain from being indexed
Hello, The search engines have indexed a sub-domain I did not want indexed its on old.domain.com and dev.domain.com - I was going to password them but is there a best practice way to block them. My main domain default robots.txt says :- Sitemap: http://www.domain.com/sitemap.xml global User-agent: *
Intermediate & Advanced SEO | | JohnW-UK
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /trackback/
Disallow: /feed/
Disallow: /comments/
Disallow: /category//
Disallow: */trackback/
Disallow: */feed/
Disallow: /comments/
Disallow: /?0 -
Is it a bad idea to have a "press" page and link to press mentions of our company?
We've recently been getting quite a bit of press. Would it be wise to create a "press" page and link to mentions of us or would this devalue the links on the press pages as Google may think they reciprocal?
Intermediate & Advanced SEO | | JenniferDacosta0