Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Best-practice URL structures with multiple filter combinations
-
Hello,
We're putting together a large piece of content that will have some interactive filtering elements. There are two types of filters, topics and object types.
The architecture under the hood constrains us so that everything needs to be in URL parameters. If someone selects a single filter, this can look pretty clean:
www.domain.com/project?topic=firstTopic
or
www.domain.com/project?object=typeOne
The problems arise when people select multiple topics, potentially across two different filter types:
www.domain.com/project?topic=firstTopic-secondTopic-thirdTopic&object=typeOne-typeTwo
I've raised concerns around the structure in general, but it seems to be too late at this point so now I'm scratching my head thinking of how best to get these indexed. I have two main concerns:
- A ton of near-duplicate content and hundreds of URLs being created and indexed with various filter combinations added
- Over-reacting to the first point above and over-canonicalizing/no-indexing combination pages to the detriment of the content as a whole
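To put a rough number on the first concern, here is a minimal sketch that counts how many distinct filtered URLs the two filter types can produce. The function name and the example counts of topics and object types are assumptions for illustration; the original post does not say how many of each exist.

```python
def filtered_url_count(num_topics: int, num_object_types: int) -> int:
    """Count distinct non-empty filter selections, ignoring the order of values."""
    # Each topic is either selected or not, so there are 2**n topic selections.
    topic_selections = 2 ** num_topics
    # The same applies to object types.
    object_selections = 2 ** num_object_types
    # Subtract 1 to exclude the unfiltered page (nothing selected at all).
    return topic_selections * object_selections - 1


if __name__ == "__main__":
    # Hypothetical sizes: 5 topics and 3 object types.
    print(filtered_url_count(5, 3))
    # Hypothetical sizes: 8 topics and 4 object types.
    print(filtered_url_count(8, 4))
```

With the hypothetical 5 topics and 3 object types this gives 255 filtered URLs, and 8 topics with 4 object types gives 4095. If the URL also preserves the order in which values were clicked (for example topic=a-b versus topic=b-a), the real number would be higher still.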
Would the best approach be to index each single topic filter individually, and canonicalize any combinations to the 'view all' page? I don't have much experience with e-commerce SEO (which this problem seems to have the most in common with) so any advice is greatly appreciated.
Thanks!
-
Thanks for the detailed answer Jonathan. What you suggested was definitely in line with my thinking - indexing just the single topics at most and trying to either noindex or canonicalize all the thousands of possible variations. I definitely agree that all those random combinations of topics/objects hold no real value and at best will eat up crawl budget unnecessarily.
I can make sure Google treats these parameters as URLs via Search Console, since they're unique to this piece of content, and I think I can noindex all the random combinations of filters (hopefully).
I'm still waiting to hear more from the dev team but I have a feeling that I won't be able to change the format to subdirectories instead of differentiating everything with query parameters - not the ideal situation but I'll have to make do.
Anyways, thanks again for your thoughtful reply!
Josh
-
Google is supposed to disregard everything after the ? in the query string when indexing. However, I know query strings will sometimes get indexed if the content at the URL with the query string appears different enough to Google. So I agree with your motive to get these dynamic URLs simplified.
From what I have read on similar scenarios, my first thought is: do these filtered view pages benefit searchers? Typically it benefits searchers to index maybe the category level of pages. In your instance, this may be the first topic. But once URLs start referencing very specific content that one user was filtering for, I would probably suggest a noIndex meta tag. There should be a scalable solution to this so you don't have to individually go into every possible filtered page and add noIndex to the head.
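One way such a scalable rule could look is sketched below: a single function that inspects the query string and picks the robots meta tag for the page template. The parameter names (topic, object) and the "-" separator come from the URLs in the question; the function names and the policy of indexing only pages with at most one selected filter value are assumptions, not something the site is known to do.

```python
from urllib.parse import urlsplit, parse_qs

FILTER_PARAMS = ("topic", "object")  # parameter names taken from the question
VALUE_SEPARATOR = "-"                # separator used in the example URLs


def selected_filters(url: str) -> dict:
    """Return the selected filter values in the URL, keyed by parameter name."""
    query = parse_qs(urlsplit(url).query)
    selected = {}
    for name in FILTER_PARAMS:
        raw = query.get(name, [""])[0]
        # Split "firstTopic-secondTopic" into its individual values.
        values = [v for v in raw.split(VALUE_SEPARATOR) if v]
        if values:
            selected[name] = values
    return selected


def robots_meta(url: str) -> str:
    """Assumed policy: index the unfiltered page and single-filter pages only."""
    total = sum(len(values) for values in selected_filters(url).values())
    content = "index, follow" if total <= 1 else "noindex, follow"
    return f'<meta name="robots" content="{content}">'


if __name__ == "__main__":
    print(robots_meta("www.domain.com/project?topic=firstTopic"))
    print(robots_meta(
        "www.domain.com/project?topic=firstTopic-secondTopic-thirdTopic"
        "&object=typeOne-typeTwo"
    ))
```

Note that this sketch assumes filter values never contain a hyphen themselves; if they can, the split would need a different separator.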
If there is a specific filtered view you believe may benefit searchers, or you have already seen a demand for, I would suggest making this a page using subfolders:
www.domain.com/project/firstTopic/typeOne
and noIndexing all the crazy dynamically generated query string URLs. This should allow you to seize opportunities where you see search demand and alleviate any duplicate content risks.
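As a rough illustration of that idea, the sketch below keeps a small allowlist of filter combinations that deserve their own subfolder page and looks up whether a parameterized URL matches one. The allowlist contents, the names PROMOTED_PAGES and promoted_path, and the single example entry are all hypothetical; only the URL shapes come from this thread.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical allowlist: filter combinations with known search demand,
# mapped to the clean subfolder page created for them.
PROMOTED_PAGES = {
    ("firstTopic", "typeOne"): "/project/firstTopic/typeOne",
}


def promoted_path(url: str):
    """Return the subfolder path for a promoted combination, else None."""
    query = parse_qs(urlsplit(url).query)
    topic = query.get("topic", [""])[0]
    obj = query.get("object", [""])[0]
    # Only exact matches on the allowlist get a dedicated page.
    return PROMOTED_PAGES.get((topic, obj))


if __name__ == "__main__":
    print(promoted_path("www.domain.com/project?topic=firstTopic&object=typeOne"))
    print(promoted_path("www.domain.com/project?topic=firstTopic-secondTopic&object=typeOne"))
```

What to do with a match (redirect to the subfolder page, or point a canonical at it) is a separate decision that this sketch leaves open; every URL that returns None would stay noIndexed.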
If you don't want to noIndex, I would gauge the quality of these nitty gritty filtered pages, and if you see value in them, I would agree canonicalizing to the preceding category page sounds like a good solution.
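A canonical rule along these lines might be sketched as follows. It assumes the "preceding category page" is the unfiltered project page (the 'view all' page mentioned in the question), and that pages with a single filter value keep a self-referencing canonical; both choices, and the function names, are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl

FILTER_PARAMS = ("topic", "object")  # parameter names taken from the question
VALUE_SEPARATOR = "-"                # separator used in the example URLs


def canonical_url(url: str) -> str:
    """Return the URL a filtered page should declare as canonical."""
    parts = urlsplit(url)
    # Collect every selected filter value across both filter types.
    values = [
        value
        for name, raw in parse_qsl(parts.query)
        if name in FILTER_PARAMS
        for value in raw.split(VALUE_SEPARATOR)
        if value
    ]
    if len(values) <= 1:
        return url  # unfiltered or single-filter page: self-referencing
    # Combination page: drop the query string and point at the base page.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the tag to place in the page head."""
    return f'<link rel="canonical" href="{canonical_url(url)}">'


if __name__ == "__main__":
    print(canonical_link_tag(
        "https://www.domain.com/project?topic=firstTopic-secondTopic&object=typeOne"
    ))
```

The example URLs here carry an https:// prefix so the output is a complete absolute URL, which is what a canonical tag would normally contain.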
I think this article does a good job explaining this. It suggests that if your filters are just narrowing content on the page rather than changing it, you should noIndex or canonicalize (although the author does remind you that canonicalization is only a suggestion to Google and is not followed 100% of the time in all scenarios).
I hope this helps, and if you don't see how these solutions would be implemented on your site, this issue may require some dev help.