[Advice] Dealing with an immense URL structure full of canonicals with budget & time constraints
-
Good day to you Mozers,
I have a website that sells a certain product online; once bought, the product is delivered to a specific point of sale (PoS) where the client's car gets serviced.
This website has a shop, products and informational pages that are duplicated by the number of physical PoS. The organizational decision was that every PoS was supposed to have its own little site that could be managed and modified.
Examples are:
- Every PoS could have a different price for its products
- Some of them have more services available and some have fewer, but the content on these service pages doesn't change.
I get over a million URLs that are, supposedly, all treated with canonical tags pointing to their respective main page. The reason I say "supposedly" is that verifying the logic behind the canonicals is proving to be a headache, but I have seen a lot of these pages using the tag.
e.g.:
- https://mysite.com/pointofsale-b/shop → canonical points to https://mysite.com/shop/
- https://mysite.com/pointofsale-b/shop/productA → canonical points to https://mysite.com/shop/productA
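On the pages I have been able to check, the tag in the PoS page's head looks roughly like this (URLs are illustrative, not the real site):

```html
<!-- On the point-of-sale copy, e.g. https://mysite.com/pointofsale-b/shop/productA
     (illustrative URL), the canonical points back to the main shop version. -->
<link rel="canonical" href="https://mysite.com/shop/productA" />
```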
The problem is that I have over a million URLs being crawled, when really less than a tenth of them have any organic traffic potential.
Question is:
For products, I know I should tell them to put the URLs as close to the root as possible and dynamically change the price according to the PoS the end-user chooses, or even redirect all shops to the main one and only use that. But I need a short-term solution to test/show whether it is worth investing in development to correct all these useless duplicate pages. Should I use robots.txt and block off the parts of the site I do not want Google to waste its time on?
I am worried about indexation, accessibility and crawl budget being wasted.
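Something like this rough sketch is what I had in mind for robots.txt (the folder name is a placeholder, and I realise a blocked page can no longer show Google its canonical tag):

```
# robots.txt sketch: keep crawlers away from the per-PoS duplicates
# while the main shop stays crawlable.
# "pointofsale-" is a placeholder for however the PoS folders are named.
User-agent: *
Disallow: /pointofsale-
```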
Thank you in advance,
-
Hey Chris!
Thanks a lot for your time. I did send you a PM the day after your original post; I will send you another :).
Thanks a lot for your additional advice. You're right about managing the client's expectations, and it's crucial. You're pointing out some valid points and I will have to think about how I approach this whole situation.
Charles,
-
Hey Charles,
No problem, I've been out of the office most of the past week so I'm trying to catch up on a few of these now, sorry! I don't recall seeing any PMs either.
"I feel weird recommending shaving 3/4 of a site they put a lot of money into."
That's perfectly normal and I'd have the same reservations. If you do decide to go ahead with it though (and I'm absolutely not looking to push you into a decision either way, just providing the info) you can highlight the fact that paying a lot of money for a website doesn't make it inherently good. If those extra pages are providing no unique value then they're just a hindrance to their long-term goal of earning a return from that site via organic traffic.
It's a conversation we have semi-regularly with new clients. They think that because they just spent $20k on a new site, making changes to it is silly and a waste of the money they invested in the first place. "Sure it's broken, but it was expensive"... I don't think search engines or users really care how much it cost.
"In the eyes of the client, it may come off as bold."
It certainly is bold and, don't be fooled, there is a reasonable chance their rankings will get worse before they get better. In some cases when we perform a cleanup like this we'll see a brief drop before a steady improvement.
This doesn't happen all the time by any means; in fact we did a smaller-scale version of this last week for two new clients and both have already started moving ahead over the weekend without any prior drop in rankings. It's really just about managing expectations and pitching the long-term benefit over the short-term fear.
Just be very careful in the way you project-manage it - be meticulous about updating internal links and 301-redirect any pages that have external links pointing to them as well. You want to end up with a clean, efficient and crawlable website that retains as much value as possible.
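As a rough illustration only (an Apache-style rewrite using placeholder folder names based on your examples), folding the PoS copies of the shop back into the main shop could look something like this:

```apache
# Sketch: permanently redirect every point-of-sale copy of the shop
# to the main shop URL, keeping the rest of the path intact.
# "pointofsale-*" is a placeholder pattern, not your real folder names.
RewriteEngine On
RewriteRule ^pointofsale-[^/]+/shop/(.*)$ /shop/$1 [R=301,L]
```

However you implement it, the key is that every old URL with links or traffic pointing at it ends up at a single clean equivalent rather than a 404.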
"You understand many sets of eyes are on them and there is a lot to gain."
Also a very valid concern!
I'm probably not telling you anything you don't already know anyhow so don't think I'm trying to lecture you on how to do your job, just sharing my knowledge and anecdotal evidence on similar things.
-
Hey Chris!
Thanks for that lengthy response. It is very much appreciated and so is your offer to help. Let me check with some people to see if I can share the company's name.
[EDIT] Sent you a private msg.
One of the reasons I want to test the waters is, to be really honest, that I feel weird recommending shaving 3/4 of a site they put a lot of money into. I guess it comes down to reassuring them that these changes will be positive, but in the eyes of the client, it may come off as bold.
Another thing is, it is an international business that has different teams for different countries. In more than 20 countries, they are the only ones trying to sell their product online. You understand many sets of eyes are on them and there is a lot to gain.
-
Hi Charles,
That's a tough one! I definitely see the motivation to test the waters here first before you go spending time on it, but it will likely take less time than you think and the user experience will be significantly better once you're done, so I'd expect that either way your time/dev investment would be worthwhile.
I suppose you could block certain sections via robots.txt and wait to measure the results, but I'd be more inclined to throw on the gloves and get elbow deep!
You've already mentioned the issues the current structure causes so you are aware of them which is great. With those in mind, focus on the user experience. What is it they're looking for on your site? How would they expect to find it? Can they find the solution with as few clicks as practical?
Rand did a Whiteboard Friday recently on Cleaning up the Cruft, which was a great overview of the broader areas where you can often trim your site back down to size. For me anyway, the aim is to have as few pages on the site as practical. If a page, category, tag, etc. doesn't need to exist then just remove it!
It's hard to say or to give specific advice here without seeing your site but chances are if you were to sit down and physically map out your website you'd find a lot of redundancy that, once fixed, would cut your million pages down to a significantly more manageable number. A recent example of this for us was a client who had a bunch of redundant blog categories and tags as well as multiple versions of some URLs due to poor internal linking. We cut their total URL volume from over 300 to just 78 and that alone was enough to significantly improve their search visibility.
I'd be happy to take a closer look at this one if you're willing to share your URL, though I understand if you're not. Either way, the best place to start here will be reviewing your site structure and seeing if it truly makes sense.