Duplicate URL Parameters for Blog Articles
-
Hi there,
I'm working on a site that uses parameter URLs for category pages that list blog articles.
The content on these pages changes constantly as new posts are added frequently. A category may be, for example, 'Health Articles' and list 10 blog posts (snippets from the blog). With filtering, the URL could look like this:
-
www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general
-
www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016
-
www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016&page=1
-
All pages currently have the same meta title and description due to limitations of the CMS. They are also not in our XML sitemap.
I don't believe we should focus on ranking these pages, since their content comes from blog posts (which we do want to rank as individual posts), but there are 3000 duplicates and they need to be fixed.
Below are the options we have so far:
Canonical URLs
Have all parameter pages within the category canonicalize to www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general and generate dynamic page titles (I know it's a good idea to use parameter pages in canonical URLs).
WMT Parameter tool
Tell Google all extra parameter tags belong to the main pages (e.g. www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general&year=2016&page=3 belongs to www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general).
Noindex
Remove all the blog category pages from the index. I don't know how Google would react if we removed 3000 pages from the index (we have roughly 1700 unique pages).
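To make the canonical option concrete, here is a minimal Python sketch of how the canonical target could be derived from any filtered URL. It assumes that only `taxonomy` and `taxon` define a distinct category page and that everything else (`year`, `page`) is a filter. The function names and the `https://` scheme are illustrative assumptions, not something the CMS provides.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: only these parameters define a distinct category listing;
# any other parameter (year, page, ...) is treated as a filter and dropped.
CANONICAL_PARAMS = ("taxonomy", "taxon")


def canonical_url(url: str) -> str:
    """Return the URL with only the category-defining parameters kept."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in CANONICAL_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link rel="canonical"> element for the page <head>."""
    # escape() turns "&" into "&amp;" so the attribute value is valid HTML.
    return f'<link rel="canonical" href="{escape(canonical_url(url))}">'


filtered = (
    "https://www.domain.com/blog/articles/"
    "?taxonomy=health-articles&taxon=general&year=2016&page=3"
)
print(canonical_url(filtered))
# https://www.domain.com/blog/articles/?taxonomy=health-articles&taxon=general
```

Using an allow-list of parameters rather than a block-list means any filter added later is stripped from the canonical URL by default.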
We are very limited in what we can do to these pages, so any feedback or suggestions would be much appreciated.
Thanks!
-
-
Unfortunately, it's hard to say these days whether they respect the scroll effect there.
-
Thanks Martijn,
That sounds like a good idea. We were also considering a JavaScript loading option where we remove the pagination and load content on scroll, but I am still 50/50 on whether hidden content like this is crawled or ignored.
-
Thanks Anthony,
We are using rel=prev/next on the pagination for these blog pages, which does reduce duplication, but because of the parameter filters we still have thousands of duplicates.
That's a good point about the indexing of older blogs!
-
I would simply set up rel=next/prev on the paginated series and not worry so much about duplicate title tags or canonical tags. You want to make sure Google continues to crawl deep into your blog pagination and can access older blog posts.
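For illustration, a minimal Python sketch of generating the rel=prev/next tags for one page of a paginated series. It assumes the page number lives in a `page` query parameter, as in the example URLs in the question, and that the total number of pages is known. The helper names are hypothetical.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_page(url: str, page: int) -> str:
    """Return the URL with its `page` query parameter set to `page`."""
    parts = urlsplit(url)
    # Drop any existing page parameter, keep all other filters as they are.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "page"]
    query.append(("page", str(page)))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


def pagination_link_tags(url: str, page: int, last_page: int) -> list[str]:
    """Build the rel=prev/next <link> tags for page `page` of `last_page`."""
    tags = []
    if page > 1:  # the first page has no predecessor
        tags.append(f'<link rel="prev" href="{escape(with_page(url, page - 1))}">')
    if page < last_page:  # the last page has no successor
        tags.append(f'<link rel="next" href="{escape(with_page(url, page + 1))}">')
    return tags
```

Each tag keeps the other filters (`taxonomy`, `taxon`, `year`) unchanged, so every filtered listing forms its own series.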
-
Hi,
What I would do is use both the canonical URLs and the Google Search Console parameter settings. The canonical URLs make sure the pages won't be seen as duplicates, and in addition you might want to make sure that Google isn't visiting these pages at all, to save your crawl budget for the more important pages on your site.
Martijn.