Strange Behavior - Dupe Content Via Query String URLs?
-
Hey y'all, could use community help with some strange behavior I'm seeing with a particular ranking.
A week ago, a high-volume keyword that ranked above the fold dropped off the map. I immediately thought it must be an algorithmic Penguin penalty (no manual action message) or a Panda / dupe content issue. I think it's dupe content at this point because I found my former ranking page in the omitted results section for the keyword we used to rank for.
The strange thing is that without making any changes, Google would momentarily show our domain ranking high page one again, but with a strange query string URL. At first just domain.com/page/? whereas the old ranking was held by domain.com/page/ but now I see several long query string URLs floating around that the engines don't seem to know what to do with. Canonical tags are in place to canonicalize any query string URL back to the top and I have now designated query string URLs as unimportant in Search Console parameter filtering but these URLs persist.
I ended up deduplicating content to a page on another domain we own (think that was the original problem) and there seemed to be a positive effect but now we are top of page 2 with a much longer query string URL as the ranking page. It seems Google wants to rank everything but the former ranking URL even though it's the most authoritative by far, has canonical signals in place, and is now no longer duplicate content. Content checker tool showed 60% similarity to the other piece, which is a ratio I've never known to cause dupe content.
We traced the query string URLs to an external site that links to us. It's a buggy site: filtering on their page appends the string to our URL, so Google can find these URLs and thinks they're significant.
Long question short, has anyone had trouble like this, getting weird parameter / query URLs out of the index in favor of the non-parameter folder? Is it possible the main folder page got hit with Penguin and is "banned"? Still, I don't know why Google would go out of its way to rank query string copy pages in its place if that were the case. Any help greatly appreciated.
An example of the URL looks like this:
domain.com/page/?CustomerSubscriptionTrack1PageSize=1&CustomerSubscriptionTrack1Order=Sorter_ID&CustomerSubscriptionTrack1Dir=ASC&CustomerSubscriptionTrack1Page=3&WorkOrder_TBLOrder=Sorter_AssetID&WorkOrder_TBLDir=ASC&ID=106
-
Hey James, sorry to hear you're getting blasted by negative links and appreciate your responses here.
I actually sorted this one out (fingers crossed it stays that way) by having the dev team implement a redirect rule that 301 redirects any query string back to the folder we want ranking. Similar signal to what the canonical tag would send but in my opinion a stronger signal since there is no longer a way to reach those weird query string URLs with a 200 response.
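For anyone wanting to picture the rule, here is a minimal Python sketch of the logic only, not the actual rule our dev team wrote. The function name redirect_target is made up for illustration, and it assumes (as in our case) that no query string on the site is ever needed. It also catches the bare domain.com/page/? form.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url):
    """Return the query-free URL to 301 to, or None if no redirect is needed.

    Hypothetical helper for illustration; assumes the site never needs
    query strings, so every "?" variant maps back to the bare path.
    """
    # Fragments never reach the server, so ignore anything after "#".
    without_fragment = url.split("#", 1)[0]
    if "?" not in without_fragment:
        return None
    parts = urlsplit(without_fragment)
    # Rebuild the URL with an empty query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    for candidate in (
        "https://www.domain.com/page/",
        "https://www.domain.com/page/?",
        "https://www.domain.com/page/?WorkOrder_TBLDir=ASC&ID=106",
    ):
        print(candidate, "->", redirect_target(candidate))
```

In a real setup the server or framework would answer with a 301 status and a Location header set to the returned URL whenever the result is not None.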
Once that was implemented the appropriate page was right back to its old high ranking position and the query strings are hardly to be seen in the index and are no longer preferred to the old ranking page - so looks like all is right with the world again.
We also disavowed the domain that was the source of many of the query string URLs. I don't think it was a case of negative SEO - just bad coding on their side. I'm not sure what exactly did the trick, but I strongly suspect the 301 redirects are what solidified the index, given the strong correlation of that change with the ranking recovery.
Maybe you can employ a similar solution whereby you can disavow domains where these links originate or set up server side handling to manage URLs of a specific pattern - for example, any URL containing "pornsite.com" if not any query string altogether (in our case we don't have any use for query strings in our URLs so just bagged them all).
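If dropping every query string is too blunt for your site, the pattern-based version could look roughly like the sketch below. The pattern list and the needs_cleanup name are assumptions for illustration: "pornsite.com" is just the placeholder from above, and the second pattern is the parameter prefix from my example URL.

```python
import re
from urllib.parse import urlsplit

# Hypothetical patterns; swap in whatever shows up in your own logs.
BLOCKED_PATTERNS = [
    re.compile(r"pornsite\.com", re.IGNORECASE),
    re.compile(r"CustomerSubscriptionTrack1", re.IGNORECASE),
]


def needs_cleanup(url, patterns=BLOCKED_PATTERNS):
    """Return True if the URL's path or query string matches a blocked pattern."""
    parts = urlsplit(url)
    # Only inspect the path and query; the host is our own domain.
    target = parts.path + "?" + parts.query
    return any(pattern.search(target) for pattern in patterns)


if __name__ == "__main__":
    print(needs_cleanup("https://www.domain.com/page/?ref=pornsite.com"))
    print(needs_cleanup("https://www.domain.com/page/?utm_source=newsletter"))
```

URLs that match would then get the 301 (or a 410, depending on what you want), while legitimate parameters pass through untouched.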
Thanks again,
Matt
-
Thanks for the response, James. The odd thing is that canonical tags are implemented correctly as far as I can tell. In the <head> of each variation you can find the following code:
<link rel="canonical" href="https://www.domain.com/page/" />
(still using my example so as to keep the site anonymous)
And this code had been in place well before the issue arose. So yes, we are sending that signal to Google to apply canonical back to the top in every case, without query string.
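One way to double-check that claim across many variations is to pull the canonical out of each page's HTML. This is a rough sketch using Python's standard html.parser; the CanonicalFinder class and find_canonicals function are names I made up, and fetching the pages themselves is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags are also routed here by HTMLParser.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return all canonical hrefs found in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


if __name__ == "__main__":
    sample = (
        '<html><head><link rel="canonical" '
        'href="https://www.domain.com/page/" /></head><body></body></html>'
    )
    print(find_canonicals(sample))
```

Every query string variation should return exactly one entry, and it should be the bare folder URL; an empty list or more than one entry would mean the signal is missing or conflicting.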
Not sure what you're confused by in Search Console - the platform provides a tool to deal with parameter URLs just like the ones I'm seeing. I used it to mark all parameter URLs as not changing content, which should designate to engines to exclude them from the index.