Bad Duplicate content issue
-
Hi,
For grappa.com I have about 2,700 duplicate page content warnings. My CMS generates long URLs like http://www.grappa.com/deu/news.php/categoria=latest_news/idsottocat=5 and http://www.grappa.com/deu/news.php/categoria%3Dlatest_news/idsottocat%3D5 (these two are flagged as duplicates of each other).
What's the best way to fix this problem? Do I have to set up a 301 redirect for all the duplicated pages, or insert rel=canonical or rel=prev/next?
It's complicated because it's a multilingual site, and it's my first time dealing with this stuff.
Thanks in advance.
-
Your original question had two URLs, one of which had the "=" replaced with "%3D". If that was an actual crawled URL (and not a copy-and-paste error), then it's likely coming from bad links within your own site. That URL is malformed, so you should definitely check it out. A desktop crawler like Xenu or Screaming Frog could help track down the culprit:
http://www.seomoz.org/blog/crawler-faceoff-xenu-vs-screaming-frog
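For anyone who wants a quick check before running a full crawler, here is a minimal Python sketch of the same idea: scan a page's HTML for internal links that contain a percent-encoded "=" (%3D), and confirm that the two URL forms decode to the same string. The function and class names are made up for illustration, and the sample HTML is an assumption built from the URLs in the question, not taken from the live site.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class EncodedLinkFinder(HTMLParser):
    """Collect <a href> values that contain a percent-encoded "=" (%3D)."""

    def __init__(self):
        super().__init__()
        self.suspect_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # %3D and %3d are equivalent, so compare case-insensitively.
            if name == "href" and value and "%3d" in value.lower():
                self.suspect_links.append(value)


def find_encoded_links(html_text):
    """Return every link in html_text whose href contains an encoded "="."""
    finder = EncodedLinkFinder()
    finder.feed(html_text)
    return finder.suspect_links


def same_after_decoding(url_a, url_b):
    """True if the two URLs are identical once percent-encoding is decoded.

    Simplification: decoding the whole URL is fine for "%3D", but it would
    also decode characters such as "%2F" that can change a URL's meaning.
    """
    return unquote(url_a) == unquote(url_b)
```

Running `find_encoded_links` over the templates or pages that link to the news section should show which ones emit the encoded form, and those are the links to fix at the source.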
-
Thanks Peter for the reply!
What do you mean by "bad internal links" ?
I'm well ranked, so based on your suggestions what I have to do is properly set up the rel=canonical and rel=alternate tags, right? I'm still a bit scared about the duplicate content report in the SEOmoz campaign. 2,700 warnings is kind of a big deal.
-
One of these URLs just seems to be the encoded version of the other, which should appear as identical. I'm not seeing any evidence that Google is indexing both. I have a feeling that you may have some bad internal links that need to be fixed. I'm seeing the English/German version of this page in the index, but that should be fine. As Khem said, you could use .
Be careful about converting to a "static" version. It's not that it's a bad idea, but the problem is that you could end up turning 2 duplicates into 3 duplicates. You'll still have to canonicalize the dynamic version to the static version. In other words, done badly, changing your URLs could actually make the problem worse.
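To make the canonicalization point concrete, here is a rough Python sketch of how a page template could choose its rel="canonical" target: every variant of a URL (encoded, dynamic) resolves to one preferred URL. The `preferred` mapping and the static URL in the usage note are hypothetical; the thread does not say what the static URLs would look like.

```python
from html import escape
from urllib.parse import unquote


def canonical_link(requested_url, preferred):
    """Build a rel="canonical" tag for the URL that was requested.

    `preferred` maps a decoded dynamic URL to the URL that should be
    treated as canonical (hypothetical data supplied by the site owner).
    URLs missing from the mapping fall back to their decoded form, so the
    "%3D" variant and the "=" variant always share one canonical target.
    """
    decoded = unquote(requested_url)
    target = preferred.get(decoded, decoded)
    return '<link rel="canonical" href="{}">'.format(escape(target, quote=True))
```

For example, if the dynamic news URL were mapped to an assumed static URL such as http://www.grappa.com/deu/news/latest-news/5, both the "=" form and the "%3D" form would emit the same canonical tag pointing at that static URL, which avoids the "2 duplicates become 3" trap.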
-
Rel=prev/next is for paginated series, such as internal search results. While I see you have a pagination parameter on these pages ("idpagina=13"), it doesn't seem like this is a series or that the two pages are even duplicates. I'm a bit confused on the intent, but my initial reaction is that rel=prev/next doesn't fit the bill here.
-
Since you are managing a multilingual site, it is always recommended to use rel="alternate", even if you're redirecting your website.
As for rel=next/prev, don't use it unless you feel it is really required; I could not see the need for it. Maybe I missed something, so could you please be a bit more specific?
-
Thanks Raj! I will for sure rewrite the dynamic URLs into static ones, and that's a starting point. Take for example this page:
http://www.grappa.com/eng/grappa.php/argomento=grappa_in_italy/idsezione=1/idpagina=13
Do you suggest using rel=next/prev in this case?
I thought about using rel="alternate" for the multilingual issue, but right now my site redirects automatically from www.grappa.com to www.grappa.com/eng/index.php. Is that bad for SEO? Should I put a rel="canonical" pointing to www.grappa.com?
Many thanks
-
Hey Nicola, ~2,700 is a huge number.
I would suggest you talk to your programmer/developer about rewriting the dynamic URLs into static ones, which I am sure they can easily do.
Second, make sure to delete all the duplicate pages or use rel="nofollow". Using 301 redirects for all the duplicate pages is not a bad option, but it is not a permanent solution. It is better to rewrite all the dynamic URLs into static ones, delete all the duplicate pages, and then 301 redirect the deleted pages to the originals.
For the multilingual side, you can use rel="alternate" hreflang annotations.
The tag enables you to say, "This is for Spain, this is for Germany."
The rel="alternate" hreflang="es" annotations help Google serve the Spanish language or regional URL to searchers.
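The code from the original post did not survive on this page, so as a stand-in here is a small Python sketch that generates rel="alternate" hreflang tags for each language version of a page. The language codes and the German URL are assumptions for illustration (only the /eng/index.php URL appears in this thread), and `hreflang_links` is a made-up helper name.

```python
from html import escape


def hreflang_links(alternates):
    """Return one <link rel="alternate" hreflang="..."> tag per language.

    `alternates` maps a language (or language-region) code to the URL of
    that version of the page. The full set of tags is meant to go in the
    <head> of every language version, including a tag for the page itself.
    """
    return [
        '<link rel="alternate" hreflang="{}" href="{}">'.format(
            escape(lang, quote=True), escape(url, quote=True)
        )
        for lang, url in alternates.items()
    ]


# Hypothetical language versions of the home page.
home_page_versions = {
    "en": "http://www.grappa.com/eng/index.php",
    "de": "http://www.grappa.com/deu/index.php",
}

for tag in hreflang_links(home_page_versions):
    print(tag)
```

Generating the tags from a single mapping keeps the annotations identical across language versions, which matters because each version is expected to list all the others.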