Duplicate content warnings

CatfishTPA

I have a ton of duplicate content warnings for my site poker-coaching.net, but I can't see where the duplicate URLs are. Is there a function that shows the original URL alongside the list of other URLs that carry the duplicate content?

CatfishTPA

Thanks for the help. I am trying to cover all bases here. Duplicate content was one concern; the others are excessive link density and bad incoming links.

I have downloaded a full backlinks report now using Majestic SEO (OSE only shows incoming links from 74 domains...).
I think I may have found the problem. I used to have a forum on that domain years ago which was hacked and used for a lot of spammy outgoing links for stuff like Cialis, Viagra, etc. Those guys also linked from other sites to my forum pages. Example:

When I closed the forum and deleted the forum plugin, I redirected all forum pages to my main page, which under the circumstances was a bad mistake, I guess. With the redirect, all those spammy links now end up pointing to my main page, right? So first, I have removed that redirect.

But the problem remains that I still have plenty of links from spam sites pointing to URLs of my domain that do not exist any more.
Is there anything else I can do to remove those links or have Google remove/disregard them, or do you think a reconsideration request explaining the situation would help?

Dr-Pete

Honestly, with only 235 indexed pages, it's pretty doubtful that duplicate content caused you an outright penalty (such as being hit with Panda). Given your industry, it's much more likely you've got a link-based penalty or link quality issue in play.

You do have a chunk of spammy blog comments and some low-value article marketing, for example:

http://undertheinfluence.nationaljournal.com/2010/02/summit-attendees.php

http://www.selfgrowth.com/articles/der-bestebesterhoechstermaximalerneueroptimaler-party-poker-bonus-code-im-internet

A bit of that is fine (and happens in your industry a lot), but when it's too much of your link profile too soon, you could be getting yourself into penalty territory.

Nick_Sayers

Hey There,

Just to clarify, to see the source of those errors, you’ll need to download your Full Crawl Diagnostics CSV and open it up in something like Excel. In the first column, perform a search for the URL of the page you are looking for. When you find the correct row, look in the last column labeled referrer. This tells you the referral URL of the page where our crawlers first found the target URL. You can then visit this URL to find the source of your errors. If you need more help with that, check out this link: http://seomoz.zendesk.com/entries/20895376-crawl-diagnostics-csv
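For anyone who would rather script the lookup than search in Excel, here is a minimal Python sketch of the same steps. It only assumes what is described above: the URL is in the first column and the referrer is in the last column. The function name `find_referrer` and the sample rows are made up for illustration; the real export will have its own header names and more columns.

```python
import csv
import io


def find_referrer(csv_file, target_url):
    """Return the last-column value of the first row whose first column is target_url."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    for row in reader:
        if row and row[0].strip() == target_url:
            return row[-1].strip()
    return None  # URL not present in the export


# Made-up sample standing in for the real Crawl Diagnostics CSV (hypothetical columns).
sample = io.StringIO(
    "URL,Title,Referrer\n"
    "http://www.example.com/a,Page A,http://www.example.com/\n"
    "http://www.example.com/b,Page B,http://www.example.com/a\n"
)
print(find_referrer(sample, "http://www.example.com/b"))
```

With a real export you would pass `open("export.csv", newline="", encoding="utf-8")` instead of the in-memory sample (the file name is a placeholder).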

Hope that helps! I will look at the issue on the back end to see if they are actually duplicate content.

Have a great day,

Nick

CatfishTPA

Thanks for looking into this. Actually I checked the whole site by doing a batch search on Copyscape and there were only minor duplicate content issues. I resolved those by editing the content parts in question (on February 24th 2012).
Since I am desperately searching for the reasons why this site was penalized (and it def is...), it would be great to know why your duplicate content checker finds errors. Could only be related to multiple versions of one page on different URLs. I do have all http://mysitedotcom redirected to www.mysitedotcom, and the trailing slash/notrailingslash URL problem was also resolved by a redirect long ago, so I do not know where the problem lies.
Thanks for the help!
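If the suspicion is multiple versions of one page on different URLs, one rough way to check a crawl export is to reduce every URL to a key that ignores the scheme, a leading "www." and a trailing slash, and then list the keys that more than one URL maps to. The sketch below is only an illustration under those assumptions; `normalize` and `group_variants` are hypothetical helper names, and the example URLs are placeholders.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url):
    """Build a comparison key that ignores scheme, leading 'www.' and trailing slash."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # urlsplit already lowercases the hostname
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    key = host + path
    if parts.query:
        key += "?" + parts.query  # keep the query so distinct pages are not merged
    return key


def group_variants(urls):
    """Return {key: [urls]} for every key reached by more than one distinct URL."""
    groups = defaultdict(list)
    for url in urls:
        if url not in groups[normalize(url)]:
            groups[normalize(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


crawled = [
    "http://example.com/page",
    "http://www.example.com/page/",
    "http://www.example.com/other",
]
for key, variants in group_variants(crawled).items():
    print(key, variants)
```

Any group that shows up here is a set of URLs that should all end up at a single address once the redirects are working.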

Dr-Pete

I think our system has roughly a 90-95% threshold for duplicate content. The pages I'm seeing in your campaign don't look that high, so something is up - I'm checking with support.

For now, use the "Duplicate Page Title" section - that'll tend to give you exact duplicates. The duplicate content detection also covers thin content and near duplicates.

CatfishTPA

Yes that is what I first thought too. If only it were that easy.
But when I do, I see a couple of URLs that definitely do not have any duplicate content. Could it be that the dupe content check considers text in sitewide modules (like the modules "Poker News" and "Tips for ...." in www.poker-coaching.net) as duplicate content, because they appear on all pages?
This way, the duplicate content finding function is totally worthless.
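To illustrate how a similarity threshold in the 90-95% range could flag pages that mostly share sitewide modules, here is a small Python sketch. It is not the crawler's actual algorithm, which this thread does not describe; it uses the standard library's `difflib.SequenceMatcher` on word lists as a stand-in, and the page texts are invented.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Word-level similarity between two texts, from 0.0 to 1.0."""
    matcher = SequenceMatcher(None, text_a.split(), text_b.split(), autojunk=False)
    return matcher.ratio()


def is_near_duplicate(text_a, text_b, threshold=0.90):
    """True if the two texts are at least `threshold` similar (assumed cutoff)."""
    return similarity(text_a, text_b) >= threshold


# Invented example: 95 words of shared sidebar text plus 5 unique words per page.
sidebar = " ".join(f"module{i}" for i in range(95))
page_a = sidebar + " alpha beta gamma delta epsilon"
page_b = sidebar + " one two three four five"

print(similarity(page_a, page_b))
print(is_near_duplicate(page_a, page_b))
```

In this made-up case the two pages have no unique content in common, yet the shared module text alone pushes them over a 90% cutoff. Whether the real checker counts template text this way is exactly the open question above.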

dggusmc

If you drill down into your campaign report into 'Crawl Diagnostics' you will see a dropdown menu named "Show". Select 'Duplicate Page Content' and you will see a graph with a table below it. To the right of the URL you will see a column named "Other URL's". The numbers in that column are live links to a page listing the URLs with duplicate content. At least that is how it is displayed in my campaigns.

Naghirniac

You will find this information in Google Webmaster Tools and in your SEOmoz campaign. There you will find the information you need.

One easy way to avoid this is to include the rel canonical tag. On every page you want treated as the official version, include the following inside the head tag:

<link rel="canonical" href="http://www.example.com/index.html" />

where www.example.com/index.html is your page address. Good luck!
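To confirm that a page really carries a canonical link element in its head, you can parse the HTML and look for a link tag with rel="canonical". The Python sketch below uses only the standard library's `html.parser`; `CanonicalFinder` and `find_canonical` are hypothetical names, and the HTML string is a placeholder rather than a real page.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


page = (
    "<html><head>"
    '<link rel="canonical" href="http://www.example.com/index.html" />'
    "</head><body>Hello</body></html>"
)
print(find_canonical(page))
```

Running this over every URL in a duplicate group shows quickly whether all variants point to the same official address.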
