Duplicate content warnings
-
I have a ton of duplicate content warnings for my site poker-coaching.net, but I can't see where there are duplicate URLs. I cannot find any function where I could check the original URL vs a list of other URLs where the duplicate content is?
-
thanks for the help. I am trying to cover all bases here. Duplicate content was one concern, the other one is too high link density and bad incoming links.
I have downloaded a full backlinks report now using Majestic SEO (OSE only shows incoming links from 74 domains...).
I think I may have found the problem. I used to have a forum on that domain years ago which was hacked and used for a lot of spammy outgoing links for stuff like Cialis, Viagra etc.. Those guys also linked from other sites to my forum pages. Example:| from: http://www.grupoibira.com/foro/viewto...
| Anchor: buy flagyl in lincolnu... | 3 | to: http://www.poker-coaching.net/phpbb3/...
|
When I closed the forum and deleted the forum plugin, I redirected all forum pages to my main page which, under the circumstances was a bad mistake I guess. Because with the redirect, all those spammy links now end up pointing to my main page, right? So first, I have removed that redirect now.
But the problem remains that I still have plenty of links from spam sites pointing to URLs of my domain that do not exist any more.
Is there anything else I can do to remove those links or have Google remove/disregard them, or do you think a reconsideration request explaining the situation would help? -
Honestly, with only 235 indexed pages, it's pretty doubtful that duplicate content caused you an outright penalty (such as being hit with Panda). Given your industry, it's much more likely you've got a link-based penalty or link quality issue in play.
You do have a chunk of spammy blog comments and some low-value article marketing, for example:
http://undertheinfluence.nationaljournal.com/2010/02/summit-attendees.php
A bit of that is fine (and happens in your industry a lot), but when it's too much of your link profile too soon, you could be getting yourself into penalty territory.
-
Hey There,
Just to clarify, to see the source of those errors, you’ll need to download your Full Crawl Diagnostics CSV and open it up in something like Excel. In the first column, perform a search for the URL of the page you are looking for. When you find the correct row, look in the last column labeled referrer. This tells you the referral URL of the page where our crawlers first found the target URL. You can then visit this URL to find the source of your errors. If you need more help with that, check out this link: http://seomoz.zendesk.com/entries/20895376-crawl-diagnostics-csv
Hope that helps! I will look at the issue on the back end to see if they are actually duplicate content.
Have a great day,
Nick
-
Thanks for looking into this. Actually I checked the whole site by doing a batch search on Copyscape and there were only minor duplicate content issues. I resolved those by editing the content parts in question (on February 24th 2012).
Since I am desperately searching for the reasons why this site was penalized (and it def is...), it would be great to know why your duplicate content checker finds errors. Could only be related to multiple versions of one page on different URLs. I do have all http://mysitedotcom redirected to www.mysitedotcom, and the trailing slash/notrailingslash URL problem was also resolved by a redirect long ago, so I do not know where the problem lies.
Thanks for the help! -
I think our system has roughly a 90-95% threshold for duplicate content. The pages I'm seeing in your campaign don't look that high, so something is up - I'm checking with support.
For now, use the "Duplicate Page Title" section - that'll tend to give you exact duplicates. The duplicate content detection also covers thin content and near duplicates.
-
Yes that is what I first thought too. If only it were that easy.
But when I do, I see a couple of URLs that definitely do not have any duplicate content . Could it be that the dupe content check considers text in sitewide modules (like the modules "Poker News" and "Tips for ...." in www.poker-coaching.net) as duplicate content, because they appear on all pages?
This way, the duplicate content finding function is totally worthless. -
If you drill down into your campaign report into 'Crawl Diagnostics' you will see a dropdown menu that's named "Show". Select 'Duplicate Page Content'... you will see a graph with a table below it. To the right of the URL you will see a column named "Other URL's". The numbers in that column are live links to a page with the list of URL's with duplicate content. At least that is how it is displayed in my campaigns.
-
You will find this information at google webmaster tools and at seomoz campaing. There you will the information you need.
One easy way to avoid this is to include the rel canonical metag. You need to include in every page (pages you want to be the official one) inside the head tag the follow:
where ww.example.com/index.html is your page adress. Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Metadata and duplicate content issues
Hi there: I'm seeing a steady decline in organic traffic, but at the same time and increase in pageviews and direct traffic. My site has about 3,000 crawl errors!! Errors are duplicate content, missing description tags, and description too long. Most of these issues are related to events that are being imported from Google calendars via ical and the pages created from these events. Should we block calendar events from being crawled by using the disallow directive in the robots.txt file? Here's the site: https://www.landmarkschool.org/
Reporting & Analytics | | BGR0 -
Hello, our domain authority dropped significantly overnight from 37 to 29\. We have been building good links from high DA pages and producing quality, regular content.
Hello, our domain authority dropped significantly overnight from 37 to 29. We have been building good links from high DA sites and producing regular, good quality content. Anyone able to offer any ideas why? Thanks
Reporting & Analytics | | ProMOZ1231 -
Ecommerce, Product Content & Google Metrics
Hi I know Google has many different variations of what they consider to be thin content. I wondered if anyone has an idea of the best metric to determine what content you need to improve on your site? I work on a large e-commerce site so there are a thousands of product pages - all with product descriptions similar [but not duplicate] to competitors. I guess in terms of quantity, these pages don't have huge amounts of written content, so I'm wondering what Google classes as 'thin' on a product page: 1. Does Google just expect a conversion to deem that product page useful? And if not, what's the best metric to identify what works vs. what doesn't on product pages in Google's eyes. 2. If adding lots of product pages on mass is bad and will decrease overall authority? The content isn't duplicate, but may be fairly similar to other sites selling the same thing. I'm trying to get our reviews added directly to product pages rather than in a pop up to improve the unique content and I'm starting to write guides, FAQ's and I'll work towards getting video started - however, I'm the only SEO & we don't have much resource so this all takes time. If anyone else has any advice on steps to take that would be great 🙂
Reporting & Analytics | | BeckyKey0 -
Google Analytics Content Experiments
Has anyone else found that Google Analytics Content Experiments seems to quite quickly favor the best performing variant in an experiment, and then show that variant many times more often than other/s - not split the traffic evenly? What is Google's thinking behind 'optimizing' during an experiment? It seems odd to me.
Reporting & Analytics | | David_ODonnell0 -
When will traffic data be working ? also whats with the spike in duplicate listing issues with everyone.
Hi There, We have no traffic data, is this something we are doing wrong or is this an issue with SEOMOZ ? Also duplicate listings have gone sky high, check goggle analytics's and all ok ? Any answers ? Thanks Charlie
Reporting & Analytics | | pro580 -
How can you tell if your new content has been indexed?
Other than simply doing a search in each case, is there any way I can tell (in Webmaster Tools, for example) if the 500-1000 new pages of content I have added have been indexed and are now appearing in search results? My traffic hasn't risen much, but I know at least a few of them are in there... How can I tell when they're all in?
Reporting & Analytics | | corp08030 -
Duplicate content? Split URLs? I don't know what to call this but it's seriously messing up my Google Analytics reports
Hi Friends, This issue is crimping my analytics efforts and I really need some help. I just don't trust the analytics data at this point. I don't know if my problem should be called duplicate content or what, but the SEOmoz crawler shows the following URLS (below) on my nonprofit's website. These are all versions of our main landing pages, and all google analytics data is getting split between them. For instance, I'll get stats for the /camp page and different stats for the /camp/ page. In order to make my report I need to consolidate the 2 sets of stats and re-do all the calculations. My CMS is looking into the issue and has supposedly set up redirects to the pages w/out the trailing slash, but they said that setting up the "ref canonical" is not relevant to our situation. If anyone has insights or suggestions I would be grateful to hear them. I'm at my wit's end (and it was a short journey from my wit's beginning ...) Thanks. URL www.enf.org/camp www.enf.org/camp/ www.enf.org/foundation www.enf.org/foundation/ www.enf.org/Garden www.enf.org/garden www.enf.org/Hante_Adventures www.enf.org/hante_adventures www.enf.org/hante_adventures/ www.enf.org/oases www.enf.org/oases/ www.enf.org/outdoor_academy www.enf.org/outdoor_academy/
Reporting & Analytics | | DMoff0