Dulpicate Content being reported
-
Hi
I have a new client whose first MA crawl report is showing lots of duplicate content.
The main batch of these are all the HP url with an 'attachment' part at the end such as:
www.domain.com/?attachment_id=4176
As far as i can tell its some sort of slide show just showing a different image in the main frame of each page, with no other content. Each one does have a unique meta title & H1 though.
Whats the best thing to do here ?
-
Not a problem and leave as is
-
Use the paremeter handling tool in GWT
-
Canonicalise, referencing the HP
or other solution ?
Many Thanks
Dan
-
-
Hi Dan,
Actually it looks like ctrl L will do it (you are creating an excel table). You usually need to erase the first few rows from the export so you have the column header in row 1 and then select all and create the table checking the 'my table has headers' so that you can then filter using the headers
-
Sorry Lynn but what is the 'windows' bit in control-windows-L since cant see on my keyboard, can it have a different icon/symbol etc?
-
Great stuff thanks Lynn !! Ill tell their dev to do that
many many thanks
All Best
Dan
-
cool cheers Don
-
Hi Dan,
The robots must be getting the urls from somewhere so it is worth finding out where. If you download the moz report in csv and open in excel you can control-windows-L to get a filterable list. If you filter for duplicates and find these urls on the left then on the far right it should reference where they are being linked from. I suspect you will find pages in the site that have these images in them and are linking to the attachment_id urls (often it is from gallery pages).
Once you have found the pages, then try applying the yoast redirects and see if they work as expected (ie redirect the attachment_id links to the relevant gallery page for example). Ideally you would get rid of the links completely from the code - this will probably need a bit of dev work on the template but should be pretty straightforward since you are likely just removing the A tag from around the images.
-
Gotcha, definitely don't want to nix pages then. I would imagine Lynn's response is more appropriate then, it is likely that he is using a plugin that has been updated to better SEO practices that he hasn't yet updated.
-
Many thanks Don
ill ask client but dont think so (doubt any links pointing to them) but due to varying kw rich meta titles and h1's think client may have implemented this for some seo reason (hes very seo savvy but bit old school) prob not aware needs more content on page beyond a pic & some meta & an h1.
On a side note do you think these could be dragging sites rankings down (there are 350 of them) ?
All Best
Dan
-
Thanks Lyn
Yes it is wp i think
If i click on the image it loads page with image (another duplicate) in the series next
I'm not sure what the normal page is since can only find these via the cralw reports, they dont seem to be linked to in any site nav etc
Does that sound to you then like best solution is via Yoast redirects etc ?
On a side note do you think these could be dragging sites rankings down (there are 350 of them) ?
Cheers
Dan
-
Hi Dan,
If these pages have no SEO value then you can just stop them from being crawled, thus preventing any duplicate content penalties. If you see some backlinks (SEO value) to any of these then I would use Canonical.
robots.txt
User-agent:: *
Disallow: /*attachment_id
Hope it helps,
Don
-
Hi Dan,
Is the site running wordpress? If so it sounds like maybe a badly coded template which is showing links somewhere in the code to the attachments (if you click on the image in its normal page does it take you to the duplicate url you mention?). It would be best to find out where the linking is happening and correct it so the links are removed if at all possible. The Yoast plugin also has a setting where you can redirect attachment ids to their related post (its in the permalinks settings of the yoast plugin) - that might help solve the problem.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Ignore these external links reported in GWT?
Taking a long, Ace-Ventura-like breath here. This question is loaded. Here we go: No manual actions reported against my client's site in GWT HOWEVER, a link: operator search for external links to my client's website shows NO links in results. That seems like a very bad omen to me. OSE shows 13 linking domains, but not the one that's listed in the next bullet point. Issue: In Google Webmaster Tools, I noticed 1,000+ external links to my client's website all coming from riddim-donmagazine.com (there are a small handful of other domains listed, but this one stuck out for the large quantity of links coming from this domain) Those external links all point to two URLs on my client's website. I have no knowledge of any campaigns run by client that would use this other domain (or any schemes for that matter) It appears that this website riddim-donmagazine.com has been suspended by hostgator All of the links were first discovered last year (dates vary but basically August through December 2013) There have not been any newly discovered links from this website reported by GWT since those 2013 dates All of the external links are /? based. Example: http://riddim-donmagazine.com/?author=1&paged=31 If I run that link in preceding bullet point through http://www.webconfs.com/http-header-check.php, or any others from riddim-donmagazine.com those external links return 302 status. My best guess is at one time the client was running an advertising program and this website may have been on that network. One of the external links points to an ad page on the client's website.
Technical SEO | | EEE3
(web.archive.org confirms this is a WordPress site and that it's coverage of Bronx news could trigger an ad for my client or make it related to my client's website when it comes to demographics.) Believe me, this externally linked domain is only a small problem in comparison with the rest of my client's issues (mainly they've changed domains, then they changed website vendors, etc., etc.), but I did want to ask about that one externally linked domain. Whew.Thanks in advance for insights/thoughts/advice!0 -
Content relaunch without content duplication
We write great Content for blog and websites (or at least we try), especially blogs. Sometimes few of them may NOT get good responses/reach. It could be the content which is not interesting, or the title, or bad timing or even the language used. My question for the discussion is, what will you do if you find the content worth audience's attention missed it during its original launch. Is that fine to make the text and context better and relaunch it ? For example: 1. Rechristening the blog - Change Title to make it attractive
Technical SEO | | macronimous
2. Add images
3. Check spelling
4. Do necessary rewrite, spell check
5. Change the timeline by adding more recent statistics, references to recent writeups (external and internal blogs for example), change anything that seems outdated Also, change title and set rel=cannoical / 301 permanent URLs. Will the above make the blog new? Any ideas and tips to do? Basically we like to refurbish (:-)) content that didn't succeed in the past and relaunch it to try again. If we do so will there be any issues with Google bots? (I hope redirection would solve this, But still I want to make sure) Thanks,0 -
If content is at the bottom of the page but the code is at the top, does Google know that the content is at the bottom?
I'm working on creating content for top category pages for an ecommerce site. I can put them under the left hand navigation bar, and that content would be near the top in the code. I can also put the content at the bottom center, where it would look nicer but be at the bottom of the code. What's the better approach? Thanks for reading!
Technical SEO | | DA20130 -
Duplicate Title and Content. How to fix?
So this is the biggest error I have. But I don't know how to fix it. I get that I have to make it so that the duplicats redirect to the source, but I don't know how to do that. For example, this is out of our crawl diagnostic: | On The Block - Page 3 http://www.maddenstudents.com/forumdisplay.php?57-On-The-Block/page3 1 1 0 On The Block - Page 3 http://www.maddenstudents.com/forumdisplay.php?57-On-The-Block/page3&s=8d631e0ac09b7a462164132b60433f98 | 1 | 1 | 0 | That's just an example. But I have over 1000+ like that. How would I go about fixing that? Getting rid of the "&s=8d631e0ac09b7a462164132b60433f98"? I have godaddy as my domain and web hoster. Could they be able to fix it?
Technical SEO | | taychatha0 -
Is duplicate content ok if its on LinkedIn?
Hey everyone, I am doing a duplicate content check using copyscape, and realized we have used a ton of the same content on LinkedIn as our website. Should we change the LinkedIn company page to be original? Or does it matter? Thank you!
Technical SEO | | jhinchcliffe0 -
How can something be duplicate content of itself?
Just got the new crawl report, and I have a recurring issue that comes back around every month or so, which is that a bunch of pages are reported as duplicate content for themselves. Literally the same URL: http://awesomewidgetworld.com/promotions.shtml is reporting that http://awesomewidgetworld.com/promotions.shtml is both a duplicate title, and duplicate content. Well, I would hope so! It's the same URL! Is this a crawl error? Is it a site error? Has anyone seen this before? Do I need to give more information? P.S. awesomewidgetworld is not the actual site name.
Technical SEO | | BetAmerica0 -
Problem with duplicate content
Hi, My problem is this: SEOmoz tells me I have duplicate content because it is picking up my index page in three different ways: http://www.web-writer-articles.co.uk http://www.web-writer-articles.co.uk/ and http://www.web-writer-articles.co.uk/index.php Can someone give me some advice as to how I can deal with this issue? thank you for your time, louandel15
Technical SEO | | louandel150 -
Copying my content
Hi there, I run a successful e-commerce website, which the product pages are rich with content linking to other products etc, one of our retailers who sell our products I just noticed copied and pasted the content I have written for these product pages leaving in all the links, which it turn are linking back to my product pages, is this a good thing? or should I make that retailer put in canonical tags? Thanks for any help
Technical SEO | | Paul780