Dulpicate Content being reported
-
Hi
I have a new client whose first MA crawl report is showing lots of duplicate content.
The main batch of these are all the HP url with an 'attachment' part at the end such as:
www.domain.com/?attachment_id=4176
As far as i can tell its some sort of slide show just showing a different image in the main frame of each page, with no other content. Each one does have a unique meta title & H1 though.
Whats the best thing to do here ?
-
Not a problem and leave as is
-
Use the paremeter handling tool in GWT
-
Canonicalise, referencing the HP
or other solution ?
Many Thanks
Dan
-
-
Hi Dan,
Actually it looks like ctrl L will do it (you are creating an excel table). You usually need to erase the first few rows from the export so you have the column header in row 1 and then select all and create the table checking the 'my table has headers' so that you can then filter using the headers
-
Sorry Lynn but what is the 'windows' bit in control-windows-L since cant see on my keyboard, can it have a different icon/symbol etc?
-
Great stuff thanks Lynn !! Ill tell their dev to do that
many many thanks
All Best
Dan
-
cool cheers Don
-
Hi Dan,
The robots must be getting the urls from somewhere so it is worth finding out where. If you download the moz report in csv and open in excel you can control-windows-L to get a filterable list. If you filter for duplicates and find these urls on the left then on the far right it should reference where they are being linked from. I suspect you will find pages in the site that have these images in them and are linking to the attachment_id urls (often it is from gallery pages).
Once you have found the pages, then try applying the yoast redirects and see if they work as expected (ie redirect the attachment_id links to the relevant gallery page for example). Ideally you would get rid of the links completely from the code - this will probably need a bit of dev work on the template but should be pretty straightforward since you are likely just removing the A tag from around the images.
-
Gotcha, definitely don't want to nix pages then. I would imagine Lynn's response is more appropriate then, it is likely that he is using a plugin that has been updated to better SEO practices that he hasn't yet updated.
-
Many thanks Don
ill ask client but dont think so (doubt any links pointing to them) but due to varying kw rich meta titles and h1's think client may have implemented this for some seo reason (hes very seo savvy but bit old school) prob not aware needs more content on page beyond a pic & some meta & an h1.
On a side note do you think these could be dragging sites rankings down (there are 350 of them) ?
All Best
Dan
-
Thanks Lyn
Yes it is wp i think
If i click on the image it loads page with image (another duplicate) in the series next
I'm not sure what the normal page is since can only find these via the cralw reports, they dont seem to be linked to in any site nav etc
Does that sound to you then like best solution is via Yoast redirects etc ?
On a side note do you think these could be dragging sites rankings down (there are 350 of them) ?
Cheers
Dan
-
Hi Dan,
If these pages have no SEO value then you can just stop them from being crawled, thus preventing any duplicate content penalties. If you see some backlinks (SEO value) to any of these then I would use Canonical.
robots.txt
User-agent:: *
Disallow: /*attachment_id
Hope it helps,
Don
-
Hi Dan,
Is the site running wordpress? If so it sounds like maybe a badly coded template which is showing links somewhere in the code to the attachments (if you click on the image in its normal page does it take you to the duplicate url you mention?). It would be best to find out where the linking is happening and correct it so the links are removed if at all possible. The Yoast plugin also has a setting where you can redirect attachment ids to their related post (its in the permalinks settings of the yoast plugin) - that might help solve the problem.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Identify hidden content + Question of impact
I have a client that had a web agency which systematically implemented hidden content. Is there a tool that will compare source content vs what is readable on from the browser? Otherwise, can you recommend articles that focus on doing this? What are the impact on hidden content these days? Another client had hidden content on the first page and when we took over them they were ranking nr 2 with the home page on their brand name (nr 1 with a sub page). We have checked the link profile with Majestic, Ahrefs and Moz and nothing spammy comes up so I doubt it is Penguin related.
Technical SEO | | OscarSE0 -
Sitemap For Static Content And Blog
We'll be uploading a sitemap to google search console for a new site. We have ~70-80 static pages that don't really chance much (some may change as we modify a couple pages over the course of the year). But we have a separate blog on the site which we will be adding content to frequently. How can I set up the sitemap to make sure that "future" blog posts will get picked up and indexed. I used a sitemap generator and it picked up the first blog post that's on the site, but am wondering what happens with future ones? I don't want to resubmit a new sitemap each time that has a link to a new blog post we posted.
Technical SEO | | vikasnwu0 -
Duplicate Content Brainstorming
Hi, New here in the SEO world. Excellent resources here. We have an ecommerce website that sells presentation templates. Today our templates come in 3 flavours - for PowerPoint, for Keynote and both - called Presentation Templates. So we've ended up with 3 URLS with similar content. Same screenshots, similar description.. Example: https://www.improvepresentation.com/keynote-templates/social-media-keynote-template https://www.improvepresentation.com/powerpoint-templates/social-media-powerpoint-template https://www.improvepresentation.com/presentation-templates/social-media-presentation-template I know what you're thinking. Why not make a website with a template and give 3 download options right? But what about https://www.improvepresentation.com/powerpoint-templates/ https://www.improvepresentation.com/keynote-templates/ These are powerfull URL's in my opinion taking into account that the strongest keyword in our field is "powerpoint templates" How would you solve this "problem" or maybe there is no problem at all.
Technical SEO | | slidescamp0 -
Content relaunch without content duplication
We write great Content for blog and websites (or at least we try), especially blogs. Sometimes few of them may NOT get good responses/reach. It could be the content which is not interesting, or the title, or bad timing or even the language used. My question for the discussion is, what will you do if you find the content worth audience's attention missed it during its original launch. Is that fine to make the text and context better and relaunch it ? For example: 1. Rechristening the blog - Change Title to make it attractive
Technical SEO | | macronimous
2. Add images
3. Check spelling
4. Do necessary rewrite, spell check
5. Change the timeline by adding more recent statistics, references to recent writeups (external and internal blogs for example), change anything that seems outdated Also, change title and set rel=cannoical / 301 permanent URLs. Will the above make the blog new? Any ideas and tips to do? Basically we like to refurbish (:-)) content that didn't succeed in the past and relaunch it to try again. If we do so will there be any issues with Google bots? (I hope redirection would solve this, But still I want to make sure) Thanks,0 -
How different should content be so that it is not considered duplicate?
I am making a 2nd website for the same company. The name of the company, our services, keywords and contact info will show up several times within the text of both websites. The overall text and paragraphs will be different but some info may be repeated on both sites. Should I continue this? What precautions should I take?
Technical SEO | | savva0 -
Error Reporting
http://pro.seomoz.org/campaigns/33868/issues/18 Rel Canonical Found about 16 hours ago <dl> <dt>Tag value</dt> <dd>http://www.geeks.com/</dd> <dt>Description</dt> <dd>Using rel=canonical suggests to search engines which URL should be seen as canonical.</dd> <dd>We do have rel canonical on some of the pages this report is recommending that we "fix" this issue.</dd> <dd> Rel Canonical Found about 16 hours ago <dl> <dt>Tag value</dt> <dd>http://www.geeks.com/products.asp?cat=MBB</dd> <dt>Description</dt> <dd>Using rel=canonical suggests to search engines which URL should be seen as canonical.</dd> </dl> <a class="more expanded">Minimize</a> </dd> </dl>
Technical SEO | | JustinGeeks0 -
Understanding the actions needed from a Crawl Report
I've just joined SEOMOZ last week and have not even received my first full-crawl yet, but as you know, I do get the re-crawl report. It shows I have 50 301's and 20 rel canonical's. I'm still very confused as to what I'm supposed to fix...And, all the rel canonical's are my sites main pages, so hence I am still equally confused as to what the canonical is doing and how do I properly setup my site. I'm a technical person and can grasp most things fairly quickly, but on this the light bulb is taking a little while longer to fire-up 🙂 If my question wasn't total jibberish and you can help shed some light, I would be forever grateful. Thank you.
Technical SEO | | apmgsmith0 -
Duplicate content
Greetings! I have inherited a problem that I am not sure how to fix. The website I am working on had a 302 redirect from its original home url (with all the link juice) to a newly designed page (with no real link juice). When the 302 redirect was removed, a duplicate content problem remained, since the new page had already been indexed by google. What is the best way to handle duplicate content? Thanks!
Technical SEO | | shedontdiet0