Issue with duplicate content
-
Hello guys, I have a question about duplicate content. Recently I noticed that Moz reports a lot of duplicate content on one of my sites. I'm a little confused about what I should do, because this content is created automatically. All the duplicate content comes from a subdomain of my site where we share cool images with people. This subdomain points to our Tumblr blog, where people re-blog our posts and images a lot.
I'm really confused about how all this duplicate content is created and what I should do to prevent it. Please tell me whether I need to "noindex", "nofollow" that subdomain, or whether you can suggest something better to resolve the issue.
Thank you!
-
Peter, I'm trying to PM you but I have no idea what to put in the "recipient" field. Thank you for your assistance.
-
We only crawl your own site, so we wouldn't surface a duplicate with Tumblr, unless something really, really weird is going on. This is why I need to look at the campaign - what you're describing shouldn't happen, in theory, so I have a feeling this is a pretty unusual situation.
-
Hello Peter, thank you for helping!
Peter, why do you say that neither Moz nor Webmaster Tools is going to detect the duplicates between your subdomain and Tumblr? Moz is detecting it now. Can you elaborate on that?
Thanks
-
My gut feeling is that you have 2+ issues going on here. Neither Moz nor Webmaster Tools is going to detect the duplicates between your subdomain and Tumblr. So we/they must be seeing duplicates in the subdomain itself. This sounds like an overly complex setup that is likely to be causing you some harm, but without seeing it in play, it's really hard to diagnose.
Could you PM me with the domain? I can log into Moz Analytics as you and check, but you have a few campaigns set up.
(Sorry, I originally was logged in as Marina and my reply posted - I apologize for the confusion)
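
To illustrate what "duplicates in the subdomain itself" could look like, here is a minimal sketch of grouping pages on a single host by identical body text. It is not how Moz or Webmaster Tools actually detect duplicates; the URLs, the `normalize` helper, and the exact-match hashing are all assumptions made for the example.

```python
import hashlib
import re
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Return groups of URLs whose normalized body text is identical."""
    groups = defaultdict(list)
    for url, body in pages.items():
        digest = hashlib.sha256(normalize(body).encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL count as duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Hypothetical pages on the subdomain: two URLs serve the same post.
pages = {
    "http://photos.domain.com/post/1": "Cool image of a sunset",
    "http://photos.domain.com/post/1/cool-image": "Cool  image of a SUNSET",
    "http://photos.domain.com/post/2": "A different image",
}
duplicates = find_duplicates(pages)
```

A real crawler would more likely use a similarity threshold than exact matching, but the idea is the same: several URLs on one host resolving to the same content.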
-
Hello Kane,
Thank you for trying to help me!
I added a link to three screenshots. Two of them are from my Moz account: the first shows the exponential increase in duplicate content, and the second shows the subdomain where that duplicate content is coming from. The third screenshot is from my Gmail account and shows a notification from GWT about a deep links issue. I'm not sure whether these two issues have anything in common, but I feel that they do. Please let me know what you think.
Thanks
-
Hi Marina, a few questions for you:
Can you possibly post screenshots of the Google Webmaster Tools warning that you're seeing?
Does your website have an app associated with it?
Assuming your Tumblr content isn't reposted somewhere on your main domain, it doesn't seem like a duplicate content issue to me. The GWT message seems to be related to deep linking for a mobile app, and I can't imagine why you'd get that if you don't have an app.
-
Thank you for your help!
Answering your questions:
My subdomain looks like this: photos.domain.com. It is pointed to the Tumblr platform (our blog on Tumblr) because it is a very image-friendly platform and they host all the images.
-
We use this subdomain only for posting images. We don't use this content on our root domain at all.
I'm really confused about what Android app they are talking about. Do they consider Tumblr an Android app?
Thanks
-
-
Hi there
Do you have a web development team or a web developer? What I would do is pass this notification over to them, along with your notifications from Moz, and see if they have a means to correct these issues. I am assuming that Google passed along resources in their notification; I would look into those and see what your options are.
If you do not have a web development team, I would check out the Recommended List to find a company that does web development as well as SEO that can assist in this. What it sounds like to me is that you are linking off to an app with a subdomain and it's creating a different user experience than the one generated by your website.
If I were you, I would find a suitable blogging platform that you can bring your sharing capabilities onto, and create a consistent and seamless experience for your users. Two questions:
- Is your subdomain blog.domain.com? Or is it named differently?
- Do you have your blog posts on your website and copied word for word on your subdomain?
Here are a couple more resources to review with your team:
- App Indexing for Google Search Overview: What is App Indexing?
- App Indexing for Google Search Technical Details: Enabling Deep Links for App Content
Let me know if any of this helps or if you have any more comments - good luck!
-
Thank you for replies!
I'm fairly well aware of the duplicate content issue, but I have never faced this particular problem. As Lesley said, I don't have access to the head section of each post, because those posts are practically not on my property but on Tumblr's. And I have no idea how the duplicate content is created. I assume that it is caused by Tumblr's feature that allows users to re-blog my blog posts.
Moreover, I've just received a warning from Google Webmaster Tools specifically pertaining to this subdomain. I'm really confused. Please help.
Fix app deep links to ....com/ that don't match the content of the web pages
Dear webmaster
When indexing the deep links to your app, we detected that the content of 1 app pages doesn't match the content of the corresponding web page. This is a bad experience for your users because they won't find what they were looking for on your app page. We won't show deep links for these app pages in our smartphone search results. This is an important issue that needs your immediate attention.
Take these actions to fix this issue:
- Check the Android Apps section of the Crawl Errors report in Webmaster Tools to find examples of app URIs whose content doesn't match their corresponding web page.
- Use these examples to debug the issue:
- Open the corresponding web page to have it ready.
- Use Android debug bridge to open the app page.
- Make sure the content on both your web page and your app page is the same.
- If necessary, change the content on your app (or change your sitemap / or rel=alternate element associations to make sure the each app page is connected to the right web page).
- If necessary, change your robots.txt file to allow crawling of relevant resources. This mismatch might also be due to the fact that some of the resources associated with the app page are disallowed from crawling through robots.txt.
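
The robots.txt step in the notice above can be checked offline. This is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt content, the user agent string, and the URLs are made up for illustration and are not taken from the actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the real file on the site may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


blocked = is_crawlable(
    ROBOTS_TXT, "Googlebot", "http://photos.domain.com/private/a.jpg"
)
allowed = is_crawlable(
    ROBOTS_TXT, "Googlebot", "http://photos.domain.com/post/1"
)
```

Running the same check against the site's real robots.txt for each resource a page loads would show whether any of them are disallowed.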
-
I am not very experienced with Tumblr personally, but I am pretty sure it cannot be done because they don't give you access to what you would need. You would need access to the head section of each page so that you could put the canonical tag in.
One thing that MIGHT work, though it would be tricky and I would consult with someone else to see what they think: if the URLs are the same minus the subdomain, you could get Apache to write a canonical into the actual HTTP response header and send it over. I do not know if Google would respect this, so I would ask others' advice.
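
For reference, a rough sketch of the two forms a canonical hint can take: a link element in the page head, and a `Link` HTTP response header. The host names and the helper functions are hypothetical, and this only builds the strings. Whether the header can actually be sent depends on the requests passing through a server you control, which may not be the case when the subdomain points straight at Tumblr.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str, canonical_host: str) -> str:
    """Swap the host for the canonical one, keeping scheme, path and query."""
    parts = urlsplit(url)
    # The fragment is dropped because it is not part of the canonical address.
    return urlunsplit((parts.scheme, canonical_host, parts.path, parts.query, ""))


def canonical_link_tag(url: str) -> str:
    """Link element that would go inside the page's head section."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


def canonical_link_header(url: str) -> str:
    """Equivalent hint expressed as an HTTP response header line."""
    return f'Link: <{url}>; rel="canonical"'


# Hypothetical example: the same path exists on the main domain.
target = canonical_url("http://photos.domain.com/post/1?x=2#top", "www.domain.com")
tag = canonical_link_tag(target)
header = canonical_link_header(target)
```

In Apache the header form would typically be emitted by a header directive rather than by application code; the strings above only show the shape of the output.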
-
Hi there
The only time you should noindex a site is if it's not supposed to be seen by search engines - if that's the case, then noindex it.
However, if this content is supposed to be seen by search engines, I would make use of your canonical tags on the subdomain and point it to the original content on the domain.
I would also think of a way to build a community with your website - it sounds like you have opportunities to do so and are getting some attention from your audience and how they are sharing your posts and information.
Also, look into sitemap opportunities with your images and how you can help crawlers understand the information on your website.
You can read more about duplicate content here.
Hope this helps a bit! Let me know if you have any questions or comments!
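
As a sketch of the image sitemap suggestion above, the snippet below builds a small sitemap with image entries using Python's standard library. The page and image URLs are placeholders, and the `build_image_sitemap` helper is hypothetical; the two XML namespaces are the ones used by the sitemap protocol and Google's image sitemap extension.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages: dict[str, list[str]]) -> str:
    """Build sitemap XML mapping each page URL to the images shown on it."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image_el = ET.SubElement(url_el, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image_el, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs; real posts and image locations would go here.
sitemap_xml = build_image_sitemap({
    "http://photos.domain.com/post/1": [
        "http://photos.domain.com/img/a.jpg",
        "http://photos.domain.com/img/b.jpg",
    ],
})
```

The resulting file would then be submitted through Webmaster Tools or referenced from robots.txt, assuming the platform hosting the subdomain allows that.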
-
Hello Lesley,
Thank you for the response! Well, the subdomain is pointing to our Tumblr blog. I have access to both the main domain and Tumblr. Where should I add the canonical? Thanks
-
Do you have control over the subdomain so that you can add a canonical tag to it and point it to the original content?