Duplicate content warning: same page but different URLs?
-
Hi guys, a friend of mine has a site, and when I tested it with Moz I noticed there are 80 duplicate content warnings. For instance:
Page 1 is http://yourdigitalfile.com/signing-documents.html
The warning page is http://www.yourdigitalfile.com/signing-documents.html
Another example:
Page 1 is http://www.yourdigitalfile.com/
The same page a second time is http://yourdigitalfile.com
I noticed that the whole website is like this: nearly every page has another version at a different URL. Any ideas why the dev would do this? Also, the pages that received the warnings are not redirected to the newer pages; you can go to either one.
Thanks very much.
-
Thanks Tim. Do you have any examples of what those problems might be? With such a large catalog, managing those rel canonical tags will be difficult (I don't even know if the store allows them; it's a hosted store solution and little code customization is allowed).
-
Hi there AspenFasteners, in this instance, rather than an .htaccess rule, I would suggest applying a rel canonical tag which points to the page you deem to be the original master source.
Using robots.txt to try and hide things could potentially cause you more issues, as your categories may struggle to be indexed correctly.
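As a rough illustration of the rel canonical suggestion, the sketch below builds the tag that a duplicate page would carry in its head section. The choice of 8314.htm as the master page, the mapping, and the helper name canonical_link_tag are all assumptions made for the example; whether the hosted store lets you add such a tag is a separate question.

```python
from html import escape

# Hypothetical mapping from duplicate URLs to the page chosen as the master.
# Picking 8314.htm as the master is an assumption made for this example.
CANONICAL_FOR = {
    "http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8315.htm":
        "http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8314.htm",
}


def canonical_link_tag(page_url: str) -> str:
    """Return the <link rel="canonical"> tag for a page.

    Pages with no entry in the mapping are treated as their own master.
    """
    master = CANONICAL_FOR.get(page_url, page_url)
    return '<link rel="canonical" href="{}" />'.format(escape(master, quote=True))


if __name__ == "__main__":
    print(canonical_link_tag(
        "http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8315.htm"))
```

Both navigation paths would then output the same tag, so either URL points at one master page.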
-
We have a similar problem, but much more complex to handle as we have a massive catalog of 80,000 products and growing.
The problem occurs legitimately because our catalog is so large that we offer different navigation paths to the same content.
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8314.htm
http://www.aspenfasteners.com/Self-Tapping-Sheet-Metal-s/8315.htm
(If you look at the "You are here" breadcrumb trail, you will see the subtle differences in the navigation paths: with 8314.htm, the user went through Home > Screws; with 8315.htm, via Home > Security Fasteners > Screws.)
Our hosted web store does not give us access to .htaccess, so I am thinking of excluding the redundant navigation points via robots.txt.
My question: is there any reason NOT to do this?
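For illustration only, here is a minimal sketch of what such an exclusion would do, using Python's standard robots.txt parser. The Disallow rule and the choice of 8315.htm as the redundant path are hypothetical; the sketch shows only that a compliant crawler would skip that URL, not what happens to indexing afterwards.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block one of the two navigation paths.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /Self-Tapping-Sheet-Metal-s/8315.htm",
]

BASE = "http://www.aspenfasteners.com"


def is_crawlable(path: str) -> bool:
    """Return True if a crawler honouring ROBOTS_LINES may fetch the path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)
    return parser.can_fetch("*", BASE + path)


if __name__ == "__main__":
    for page in ("/Self-Tapping-Sheet-Metal-s/8314.htm",
                 "/Self-Tapping-Sheet-Metal-s/8315.htm"):
        print(page, is_crawlable(page))
```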
-
Oh, OK.
The only reason I was thinking it is duplicate content is the warnings I got on the Moz crawl, see below.
75 Duplicate Page Content
6 4xx Client Error
5 Duplicate Page Title
44 Missing Meta Description Tag
5 Title Element is Too Short
I have found over 80 typos, grammatical errors, punctuation errors and pieces of incorrect information, which was leading me to believe the quality of the work and their attention to detail was rather bad. That is why I thought this was a possibility.
Thanks again for your time, it's really appreciated.
-
I wouldn't say that they have created two pages. It is just that, because you have two versions of the domain and have not set a preferred version, the site is getting indexed twice. .htaccess changes are under the hood of the website, and this could simply have been an oversight.
-
Hey Tim,
Thanks for your answer. It's really weird. Other than laziness on the dev's part in not removing old or previous versions of pages, have you any idea why they would create multiple versions of the same page with different URLs? Is there any legit reason, like one version serving mobile or something?
Just wondering. Thanks for replying.
-
OK, so in this instance the only issue you have is that you need to choose your preferred start point: www or non-www.
I would add a bit of code to your .htaccess file to point to your preferred choice. I personally prefer a www. domain. Something like the below would work.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
As your site is already indexed, I would also, for the time being and as more of a safety measure, add canonicals to the pages that point to the www. version of your site.
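To illustrate what the redirect rule above does, here is a minimal Python sketch of the same mapping: a URL on the bare domain is rewritten to its www counterpart, and anything else is left alone. The function name preferred_url and its bare_host parameter are made up for this example, and the sketch keeps whatever scheme the input URL has rather than forcing http.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str, bare_host: str = "example.com") -> str:
    """Map a non-www URL to its www counterpart; leave other URLs unchanged.

    Hypothetical helper mirroring the rewrite rule: the host is compared
    case-insensitively, and path and query are carried over as they are.
    """
    parts = urlsplit(url)
    if parts.netloc.lower() == bare_host:
        parts = parts._replace(netloc="www." + bare_host)
    return urlunsplit(parts)


if __name__ == "__main__":
    print(preferred_url("http://example.com/signing-documents.html"))
    print(preferred_url("http://www.example.com/signing-documents.html"))
```

Running a crawl list through a helper like this is one way to see which pairs of URLs collapse to the same page once a preferred domain is set.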
Also, if you have a Google Search Console account, you can select your preferred domain prefix in there. This will again help with your indexation.
Hopefully I have covered most things.
Cheers
Tim