How to remove duplicate content, which is still indexed, but not linked to anymore?
-
Dear community
A bug in the tool, which we use to create search-engine-friendly URLs (sh404sef) changed our whole URL-structure overnight, and we only noticed after Google already indexed the page.
Now, we have a massive duplicate content issue, causing a harsh drop in rankings. Webmaster Tools shows over 1,000 duplicate title tags, so I don't think, Google understands what is going on.
<code>Right URL: abc.com/price/sharp-ah-l13-12000-btu.html Wrong URL: abc.com/item/sharp-l-series-ahl13-12000-btu.html (created by mistake)</code>
After that, we ...
- Changed back all URLs to the "Right URLs"
- Set up a 301-redirect for all "Wrong URLs" a few days later
Now, still a massive amount of pages is in the index twice. As we do not link internally to the "Wrong URLs" anymore, I am not sure, if Google will re-crawl them very soon.
What can we do to solve this issue and tell Google, that all the "Wrong URLs" now redirect to the "Right URLs"?
Best, David
-
Yes David your link is very helpful..
-
Found the perfect answer:
http://www.seomoz.org/blog/uncrawled-301s-a-quick-fix-for-when-relaunches-go-too-well
-
Thanks a lot, Sanket.
Do you think, it might help, to submit a sitemap, which also contains the "Wrong URLs", so we can trigger a recrawl of those pages? Maybe then Google will notice that there is a 301-redirect.
-
Hi Davin
The best thing in this situation is to wait for sometime more.. Because you just done the redirection of wrong url's to right url's so it will take some time. In webmaster tool you will see the changes later because the data in webmaster tool are updates on 15 days or monthly basis, depends on the website so you need to wait. The url that was 301 redirected should not appear in the search results so the problem of duplication will be sorted out shortly so dont worry. Also you can verify the redirection are done correctly or not from this redirect checker tool http://www.internetofficer.com/seo-tool/redirect-check/.
I have one suggestion to crawl your website pages fastly : Maximize the "Crawl Rate" under Settings option of webmaster tool.
Hope my response would help you. If need any help feel free to ask.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Duplicate page content on numerical blog pages?
Hello everyone, I'm still relatively new at SEO and am still trying my best to learn. However, I have this persistent issue. My site is on WordPress and all of my blog pages e.g page one, page two etc are all coming up as duplicate content. Here are some URL examples of what I mean: http://3mil.co.uk/insights-web-design-blog/page/3/ http://3mil.co.uk/insights-web-design-blog/page/4/ Does anyone have any ideas? I have already no indexed categories and tags so it is not them. Any help would be appreciated. Thanks.
Intermediate & Advanced SEO | | 3mil0 -
Duplicated Content with Index.php
Good Afternoon, My website uses Joomla CMS and has the htaccess rewrite code enabled to ensure the use of search engine friendly URLs (SEF's). While browsing the crawl diagnostics I have found that Moz considers the /index.php URL a duplicate to our root. I will always under the impression that the htaccess rewrite took care of that issue and obviously I would like to address it. I attempted to create a 301 redirect from the index.php URL to the root but ran into an issue when attempting to login to the admin portion of the website as the redirect sent me back to the homepage. I was curious if anyone had advice for handling the index.php duplication issue, specifically with Joomla. Additionally, I have confirmed that in Google Webmasters, under URL parameters, the index.php parameter is set as 'Representative URL'.
Intermediate & Advanced SEO | | BrandonEML0 -
Google isn't seeing the content but it is still indexing the webpage
When I fetch my website page using GWT this is what I receive. HTTP/1.1 301 Moved Permanently
Intermediate & Advanced SEO | | jacobfy
X-Pantheon-Styx-Hostname: styx1560bba9.chios.panth.io
server: nginx
content-type: text/html
location: https://www.inscopix.com/
x-pantheon-endpoint: 4ac0249e-9a7a-4fd6-81fc-a7170812c4d6
Cache-Control: public, max-age=86400
Content-Length: 0
Accept-Ranges: bytes
Date: Fri, 14 Mar 2014 16:29:38 GMT
X-Varnish: 2640682369 2640432361
Age: 326
Via: 1.1 varnish
Connection: keep-alive What I used to get is this: HTTP/1.1 200 OK
Date: Thu, 11 Apr 2013 16:00:24 GMT
Server: Apache/2.2.23 (Amazon)
X-Powered-By: PHP/5.3.18
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified: Thu, 11 Apr 2013 16:00:24 +0000
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
ETag: "1365696024"
Content-Language: en
Link: ; rel="canonical",; rel="shortlink"
X-Generator: Drupal 7 (http://drupal.org)
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8 xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:dc="http://purl.org/dc/terms/"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:og="http://ogp.me/ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:sioc="http://rdfs.org/sioc/ns#"
xmlns:sioct="http://rdfs.org/sioc/types#"
xmlns:skos="http://www.w3.org/2004/02/skos/core#"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"> <title>Inscopix | In vivo rodent brain imaging</title>0 -
Why are bit.ly links being indexed and ranked by Google?
I did a quick search for "site:bit.ly" and it returns more than 10 million results. Given that bit.ly links are 301 redirects, why are they being indexed in Google and ranked according to their destination? I'm working on a similar project to bit.ly and I want to make sure I don't run into the same problem.
Intermediate & Advanced SEO | | JDatSB1 -
Duplicate Content For E-commerce
On our E-commerce site, we have multiple stores. Products are shown on our multiple stores which has created a duplicate content problem. Basically if we list a product say a shoe,that listing will show up on our multiple stores I assumed the solution would be to redirect the pages, use non follow tags or to use the rel=canonical tag. Are there any other options for me to use. I think my best bet is to use a mixture of 301 redirects and canonical tags. What do you recommend. I have 5000+ pages of duplicate content so the problem is big. Thanks in advance for your help!
Intermediate & Advanced SEO | | pinksgreens0 -
Duplicate content
I run about 10 sites and most of them seemed to fall foul of the penguin update and even though I have never sought inorganic links I have been frantically searching for a link based answer since April. However since asking a question here I have been pointed in another direction by one of your contributors. It seems At least 6 of my sites have duplicate content issues. If you search Google for "We have selected nearly 200 pictures of short haircuts and hair styles in 16 galleries" which is the first bit of text from the site short-hairstyles.com about 30000 results appear. I don't know where they're from nor why anyone would want to do this. I presume its automated since there is so much of it. I have decided to redo the content. So I guess (hope) at some point in the future the duplicate nature will be flushed from Google's index? But how do I prevent it happening again? It's impractical to redo the content every month or so. For example if you search for "This facility is written in Flash® to use it you need to have Flash® installed." from another of my sites that I coincidently uploaded a new page to a couple of days ago, only the duplicate content shows up not my original site. So whoever is doing this is finding new stuff on my site and getting it indexed on google before even google sees it on my site! Thanks, Ian
Intermediate & Advanced SEO | | jwdl0 -
Duplicate Content - Panda Question
Question: Will duplicate informational content at the bottom of indexed pages violate the panda update? **Total Page Ratio: ** 1/50 of total pages will have duplicate content at the bottom off the page. For example...on 20 pages in 50 different instances there would be common information on the bottom of a page. (On a total of 1000 pages). Basically I just wanted to add informational data to help clients get a broader perspective on making a decision regarding "specific and unique" information that will be at the top of the page. Content ratio per page? : What percentage of duplicate content is allowed per page before you are dinged or penalized. Thank you, Utah Tiger
Intermediate & Advanced SEO | | Boodreaux0 -
Should I remove paid links?
I recently added about 20 paid links from directories but have since seen a 10% drop in traffic. I did also delete about 1000 pages of content that had no inbound links and were duplicated on other sites on the web and replaced the content with new content supplied by a client but still duplicated on other sites on the web, old URLs no longer valid or linked to, new content on new URLs. Assuming the drop in traffic had nothing to do with the content change mentioned above, should I remove the paid links in an attempt to recover? I don't think the old content was bringing in much traffic as it appeared elsewhere on more authoritive sites than mine.
Intermediate & Advanced SEO | | Mulith0