How to remove hundreds of duplicate pages
-
Hi - while i was checking duplicate links, am finding hundreds of duplicates pages :-
-
having undefined after domain name and before sub page url
-
having /%5C%22/ after domain name and before the sub page url
-
Due to Pagination limits
Its a joomla site - http://www.mycarhelpline.com
Any suggestions - shall we use:-
-
301 redirect
-
leave these as standdstill
-
and what to do of pagination pages (shall we create a separate title tag n meta description of every pagination page as unique one)
thanks
-
-
Okay, I took a look at the plugin Ben recommended, and took another look at the SH404SEF one. The free one Ben recommended (http://extensions.joomla.org/extensions/site-management/sef/1063) looks like it can help out with some duplicate content - but what I recommend doing is getting the SH404SEF here http://anything-digital.com/sh404sef/features.html because it allows for setting up canonical tags and also gives you the option to add the rel=next feature to your paginated pages, which is one of your problem areas.
One thing I noticed though is that it specifically states it "automatically adds canonical tags to non-html pages" - so that means it will apply it automatically to Joomla's defaul pdf view, etc. While this is helpful, it may not solve the full issue of your duplicate pages with the undefined and "/%5C%22/" issue.
It does however state that it "removes duplicate URLs" - how it identifies and removes these, I am not sure. You may want to try it out because it is useful for other optimization tasks - or contact the owner for more information.
If the tool doesn't recognize and remove the duplicate pages caused by /undefined/ and "/%5C%22/" then you should disallow crawling of these in your robots.txt file. While you are in your robots.txt file you should remove the /images/ because you want those to be crawled - Joomla adds that in by default.
Because a lot of these pages have already been crawled, you should do a 301 on the duplicate pages to their matching page. This sounds like it will be a long process - this may be aided by the sh404sef plugin - not sure.
I just want to also add that I am in no way affiliated with any of these plugins.
-
The only way to solve the duplication error you are getting is to make the URL's distinct. Googlebot comes to your site and looks at the URL's and if they are not distinct it may not index them very well. I understand your site is showing up fine in the SERP's so this may be one of those items you place on a lower priority until later.
I think R.May knows Joomla so I'll refer to him on how to accomplish this but it may be worth it to make the adjustment. You may find the end result of making your page URL more distinct will actually increase your current SERPs. Just a thought.
Other than that. If your site isn't hurting and the only thing you are concerned about is the report in SEOmoz then I would move on and just make a mental note of it for later.
-
Hi ben - changing url is not well required as the site is getting good serp, however - the duplicacy issue to saveguard us from any future issue - is what we seek for
-
Hi - thanks for replying
-
For the dynamic url - Yes - at the initial start - it was missed and as on its not reqd somehow - as the pages are getting indexed well and good in SERP
-
For pagination - Where we needs this is like in our used car section, discount section& news section where multiple pages are created. shall we create separate title & meta description for every pagination page. is it ideally reqd ?
http://www.mycarhelpline.com/index.php?option=com_usedcar&view=category&Itemid=3
- 'undefined' & /%5C%22/ is coming as per report of SEOmoz and is almost on every page of site (except of home page) with the dynamic url after domain name are preceded with these 2 strings as per moz report
how to get this corrected - want to be preventive from this duplicacy n avoid getting a hit in future even if its going well now -
-
-
I'm not a Joomla expert but to make your URL's search engine friendly you are going to need to add an extension like this. That will allow you to make more distinct URLs that will not be considered "duplicate" anymore.
-
Joomla has soo many dup content issues, you have to know Joomla really well to avoid most of them. The biggest issue is you didn't enable the SEF URLs from the start and left the default index.php?option=com on most of them, which stuffs your URLs full of ugly parameters.
You can still enable this in your global options and with a quick edit to htaccess - but it will change all of your current URLs and you will need to 301 all of them, so that isn't a great option unless you are really suffering - and depending on if you are using J 1.6 or under, this is a time consuming nasty process. Also this is unlikely to get rid of any existing duplicate pages - but may make dealing with them and finding them easier.
I don't see the specific examples you posted though, where are you seeing "undefined" and "%5C%22/ " ?
You should implement rel=canonical on the correct version of each page. I recommend SH404SEF which is a Joomla plugin and makes this process easier - but it isn't free. I don't know of a good free plugin that does this, and Joomla's templates make doing this manually difficult.
Looking at it quickly, I also didn't notice any articles that were paginated, but you should try to follow the rel="next" and rel="prev" for paginated pages. This is likely something you will have to edit your Joomla core files to do.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
External 404 pages
A client of mine is linking to a third-party vendor from their main site. The page being linked to loads with a Page Not Found error and then replaces some application content once the Javascript kicks in. This process is not visible to users (the application loads fine for front-end users) but it is being picked up as a 404 error in broken link reports. This link is part of the site skin so it's on every page. Outside of the annoyance of having lots of 404 errors being flagged in a broken link report, does this cause any actual issue? Eg, do search enginges see that my client is linking to something that is a 404 error, and does that cause them any harm?
Intermediate & Advanced SEO | | mkleamy0 -
Will Reducing Number of Low Page Authority Page Increase Domain Authority?
Our commercial real estate site (www.nyc-officespace-leader.com) contains about 800 URLs. Since 2012 the domain authority has dropped from 35 to about 20. Ranking and traffic dropped significantly since then. The site has about 791 URLs. Many are set to noindex. A large percentage of these pages have a Moz page authority of only "1". It is puzzling that some pages that have similar content to "1" page rank pages rank much better, in some cases "15". If we remove or consolidate the poorly ranked pages will the overall page authority and ranking of the site improve? Would taking the following steps help?: 1. Remove or consolidate poorly ranking unnecessary URLs?
Intermediate & Advanced SEO | | Kingalan1
2. Update content on poorly ranking URLs that are important?
3. Create internal text links (as opposed to links from menus) to critical pages? A MOZ crawl of our site's URLs is visible at the link below. I am wondering if the structure of the site is just not optimized for ranking and what can be done to improve it. THANKS. https://www.dropbox.com/s/oqchfqveelm1q11/CRAWL www.nyc-officespace-leader.com (1).csv?dl=0 Thanks,
Alan0 -
I've got duplicate pages. For example, blog/page/2 is the same as author/admin/page/2\. Is this something I should just ignore, or should I create the author/admin/page2 and then 301 redirect?
I'm going through the crawl report and it says I've got duplicate pages. For example, blog/page/2 is the same as author/admin/page/2/ Now, the author/admin/page/2 I can't even find in WordPress, but it is the same thing as blog/page/2 nonetheless. Is this something I should just ignore, or should I create the author/admin/page2 and then 301 redirect it to blog/page/2?
Intermediate & Advanced SEO | | shift-inc0 -
Our client's web property recently switched over to secure pages (https) however there non secure pages (http) are still being indexed in Google. Should we request in GWMT to have the non secure pages deindexed?
Our client recently switched over to https via new SSL. They have also implemented rel canonicals for most of their internal webpages (that point to the https). However many of their non secure webpages are still being indexed by Google. We have access to their GWMT for both the secure and non secure pages.
Intermediate & Advanced SEO | | RosemaryB
Should we just let Google figure out what to do with the non secure pages? We would like to setup 301 redirects from the old non secure pages to the new secure pages, but were not sure if this is going to happen. We thought about requesting in GWMT for Google to remove the non secure pages. However we felt this was pretty drastic. Any recommendations would be much appreciated.0 -
How to associate content on one page to another page
Hi all, I would like associate content on "Page A" with "Page B". The content is not the same, but we want to tell Google it should be associated. Is there an easy way to do this?
Intermediate & Advanced SEO | | Viewpoints1 -
An affiliate website uses datafeeds and around 65.000 products are deleted in the new feeds. What are the best practises to do with the product pages? 404 ALL pages, 301 Redirect to the upper catagory?
Note: All product pages are on INDEX FOLLOW. Right now this is happening with the deleted productpages: 1. When a product is removed from the new datafeed the pages stay online and are showing simliar products for 3 months. The productpages are removed from the categorie pages but not from the sitemap! 2. Pages receiving more than 3 hits after the first 3 months keep on existing and also in the sitemaps. These pages are not shown in the categories. 3. Pages from deleted datafeeds that receive 2 hits or less, are getting a 301 redirect to the upper categorie for again 3 months 4. Afther the last 3 months all 301 redirects are getting a customized 404 page with similar products. Any suggestions of Comments about this structure? 🙂 Issues to think about:
Intermediate & Advanced SEO | | Zanox
- The amount of 404 pages Google is warning about in GWT
- Right now all productpages are indexed
- Use as much value as possible in the right way from all pages
- Usability for the visitor Extra info about the near future: Beceause of the duplicate content issue with datafeeds we are going to put all product pages on NOINDEX, FOLLOW and focus only on category and subcategory pages.0 -
Whats the best way to remove search indexed pages on magento?
A new client ( aqmp.com.br/ )call me yestarday and she told me since they moved on magento they droped down more than US$ 20.000 in sales revenue ( monthly)... I´ve just checked the webmaster tool and I´ve just discovered the number of crawled pages went from 3.260 to 75.000 since magento started... magento is creating lots of pages with queries like search and filters. Example: http://aqmp.com.br/acessorios/lencos.html http://aqmp.com.br/acessorios/lencos.html?mode=grid http://aqmp.com.br/acessorios/lencos.html?dir=desc&order=name Add a instruction on robots.txt is the best way to remove unnecessary pages of the search engine?
Intermediate & Advanced SEO | | SeoMartin10 -
Linking to local pages on main page - keyword self-cannibalization issue?
Hi guys, Our website has this landing page: www.example.com/service1/ Is this considered keyword self-cannibalization if on the above page we link to local pages such as: www.example.com/service1-in-chicago/ www.example.com/service1-in-newyork/ www.example.com/service1-in-texas/ Many thanks David
Intermediate & Advanced SEO | | sssrpm0