Most Painless way of getting Duff Pages out of SE's Index
-
Hi,
I've had a few issues that have been caused by our developers on our website. Basically we have a pretty complex method of automatically generating URL's and web pages on our website, and they have stuffed up the URL's at some point and managed to get 10's of thousands of duff URL's and pages indexed by the search engines.
I've now got to get these pages out of the SE's indexes as painlessly as possible as I think they are causing a Panda penalty.
All these URL's have an addition directory level in them called "home" which should not be there, so I have:
www.mysite.com/home/page123 instead of the correct URL www.mysite.com/page123
All these are totally duff URL's with no links going to them, so I'm gaining nothing by 301 redirects, so I was wondering if there was a more painless less risky way of getting them all out the indexes (IE after the stuff up by our developers in the first place I'm wary of letting them loose on 301 redirects incase they cause another issue!)
Thanks
-
If all /home directories should be removed, and they all correlate directly to the same URL without the /home then the best means to correct the issue would be a single regex expression to redirect the URLs.
Once you add the redirect Google's index should be cleaned up in about 30 days.
-
Add robots.txt to stop Google crawling the subdirectories you don't want shown such as /home
More information is available on http://www.seomoz.org/learn-seo/robotstxt
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
After hack and remediation, thousands of URL's still appearing as 'Valid' in google search console. How to remedy?
I'm working on a site that was hacked in March 2019 and in the process, nearly 900,000 spam links were generated and indexed. After remediation of the hack in April 2019, the spammy URLs began dropping out of the index until last week, when Search Console showed around 8,000 as "Indexed, not submitted in sitemap" but listed as "Valid" in the coverage report and many of them are still hack-related URLs that are listed as being indexed in March 2019, despite the fact that clicking on them leads to a 404. As of this Saturday, the number jumped up to 18,000, but I have no way of finding out using the search console reports why the jump happened or what are the new URLs that were added, the only sort mechanism is last crawled and they don't show up there. How long can I expect it to take for these remaining urls to also be removed from the index? Is there any way to expedite the process? I've submitted a 'new' sitemap several times, which (so far) has not helped. Is there any way to see inside the new GSC view why/how the number of valid URLs in the indexed doubled over one weekend?
Intermediate & Advanced SEO | | rickyporco0 -
Is there a difference between 'Mø' and 'Mo'?
The brand name is Mø but users are searching online for Mo. Should I changed all instances of Mø to be Mo on my clients website?
Intermediate & Advanced SEO | | ben_mozbot010 -
Canonical URL's searchable in Google?
Hi - we have a newly built site using Drupal, and Drupal likes to create canonical tags on pretty much everything, from their /node/ url's to the URL Alias we've indicated. Now, when I pull a moz crawl report, I get a huge list of all the /node/ plus other URL's. That's beside the point though... Question: when I directly enter one of the /node/ url's into a google search, a result is found. Clicking on it redirects to the new URL, but should Google even be finding these non-canonical URL's?? I don't feel like I've seen this before.
Intermediate & Advanced SEO | | Jenny10 -
Category Page - Optimization the product's anchor.
Hello, Does anybody have real experience optimizing internal links in the category page? The category pages of my actual client uses a weird way to link to their own products. Instead of creating diferents links (one in the picture, one in the photo and one in the headline), they create only one huge link, using everything as anchor (picture, text, price, etc.). URL: http://www.friasneto.com.br/imoveis/apartamentos/para-alugar/campinas/ This technique can reduce the total number of links in the page, improving the strenght of the other links, but also can create a "crazy" anchor text for the product. Could I improve my results creating the standard category link (one in the picture, one in the photo and one in the headline)? Hope it's not to confuse.
Intermediate & Advanced SEO | | Nobody15569049633980 -
Getting into Google News, URL's & Sitemaps
Hello, I know that one of the 'technical requirements' to get into google news is that the URL's have unique numbers at the end, BUT, that requirement can be circumvented if you have a Google News Sitemap. I've purchased the Yoast Google News Sitemap (https://yoast.com/wordpress/plugins/news-seo/) BUT just found out that you cannot submit a google news Sitemap until you are accepted into google news. Thus, my question is that do you need to add the digits to the URL's temporarily until you get in and can submit a google news sitemap, OR, is it ok to apply without them and take care of the sitemap after you get in. If anyone has any other tips about getting into Google News that would be great! Thanks!
Intermediate & Advanced SEO | | stacksnew0 -
My blog's categories are winning over my landing pages, what to do?
Hi My blogs categories for the ecommerce site are by subject and are similar to the product landing pages. Example Domain.com/laptops that sells laptops Domain.com/blog/laptops that shows news and articles on laptops Within the blog posts the links of anchor laptop are to the store. What to do? Thanks
Intermediate & Advanced SEO | | BeytzNet1 -
Why the archive sub pages are still indexed by Google?
Why the archive sub pages are still indexed by Google? I am using the WordPress SEO by Yoast, and selected the needed option to get these pages no-index in order to avoid the duplicate content.
Intermediate & Advanced SEO | | MichaelNewman1 -
How can I change my website's content on specific pages without affecting ranking for specific keywords?
My client's website (www.nursevillage.com) content has not been touched for 4 years and we are currently ranking #1 for "per diem nursing". They do not want to make any changes to the site in fear that it might decrease our rankings. We want to try to use utilize that keyword ranking on specific pages (www.nursevillage.com/nv/content/careeroptions/perdiem.jsp ) ranking for "per diem nursing" and try redirecting traffic or placing some banners and links on that page to specific pages or other sites related to "per diem nursing" jobs so we can get nurses to apply to our new nursing jobs. Any advice on why "per diem nursing" is ranking so high for us and what we can change on the site without messing up our ranking would be greatly appreciated. Thanks
Intermediate & Advanced SEO | | ryanperea1000