Best solution to get mass URl's out the SE's index
-
Hi,
I've got an issue where our web developers have made a mistake on our website by messing up some URL's . Because our site works dynamically IE the URL's generated on a page are relevant to the current URL it ment the problem URL linked out to more problem URL's - effectively replicating an entire website directory under problem URL's - this has caused tens of thousands of URL's in SE's indexes which shouldn't be there.
So say for example the problem URL's are like
www.mysite.com/incorrect-directory/folder1/page1/
It seems I can correct this by doing the following:
1/. Use Robots.txt to disallow access to /incorrect-directory/*
2/. 301 the urls like this:
www.mysite.com/incorrect-directory/folder1/page1/
301 to:
www.mysite.com/correct-directory/folder1/page1/3/. 301 URL's to the root correct directory like this:
www.mysite.com/incorrect-directory/folder1/page1/
www.mysite.com/incorrect-directory/folder1/page2/
www.mysite.com/incorrect-directory/folder2/301 to:
www.mysite.com/correct-directory/Which method do you think is the best solution? - I doubt there is any link juice benifit from 301'ing URL's as there shouldn't be any external links pointing to the wrong URL's.
-
Cheers Ryan.
-
Option 2 is preferred.
You definitely do not want to use the robots.txt method. In general, avoid using robots.txt unless there are no other options.
Whenever your site's visitors have a link to an invalid URL, 301 them to the correct URL if you have the content they are seeking. It creates the best user experience and the best SEO results.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What will SEO be like in the 2020's?
Hey guys, I would love to hear your thoughts on how you think SEO will change in the 2020's. The 2010's saw some pretty cool stuff like Panda, Penguin, penalties for non-mobile-friendly, non-secure and slow loading sites. What will be more or less important for SEO's in the 2020's than today? How will machine learning and AI change SEO?
Intermediate & Advanced SEO | | GreenHatWeb0 -
Any way to force a URL out of Google index?
As far as I know, there is no way to truly FORCE a URL to be removed from Google's index. We have a page that is being stubborn. Even after it was 301 redirected to an internal secure page months ago and a noindex tag was placed on it in the backend, it still remains in the Google index. I also submitted a request through the remove outdated content tool https://www.google.com/webmasters/tools/removals and it said the content has been removed. My understanding though is that this only updates the cache to be consistent with the current index. So if it's still in the index, this will not remove it. Just asking for confirmation - is there truly any way to force a URL out of the index? Or to even suggest more strongly that it be removed? It's the first listing in this search https://www.google.com/search?q=hcahranswers&rlz=1C1GGRV_enUS753US755&oq=hcahr&aqs=chrome.0.69i59j69i57j69i60j0l3.1700j0j8&sourceid=chrome&ie=UTF-8
Intermediate & Advanced SEO | | MJTrevens0 -
Can't support IE 7,8,9, 10\. Can we redirect them to another page that's optimized for those browsers so that we can have our site work on modern browers while still providing a destination of IE browsers?
Hi, Our site can't support IE 7,8,9, 10. Can we redirect them to another page that's optimized for those browsers so that we can have our site work on modern broswers while still providing a destination of IE browsers? Would their be an SEO penalty? Thanks!
Intermediate & Advanced SEO | | dspete0 -
Fetch as Google -- Does not result in pages getting indexed
I run a exotic pet website which currently has several types of species of reptiles. It has done well in SERP for the first couple of types of reptiles, but I am continuing to add new species and for each of these comes the task of getting ranked and I need to figure out the best process. We just released our 4th species, "reticulated pythons", about 2 weeks ago, and I made these pages public and in Webmaster tools did a "Fetch as Google" and index page and child pages for this page: http://www.morphmarket.com/c/reptiles/pythons/reticulated-pythons/index While Google immediately indexed the index page, it did not really index the couple of dozen pages linked from this page despite me checking the option to crawl child pages. I know this by two ways: first, in Google Webmaster Tools, if I look at Search Analytics and Pages filtered by "retic", there are only 2 listed. This at least tells me it's not showing these pages to users. More directly though, if I look at Google search for "site:morphmarket.com/c/reptiles/pythons/reticulated-pythons" there are only 7 pages indexed. More details -- I've tested at least one of these URLs with the robot checker and they are not blocked. The canonical values look right. I have not monkeyed really with Crawl URL Parameters. I do NOT have these pages listed in my sitemap, but in my experience Google didn't care a lot about that -- I previously had about 100 pages there and google didn't index some of them for more than 1 year. Google has indexed "105k" pages from my site so it is very happy to do so, apparently just not the ones I want (this large value is due to permutations of search parameters, something I think I've since improved with canonical, robots, etc). I may have some nofollow links to the same URLs but NOT on this page, so assuming nofollow has only local effects, this shouldn't matter. Any advice on what could be going wrong here. I really want Google to index the top couple of links on this page (home, index, stores, calculator) as well as the couple dozen gene/tag links below.
Intermediate & Advanced SEO | | jplehmann0 -
Development site is live (and has indexed) alongside live site - what's the best course of action?
Hello Mozzers, I am undertaking a site audit and have just noticed that the developer has left the development site up and it has indexed. They 301d from pages on old site to equivalent pages on new site but seem to have allowed the development site to index, and they haven't switched off the development site. So would the best option be to redirect the development site pages to the homepage of the new site (there is no PR on dev site and there are no links incoming to dev site, so nothing much to lose...)? Or should I request equivalent to equivalent page redirection? Alternatively I can simply ask for the dev site to be switched off and the URLs removed via WMT, I guess... Thanks in advance for your help! 🙂
Intermediate & Advanced SEO | | McTaggart1 -
Don't affiliate programs have an unfair impact on a company's ability to compete with bigger businesses?
So many coupon sites and other websites these days will only link to your website if you have a relationship with Commission Junction or one of the other large affiliate networks. It seems to me that links on these sites are really unfair as they allow businesses with deep pockets to acquire links unequitably. To me it seems like these are "paid links", as the average website cannot afford the cost of running an affiliate program. Even worse, the only reason why these businesses are earning a link is because they have an affiliate program; that to me should violate some sort of Google rule about types and values of links. The existence of an affiliate program as the only reason for earning a link is preposterous. It's just as bad as paid link directories that have no editorial standards. I realize the affiliate links are wrapped in CJ's code, so that mush diminish the value of the link, but there is still tons of good value in having the brand linked to from these high authority sites.
Intermediate & Advanced SEO | | williamelward0 -
Help my site it's not being indexed
Hello... We have a client, that had arround 17K visits a month... Last september he hired a company to do a redesign of his website....They needed to create a copy of the site on a different subdomain on another root domain... so I told them to block that content in order to not affect my production site, cause it was going to be an exact replica of the content but different design.... The developmet team did it wrong and blocked the production site (using robots.txt), so my site lost all it's organica traffic, which was 85-90% of the total traffic and now only get a couple of hundreds visits a month... First I thought we had been somehow penalized, however when I the other site recieving new traffic and being indexed i realized so I switched the robots.txt and created 301 redirect from the subdomain to the production site. After resending sitemaps, links to google+ and many things I can't get google to reindex my site.... when i do a site:domain.com search in google I only get 3 results. Its been now almost 2 month and honestly dont know what to do.... Any help would be greatly appreciated Thanks Dan
Intermediate & Advanced SEO | | daniel.alvarez0 -
Silo Architecture - need an expert's advice
I understand the concept of silo architecture. What I don't understand is how to build the site navigation. I see experts talking about silos, but their sites have pervasive top level navigation. In theory, your top level nav breaks your silos. If I have 20 pages of supporting content all linked to my silo page, and the top nav is on the supporting content pages, then those pages all link to the pages in the top nav - silo broken, and link juice diluted. it would seem to me that the only way to build a true silo is to strip out all of the navigation on a supporting page, and only have it link to: 1. The silo landing page 2. Other supporting pages in the silo. is this what Bruce Clay does? I've seen Rand's lectures on silos as well. Is this what he is doing? I recently saw a video by the Network Empire team, and they'd also have a pervasive nav. Can someone please explain this?
Intermediate & Advanced SEO | | CsmBill0