Removing Dynamic "noindex" URL's from Index
-
6 months ago my clients site was overhauled and the user generated searches had an index tag on them. I switched that to noindex but didn't get it fast enough to avoid being 100's of pages indexed in Google.
It's been months since switching to the noindex tag and the pages are still indexed. What would you recommend? Google crawls my site daily - but never the pages that I want removed from the index.
I am trying to avoid submitting hundreds of these dynamic URL's to the removal tool in webmaster tools. Suggestions?
-
Hooray! Usually, I just give my advice and then run away, so it's always nice to hear I was actually right about something Seriously, glad you got it sorted out.
-
Just a follow up to your suggestion.
I created sitemaps for the pages I want removed using the google spreadsheet importXML functions, which saved a lot of time.
It took a couple weeks but all of the pages, and similar pages, have successfully been removed from the index. Even the similar pages I didn't get a chance to put in the sitemap yet (importXML limits the results to 100).
Your suggestion worked!
-
I can't 404 dynamic search pages.
-
There are a mix of search pages and old mobile pages.
The search pages I've been testing out having the canonical point to the default search page. I've seen a slight drop in these pages - but I guess I just have to be more patient.
For the other pages the path is no longer there like you were mentioning. I like the idea of setting up the XML sitemap, I never even thought of making a bad/indexed page sitemap. I will give that a shot! Thankfully this will be a quick job with the importXml function in google spreadsheets! Great tip, hopefully it'll work.
-
Is there a crawl path to them currently? One issue I see a lot is that a bunch of pages get indexed, the path is found and cut off, NOINDEX (canonical, 301, etc.) is added, but then the pages never get re-crawled. Since they don't get recrawled, the page-level directive never gets honored.
If there's a URL parameter involved, you could use parameter-handling in GWT - it's not a perfect solution, but it sometimes seems to work without a re-crawl.
The other option would be to create a new XML sitemap with all of the bad/indexed URLs. This may push Google to re-crawl them and then see the tags to deindex. It's a bit safer than re-opening the crawl paths.
If they are being crawled and Google is just ignoring the NOINDEX for some reason, I'd try to 301 or canonical those pages to a primary search page, if that's feasible (probably canonical, since you don't want the users to 301). Sometimes, if a signal isn't working for that long, you just have to shake Google and try a different signal. Even following their exact recommendations, it rarely works as planned at large scale.
-
Don't use GWMT's removal tool to remove URLs which should not be in the index (unless those expose sensitive information). Best practise is to exclude them in robots.txt and to also ensure that the pages either 404 or have a noindex,noarchive tag.
-
Change the site structure and let the pages 404, Google will deindex them if they are not being linked to.
-
You could try adding the pages you want to remove to your robots.txt file. Since you're not linking to them, and it's very unlikely that Googlebot will index those pages naturally now, this might be a better way of telling it which pages to explicitly not index.
I'm not really sure how quickly this will trigger Google to remove those pages from the index - but they do reference robots.txt on the actual "Remove URLs" page of WMT ---> "Use **robots.txt **to specify how search engines should crawl your site, or request **removal **of URLs from Google's search results ..."
For that technique, you'd want to add something like this for all of the pages you want to remove:
Disallow: /oldpage1toremove.php
That should work. If it doesn't, then I would probably just submit the requests through the "Remove URLs" tool.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Need help in de-indexing URL parameters in my website.
Hi, Need some help.
Intermediate & Advanced SEO | | ImranZafar
So this is my website _https://www.memeraki.com/ _
If you hover over any of the products, there's a quick view option..that opens up a popup window of that product
That popup is triggered by this URL. _https://www.memeraki.com/products/never-alone?view=quick _
In the URL you can see the parameters "view=quick" which is infact responsible for the pop-up. The problem is that the google and even your Moz crawler is picking up this URL as a separate webpage, hence, resulting in crawl issues, like missing tags.
I've already used the webmaster tools to block the "view" parameter URLs in my website from indexing but it's not fixing the issue
Can someone please provide some insights as to how I can fix this?0 -
What's the best way to noindex pages but still keep backlinks equity?
Hello everyone, Maybe it is a stupid question, but I ask to the experts... What's the best way to noindex pages but still keep backlinks equity from those noindexed pages? For example, let's say I have many pages that look similar to a "main" page which I solely want to appear on Google, so I want to noindex all pages with the exception of that "main" page... but, what if I also want to transfer any possible link equity present on the noindexed pages to the main page? The only solution I have thought is to add a canonical tag pointing to the main page on those noindexed pages... but will that work or cause wreak havoc in some way?
Intermediate & Advanced SEO | | fablau3 -
Google's 'related:' operator
I have a quick question about Google's 'related:' operator when viewing search results. Is there reason why a website doesn't produce related/similar sites? For example, if I use the related: operator for my site, no results appear.
Intermediate & Advanced SEO | | ecomteam_handiramp.com
https://www.google.com/#q=related:www.handiramp.com The site has been around since 1998. The site also has two good relevant DMOZ inbound links. Any suggestions on why this is and any way to fix it? Thank you.0 -
Our client's web property recently switched over to secure pages (https) however there non secure pages (http) are still being indexed in Google. Should we request in GWMT to have the non secure pages deindexed?
Our client recently switched over to https via new SSL. They have also implemented rel canonicals for most of their internal webpages (that point to the https). However many of their non secure webpages are still being indexed by Google. We have access to their GWMT for both the secure and non secure pages.
Intermediate & Advanced SEO | | RosemaryB
Should we just let Google figure out what to do with the non secure pages? We would like to setup 301 redirects from the old non secure pages to the new secure pages, but were not sure if this is going to happen. We thought about requesting in GWMT for Google to remove the non secure pages. However we felt this was pretty drastic. Any recommendations would be much appreciated.0 -
Google's Stance on "Hidden" Content
Hi, I'm aware Google doesn't care if you have helpful content you can hide/unhide by user interaction. I am also aware that Google frowns upon hiding content from the user for SEO purposes. We're not considering anything similar to this. The issue is, we will be displaying only a part of our content to the user at a time. We'll load 3 results on each page initially. These first 3 results are static, meaning on each initial page load/refresh, the same 3 results will display. However, we'll have a "Show Next 3" button which replaces the initial results with the next 3 results. This content will be preloaded in the source code so Google will know about it. I feel like Google shouldn't have an issue with this since we're allowing the user action to cycle through all results. But I'm curious, is it an issue that the user action does NOT allow them to see all results on the page at once? I am leaning towards no, this doesn't matter, but would like some input if possible. Thanks a lot!
Intermediate & Advanced SEO | | kirmeliux0 -
Dynamically change anchor text and URLs remotely
Hey i'm looking to create a widget in javascript which i dynamically change the urls and anchor text which link the widget back to my site remotely (via php) once it spreads. I have heard peopled doing this before, but i can't seem to find a example. Does anyone know of any examples/widgets or anything which can do this?
Intermediate & Advanced SEO | | monster990 -
Why is a page with a noindex code being indexed?
I was looking through the pages indexed by Google (with site:www.mywebsite.com) and one of the results was a page with "noindex, follow" in the code that seems to be a page generated by blog searches. Any ideas why it seems to be indexed or how to de-index it?
Intermediate & Advanced SEO | | theLotter0 -
Can some brilliant mozzer out there teach a moron/newbie like me how to 301 redirect several URL's I have?
Okay - I am a supermodel. I look pretty. My legs are amazing. My cheekbones are high. But when it comes to 301 redirects I am the ugliest supermodel on the block. Crap, here is the truth: I am not even a supermodel. I am just a middle-aged, goofy looking dude who is a newbie to fixing websites. I have inherited several sites from a friend and I have been helping by creating solid contextual links internally and externally for a while. But, when Roger the wondrous SEOMoz robot talks to me, he says, "oops, it looks like your foolish freak self has a site that has both a www. and a non-www, which can create competition for yourself." What do I do when he says that? I just whisper a "thank-you" but gently press the skip this step button and go on with my life because I do not know how to make my non-www.'s redirect into the www. sites... Now, I have sort of asked this question on the site before, but I was answered by someone who does not understand my level of ignorance. any use of the word canonical or just put this lfwjkshj.htp/php inside the left ear of your mom, does not tell me anything so, is there any willing and kind soul who can walk me through redirecting several of my sites to their proper home - kind of like Carl Chubbs Weathers did for Happy Gilmore in that Academy Award winning classic? Thanks for the help in advance best, dumbhead
Intermediate & Advanced SEO | | creativeguy0