Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Removing Dynamic "noindex" URL's from Index
-
6 months ago my clients site was overhauled and the user generated searches had an index tag on them. I switched that to noindex but didn't get it fast enough to avoid being 100's of pages indexed in Google.
It's been months since switching to the noindex tag and the pages are still indexed. What would you recommend? Google crawls my site daily - but never the pages that I want removed from the index.
I am trying to avoid submitting hundreds of these dynamic URL's to the removal tool in webmaster tools. Suggestions?
-
Hooray! Usually, I just give my advice and then run away, so it's always nice to hear I was actually right about something
Seriously, glad you got it sorted out. -
Just a follow up to your suggestion.
I created sitemaps for the pages I want removed using the google spreadsheet importXML functions, which saved a lot of time.
It took a couple weeks but all of the pages, and similar pages, have successfully been removed from the index. Even the similar pages I didn't get a chance to put in the sitemap yet (importXML limits the results to 100).
Your suggestion worked!
-
I can't 404 dynamic search pages.
-
There are a mix of search pages and old mobile pages.
The search pages I've been testing out having the canonical point to the default search page. I've seen a slight drop in these pages - but I guess I just have to be more patient.
For the other pages the path is no longer there like you were mentioning. I like the idea of setting up the XML sitemap, I never even thought of making a bad/indexed page sitemap. I will give that a shot! Thankfully this will be a quick job with the importXml function in google spreadsheets! Great tip, hopefully it'll work.
-
Is there a crawl path to them currently? One issue I see a lot is that a bunch of pages get indexed, the path is found and cut off, NOINDEX (canonical, 301, etc.) is added, but then the pages never get re-crawled. Since they don't get recrawled, the page-level directive never gets honored.
If there's a URL parameter involved, you could use parameter-handling in GWT - it's not a perfect solution, but it sometimes seems to work without a re-crawl.
The other option would be to create a new XML sitemap with all of the bad/indexed URLs. This may push Google to re-crawl them and then see the tags to deindex. It's a bit safer than re-opening the crawl paths.
If they are being crawled and Google is just ignoring the NOINDEX for some reason, I'd try to 301 or canonical those pages to a primary search page, if that's feasible (probably canonical, since you don't want the users to 301). Sometimes, if a signal isn't working for that long, you just have to shake Google and try a different signal. Even following their exact recommendations, it rarely works as planned at large scale.
-
Don't use GWMT's removal tool to remove URLs which should not be in the index (unless those expose sensitive information). Best practise is to exclude them in robots.txt and to also ensure that the pages either 404 or have a noindex,noarchive tag.
-
Change the site structure and let the pages 404, Google will deindex them if they are not being linked to.
-
You could try adding the pages you want to remove to your robots.txt file. Since you're not linking to them, and it's very unlikely that Googlebot will index those pages naturally now, this might be a better way of telling it which pages to explicitly not index.
I'm not really sure how quickly this will trigger Google to remove those pages from the index - but they do reference robots.txt on the actual "Remove URLs" page of WMT ---> "Use **robots.txt **to specify how search engines should crawl your site, or request **removal **of URLs from Google's search results ..."
For that technique, you'd want to add something like this for all of the pages you want to remove:
Disallow: /oldpage1toremove.phpThat should work. If it doesn't, then I would probably just submit the requests through the "Remove URLs" tool.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Remove Product & Category from URLS in Wordpress
Does anyone have experience removing /product/ and /product-category/, etc. from URLs in wordpress? I found this link from Wordpress which explains that this shouldn't be done, but I would like some opinions of those who have tried it please. https://docs.woocommerce.com/document/removing-product-product-category-or-shop-from-the-urls/
Intermediate & Advanced SEO | | moon-boots0 -
What's the best way to noindex pages but still keep backlinks equity?
Hello everyone, Maybe it is a stupid question, but I ask to the experts... What's the best way to noindex pages but still keep backlinks equity from those noindexed pages? For example, let's say I have many pages that look similar to a "main" page which I solely want to appear on Google, so I want to noindex all pages with the exception of that "main" page... but, what if I also want to transfer any possible link equity present on the noindexed pages to the main page? The only solution I have thought is to add a canonical tag pointing to the main page on those noindexed pages... but will that work or cause wreak havoc in some way?
Intermediate & Advanced SEO | | fablau3 -
Should you allow an auto dealer's inventory to be indexed?
Due to the way most auto dealership website populate inventory pages, should you allow inventory to be indexed at all? The main benefit us more content. The problem is it creates duplicate, or near duplicate content. It also creates a ton of crawl errors since the turnover is so short and fast. I would love some help on this. Thanks!
Intermediate & Advanced SEO | | Gauge1230 -
Does Google Read URL's if they include a # tag? Re: SEO Value of Clean Url's
An ECWID rep stated in regards to an inquiry about how the ECWID url's are not customizable, that "an important thing is that it doesn't matter what these URLs look like, because search engines don't read anything after that # in URLs. " Example http://www.runningboards4less.com/general-motors#!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 Basically all of this: #!/Classic-Pro-Series-Extruded-2/p/28043025/category=6593891 That is a snippet out of a conversation where ECWID said that dirty urls don't matter beyond a hashtag... Is that true? I haven't found any rule that Google or other search engines (Google is really the most important) don't index, read, or place value on the part of the url after a # tag.
Intermediate & Advanced SEO | | Atlanta-SMO0 -
Removing index.php
I have question for the community and whether or not this is a good or bad idea. I currently have a Joomla site that displays www.domain.com/index.php in all the URLs with the exception of the home page. I have read that it's better to not have index.php showing in the URL at all. Does it really matter if I have index.php in my URL? I've read that it is a bad practice. I am thinking about installing the sh404SEF component on my site and removing the index.php. However, I rank pretty high for the keywords I want in Google, Bing and Yahoo. All of the URLs that show up in the searches have index.php as part of the URL. Has anyone ever used sh404SEF to remove the index.php and how did you overcome not loosing your search engine links? I don't want an existing search showing www.domain.com/index.php/sales and it not linking to the correct page which would now be www.domain.com/sales. I guess I could insert the proper redirects in the htaccess file. But I was hoping to avoid having every page of my site in the htaccess file for redirecting. Any help or advice appreciated.
Intermediate & Advanced SEO | | MedGroupMedia0 -
Dev Subdomain Pages Indexed - How to Remove
I own a website (domain.com) and used the subdomain "dev.domain.com" while adding a new section to the site (as a development link). I forgot to block the dev.domain.com in my robots file, and google indexed all of the dev pages (around 100 of them). I blocked the site (dev.domain.com) in robots, and then proceeded to just delete the entire subdomain altogether. It's been about a week now and I still see the subdomain pages indexed on Google. How do I get these pages removed from Google? Are they causing duplicate content/title issues, or does Google know that it's a development subdomain and it's just taking time for them to recognize that I deleted it already?
Intermediate & Advanced SEO | | WebServiceConsulting.com0 -
Can too many "noindex" pages compared to "index" pages be a problem?
Hello, I have a question for you: our website virtualsheetmusic.com includes thousands of product pages, and due to Panda penalties in the past, we have no-indexed most of the product pages hoping in a sort of recovery (not yet seen though!). So, currently we have about 4,000 "index" page compared to about 80,000 "noindex" pages. Now, we plan to add additional 100,000 new product pages from a new publisher to offer our customers more music choice, and these new pages will still be marked as "noindex, follow". At the end of the integration process, we will end up having something like 180,000 "noindex, follow" pages compared to about 4,000 "index, follow" pages. Here is my question: can this huge discrepancy between 180,000 "noindex" pages and 4,000 "index" pages be a problem? Can this kind of scenario have or cause any negative effect on our current natural SEs profile? or is this something that doesn't actually matter? Any thoughts on this issue are very welcome. Thank you! Fabrizio
Intermediate & Advanced SEO | | fablau0 -
All In One SEO PACK Configuration - Index or Noindex?
I'm finding conflicting information about the right way to configure the All in One SEO Pack wordpress plugin. Do I index or noindex for the items below? Use noindex for Categories - yes or no? Use noindex for Archives - yes or no? Use noindex for Tag Archives - yes or no?
Intermediate & Advanced SEO | | webestate0