Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Best way to permanently remove URLs from the Google index?
-
We have several subdomains we use for testing applications. Even if we block with robots.txt, these subdomains still appear to get indexed (though they show as blocked by robots.txt.
I've claimed these subdomains and requested permanent removal, but it appears that after a certain time period (6 months)? Google will re-index (and mark them as blocked by robots.txt).
What is the best way to permanently remove these from the index? We can't use login to block because our clients want to be able to view these applications without needing to login.
What is the next best solution?
-
I agree with Paul, The Google is re indexing the pages because you have few linking pointing back to these sub domains. The best idea us to restrict Google crawler by using no-index , no-follow tag and remove the instruction available in the robots.txt...
This way Google will neither crawl nor follow the activity on the page and it will get permanently remove from Google Index.
-
Yup - Chris has the solution. The robots.txt disallow directive simply instructs the crawler not to crawl, it doesn't have any instructions regarding removing URLs from the index. I'm betting there are other pages linking in to the subdomains that the bots are following to find and index as the URL Removal requests are expiring.
Do note though that when you add the no-index meta-robots tag, you're going to need to remove the robots.txt disallow directive. Otherwise the crawlers won't make any attempt to crawl all the pages and so won't even discover most of the no-index requests.
Paul
[Edited to add - there's no reason you can't implement the no-index meta-tags and then also again request removal via the Webmaster Tools removal tool. Kind of a "belt & suspenders approach. The removal request will get it out quicker, and the meta-no-index will do the job of keeping it out. Remember to do this in Bing Webmaster Tools as well.]
-
Wouldn't a noindex meta tag on each page take care of it?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can Google Crawl & Index my Schema in CSR JavaScript
We currently only have one option for implementing our Schema. It is populated in the JSON which is rendered by JavaScript on the CLIENT side. I've heard tons of mixed reviews about if this will work or not. So, does anyone know for sure if this will or will not work. Also, how can I build a test to see if it does or does not work?
Intermediate & Advanced SEO | | MJTrevens0 -
Google Indexing Of Pages As HTTPS vs HTTP
We recently updated our site to be mobile optimized. As part of the update, we had also planned on adding SSL security to the site. However, we use an iframe on a lot of our site pages from a third party vendor for real estate listings and that iframe was not SSL friendly and the vendor does not have that solution yet. So, those iframes weren't displaying the content. As a result, we had to shift gears and go back to just being http and not the new https that we were hoping for. However, google seems to have indexed a lot of our pages as https and gives a security error to any visitors. The new site was launched about a week ago and there was code in the htaccess file that was pushing to www and https. I have fixed the htaccess file to no longer have https. My questions is will google "reindex" the site once it recognizes the new htaccess commands in the next couple weeks?
Intermediate & Advanced SEO | | vikasnwu1 -
Removing .html from URLs - impact of rankings?
Good evening Mozzers. Couple of questions which I hope you can help with. Here's the first. I am wondering, are we likely to see ranking changes if we remove the .html from the sites URLs. For example website.com/category/sub-category.html Change to: website.com/category/sub-category/ We will of course make sure we 301 redirect to the new, user friendly URLs, but I am wondering if anyone has had previous experience of implementing this change and how it has effected rankings. By having the .html in the URLs, does this stop link juice being flowed back to the root category? Second question: If one page can be loaded with and without a forward slash "/" at the end, is this a duplicate page, or would Google consider this as the same page? Would like to eliminate duplicate content issues if this is the case. For example: website.com/category/ and website.com/category Duplicate content/pages?
Intermediate & Advanced SEO | | Jseddon920 -
301 vs 410 redirect: What to use when removing a URL from the website
We are in the process of detemining how to handle URLs that are completely removed from our website? Think of these as listings that have an expiration date (i.e. http://www.noodle.org/test-prep/tphU3/sat-group-course). What is the best practice for removing these listings (assuming not many people are linking to them externally). 301 to a general page (i.e. http://www.noodle.org/search/test-prep) Do nothing and leave them up but remove from the site map (as they are no longer useful from a user perspective) return a 404 or 410?
Intermediate & Advanced SEO | | abargmann0 -
How Do You Remove Video Thumbnails From Google Search Result Pages?
This is going to be a long question, but, in a nutshell, I am asking if anyone knows how to remove video thumbnails from Google's search result pages? We have had video thumbnails show up next to many of our organic listings in Google's search result pages for several months. To be clear, these are organic listings for our site, not results from performing a video search. When you click on the thumbnail or our listing title, you go to the same page on our site - a list of products or the product page. Although it was initially believed that these thumbnails drew the eye to our listings and that we would receive more traffic, we are actually seeing severe year over year declines in traffic to our category pages with thumbnails vs. category pages without thumbnails (where average rank remained relatively constant). We believe this decline is due to several things: An old date stamp that makes our listing look outdated (despite the fact that we can prove Google has spidered and updated their cache of these pages as recent as 2 days ago). We have no idea where Google is getting this datestamp from. An unrelated thumbnail to the page title, etc. - sometimes a picture of a man's face when the category is for women's handbags A difference in intent - user intends to shop or browse, not watch a video. They skip our listing because it looks like a video even though both the thumbnail and our listing click through to a category page of products. So we want to remove these video thumbnails from Google's search results without removing our pages from the index. Does anyone know how to do this? We believed that this connection between category page and video was happening in our video sitemap. We have removed all reference to video and category pages in the sitemap. After making this change and resubmitting the sitemap in Webmaster Tools, we have not seen any changes in the search results (it's been over 2 weeks). I've been reading and it appears many believe that Google can identify video embedded in pages. That makes sense. We can certainly remove videos from our category pages to truly remove the connection between category page URL and video thumbnail. However, I don't believe this is enough because in some cases you can find video thumbnails next to listings where the page has not had a video thumbnail in months (example: search for "leather handbags" and find www.ebags.com/category/handbags/m/leather - that video does not exist on that page and has not for months. Similarly, do a search for "handbags" and find www.ebags.com/department/handbags. That video has not been on that page since 2010. Any ideas?
Intermediate & Advanced SEO | | SharieBags0 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all pages indexed of our domain? trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Best possible linking on site with 100K indexed pages
Hello All, First of all I would like to thank everybody here for sharing such great knowledge with such amazing and heartfelt passion.It really is good to see. Thank you. My story / question: I recently sold a site with more than 100k pages indexed in Google. I was allowed to keep links on the site.These links being actual anchor text links on both the home page as well on the 100k news articles. On top of that, my site syndicates its rss feed (Just links and titles, no content) to this page. However, the new owner made a mess, and now the site could possibly be seen as bad linking to my site. Google tells me within webmasters that this particular site gives me more than 400K backlinks. I have NEVER received one single notice from Google that I have bad links. That first. But, I was worried that this page could have been the reason why MY site tanked as bad as it did. It's the only source linking so massive to me. Just a few days ago, I got in contact with the new site owner. And he has taken my offer to help him 'better' his site. Although getting the site up to date for him is my main purpose, since I am there, I will also put effort in to optimizing the links back to my site. My question: What would be the best to do for my 'most SEO gain' out of this? The site is a news paper type of site, catering for news within the exact niche my site is trying to rank. Difference being, his is a news site, mine is not. It is commercial. Once I fix his site, there will be regular news updates all within the niche we both are in. Regularly as in several times per day. It's news. In the niche. Should I leave my rss feed in the side bars of all the content? Should I leave an achor text link on the sidebar (on all news etc.) If so: there can be just one keyword... 407K pages linking with just 1 kw?? Should I keep it to just one link on the home page? I would love to hear what you guys think. (My domain is from 2001. Like a quality wine. However, still tanked like a submarine.) ALL SEO reports I got here are now Grade A. The site is finally fully optimized. Truly nice to have that confirmation. Now I hope someone will be able to tell me what is best to do, in order to get the most SEO gain out of this for my site. Thank you.
Intermediate & Advanced SEO | | richardo24hr0 -
Google News URL Structure
Hi there folks I am looking for some guidance on Google News URLs. We are restructuring the site. A main traffic driver will be the traffic we get from Google News. Most large publishers use: www.site.com/news/12345/this-is-the-title/ Others use www.example.com/news/celebrity/12345/this-is-the-title/ etc. www.example.com/news/celebrity-news/12345/this-is-the-title/ www.example.com/celebrity-news/12345/this-is-the-title/ (Celebrity is a channel on Google News so should we try and follow that format?) www.example.com/news/celebrity-news/this-is-the-title/12345/ www.example.com/news/celebrity-news/this-is-the-title-12345/ (unique ID no at the end and part of the title URL) www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ Others include the date. So as you can see there are so many combinations and there doesnt seem to be any unity across news sites for this format. Have you any advice on how to structure these URLs? Particularly if we want to been seen as an authority on the following topics: fashion, hair, beauty, and celebrity news - in particular "celebrity name" So should the celebrity news section be www.example.com/news/celebrity-news/celebrity-name/this-is-the-title-12345/ or what? This is for a completely new site build. Thanks Barry
Intermediate & Advanced SEO | | Deepti_C0