How preproduction website is getting indexed in Google.
-
Hi team,
Can anybody please help me to find how my preproduction website and urls are getting indexed in Google.
-
As Eric hinted, the best method to prevent any pages being indexed would be to use htaccess password protection dialog on your development site. It's fairly easy to implement. You can find instructions to do so here: http://www.htaccesstools.com/articles/password-protection/
-
Hi Anoop! Have everyone's answers helped? Do you still have any questions?
-
Anoop, when a 'development' or 'preproduction' website or subdomain is getting indexed, that means that you haven't stopped the search engines from crawling it. The search engines, especially Google, are very aggressive at crawling, and they will crawl just about any URL that they find. It seems as though all you have to do is visit that page and it's going to get crawled.
Best way to stop Google from crawling (then indexing) a website is to stop it from getting crawled using the robots.txt file. Keep in mind, though, that even if you tell them to stay out of it using the robots.txt file they will still index those URLs.
The only way to stop Google from crawling would be to password protect the website or make it available only on a private server, or available via VPN only.
-
In addition to noindexing the pages using the meta tag, if you have WMT / Search Console set up, you can request Google remove those URLs from their index for the time being. I've found that this may take up to a couple of hours from the removal request to the time of actual removal.
As to how they were found, there's a good chance that Google crawled a link to a preproduction webpage and went from there.
-
Hi
To prevent most search engine web crawlers from indexing a page on your site, place the following meta tag into the section of your page:
To prevent only Google web crawlers from indexing a page:
You should be aware that some search engine web crawlers might interpret the
noindex
directive differently. As a result, it is possible that your page might still appear in results from other search engines.here is complete guide: https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag?csw=1
-
Hi,
Have you noindexed & nofollowed the site and pages? I would also suggest you block all crawlers by disallowing access in the robots.txt file.
Do you know if this has all been done?
-Andy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My brand name has 2 words but Google only indexing as 1 word. Is there a fix?
Hi all...I'm at a loss. I've never had this happen. Google only shows pages of my site when I search the brand name as one word. When I Google the site as one word BrandBrand- it only shows my blog page and about us page plus Twitter and Facebook on page 1. The homepage does not show up at all. When I Google the site as two words Brand Brand - My Facebook page is on page 1 but nothing else. The homepage isn't showing up at all. When I search both words on Bing and Yahoo both are indexing it as two words and shows on page 1. Any ideas?
Technical SEO | | TexasBlogger0 -
Page missing from Google index
Hi all, One of our most important pages seems to be missing from the Google index. A number of our collections pages (e.g., http://perfectlinens.com/collections/size-king) are thin, so we've included a canonical reference in all of them to the main collection page (http://perfectlinens.com/collections/all). However, I don't see the main collection page in any Google search result. When I search using "info:http://perfectlinens.com/collections/all", the page displayed is our homepage. Why is this happening? The main collection page has a rel=canonical reference to itself (auto-generated by Shopify so I can't control that). Thanks! WUKeBVB
Technical SEO | | leo920 -
Why my website does not index?
I made some changes in my website after that I try webmaster tool FETCH AS GOOGLE but this is 2nd day and my new pages does not index www. astrologersktantrik .com
Technical SEO | | ramansaab0 -
How to stop google from indexing specific sections of a page?
I'm currently trying to find a way to stop googlebot from indexing specific areas of a page, long ago Yahoo search created this tag class=”robots-nocontent” and I'm trying to see if there is a similar manner for google or if they have adopted the same tag? Any help would be much appreciated.
Technical SEO | | Iamfaramon0 -
Web page is showing up on Google but doesn't show when it was cached, so is it indexed?
Hey everyone So I created a new page on a WordPress website, it was live for a few hours till I changed my mind & switched it back to a draft. Just out of curiosity I did the Site:www.example.com/Example search on Google to see if it had been indexed & apparently it had but when I click on cached to see what time it got indexed at exactly it's showing me an error. So does this mean it is indexed or not?
Technical SEO | | conversiontactics0 -
Google refuses to index our domain. Any suggestions?
A very similar question was asked previously. (http://www.seomoz.org/q/why-google-did-not-index-our-domain) We've done everything in that post (and comments) and then some. The domain is http://www.miwaterstewardship.org/ and, so far, we have: put "User-agent: * Allow: /" in the robots.txt (We recently removed the "allow" line and included a Sitemap: directive instead.) built a few hundred links from various pages including multiple links from .gov domains properly set up everything in Webmaster Tools submitted site maps (multiple times) checked the "fetch as googlebot" display in Webmaster Tools (everything looks fine) submitted a "request re-consideration" note to Google asking why we're not being indexed Webmaster Tools tells us that it's crawling the site normally and is indexing everything correctly. Yahoo! and Bing have both indexed the site with no problems and are returning results. Additionally, many of the pages on the site have PR0 which is unusual for a non-indexed site. Typically we've seen those sites have no PR at all. If anyone has any ideas about what we could do I'm all ears. We've been working on this for about a month and cannot figure this thing out. Thanks in advance for your advice.
Technical SEO | | NetvantageMarketing0 -
Getting Google to index new pages
I have a site, called SiteB that has 200 pages of new, unique content. I made a table of contents (TOC) page on SiteB that points to about 50 pages of SiteB content. I would like to get SiteB's TOC page crawled and indexed by Google, as well as all the pages it points to. I submitted the TOC to Pingler 24 hours ago and from the logs I see the Googlebot visited the TOC page but it did not crawl any of the 50 pages that are linked to from the TOC. I do not have a robots.txt file on SiteB. There are no robot meta tags (nofollow, noindex). There are no 'rel=nofollow' attributes on the links. Why would Google crawl the TOC (when I Pinglered it) but not crawl any of the links on that page? One other fact, and I don't know if this matters, but SiteB lives on a subdomain and the URLs contain numbers, like this: http://subdomain.domain.com/category/34404 Yes, I know that the number part is suboptimal from an SEO point of view. I'm working on that, too. But first wanted to figure out why Google isn't crawling the TOC. The site is new and so hasn't been penalized by Google. Thanks for any ideas...
Technical SEO | | scanlin0 -
How can I get Google to crawl my site daily?
I was wndering if there was a trick to getting google to crawl my website daily?
Technical SEO | | labradoodlelocator0