Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
How preproduction website is getting indexed in Google.
-
Hi team,
Can anybody please help me to find how my preproduction website and urls are getting indexed in Google.
-
As Eric hinted, the best method to prevent any pages being indexed would be to use htaccess password protection dialog on your development site. It's fairly easy to implement. You can find instructions to do so here: http://www.htaccesstools.com/articles/password-protection/
-
Hi Anoop! Have everyone's answers helped? Do you still have any questions?
-
Anoop, when a 'development' or 'preproduction' website or subdomain is getting indexed, that means that you haven't stopped the search engines from crawling it. The search engines, especially Google, are very aggressive at crawling, and they will crawl just about any URL that they find. It seems as though all you have to do is visit that page and it's going to get crawled.
Best way to stop Google from crawling (then indexing) a website is to stop it from getting crawled using the robots.txt file. Keep in mind, though, that even if you tell them to stay out of it using the robots.txt file they will still index those URLs.
The only way to stop Google from crawling would be to password protect the website or make it available only on a private server, or available via VPN only.
-
In addition to noindexing the pages using the meta tag, if you have WMT / Search Console set up, you can request Google remove those URLs from their index for the time being. I've found that this may take up to a couple of hours from the removal request to the time of actual removal.
As to how they were found, there's a good chance that Google crawled a link to a preproduction webpage and went from there.
-
Hi
To prevent most search engine web crawlers from indexing a page on your site, place the following meta tag into the section of your page:
To prevent only Google web crawlers from indexing a page:
You should be aware that some search engine web crawlers might interpret the
noindex
directive differently. As a result, it is possible that your page might still appear in results from other search engines.here is complete guide: https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag?csw=1
-
Hi,
Have you noindexed & nofollowed the site and pages? I would also suggest you block all crawlers by disallowing access in the robots.txt file.
Do you know if this has all been done?
-Andy
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URLs dropping from index (Crawled, currently not indexed)
I've noticed that some of our URLs have recently dropped completely out of Google's index. When carrying out a URL inspection in GSC, it comes up with 'Crawled, currently not indexed'. Strangely, I've also noticed that under referring page it says 'None detected', which is definitely not the case. I wonder if it could be something to do with the following? https://www.seroundtable.com/google-ranking-index-drop-30192.html - It seems to be a bug affecting quite a few people. Here are a few examples of the URLs that have gone missing: https://www.ihasco.co.uk/courses/detail/sexual-harassment-awareness-training https://www.ihasco.co.uk/courses/detail/conflict-resolution-training https://www.ihasco.co.uk/courses/detail/prevent-duty-training Any help here would be massively appreciated!
Technical SEO | | iHasco0 -
Google has deindexed a page it thinks is set to 'noindex', but is in fact still set to 'index'
A page on our WordPress powered website has had an error message thrown up in GSC to say it is included in the sitemap but set to 'noindex'. The page has also been removed from Google's search results. Page is https://www.onlinemortgageadvisor.co.uk/bad-credit-mortgages/how-to-get-a-mortgage-with-bad-credit/ Looking at the page code, plus using Screaming Frog and Ahrefs crawlers, the page is very clearly still set to 'index'. The SEO plugin we use has not been changed to 'noindex' the page. I have asked for it to be reindexed via GSC but I'm concerned why Google thinks this page was asked to be noindexed. Can anyone help with this one? Has anyone seen this before, been hit with this recently, got any advice...?
Technical SEO | | d.bird0 -
No index on subdomains
Hi, We have a subdomain that is appearing in the search results - I want to hide this as it looks really bad. If I were to add the no index tag to the sub domain would URL would this affect the whole domain or just that sub domain? The main domain is vitally important - it is just that sub domain I need to hide. Many thanks
Technical SEO | | Creditsafe0 -
Image Indexing Issue by Google
Hello All,My URL is: www.thesalebox.comI have Submitted my image Sitemap in google webmaster tool on 10th Oct 2013,Still google could not indexing any of my web images,Please refer my sitemap - www.thesalebox.com/AppliancesHomeEntertainment.xml and www.thesalebox.com/Hardware.xmland my webmaster status and image indexing status are below,
Technical SEO | | CommercePunditCan you please help me, why my images are not indexing in google yet? is there any issue? please give me suggestions?Thanks!
0 -
Unnecessary pages getting indexed in Google for my blog
I have a blog dapazze.com and I am suffering from a problem for a long time. I found out that Google have indexed hundreds of replytocom links and images attachment pages for my blog. I had to remove these pages manually using the URL removal tool. I had used "Disallow: ?replytocom" in my robots.txt, but Google disobeyed it. After that, I removed the parameter from my blog completely using the SEO by Yoast plugin. But now I see that Google has again started indexing these links even after they are not present in my blog (I use #comment). Google have also indexed many of my admin and plugin pages, whereas they are disallowed in my robots.txt file. Have a look at my robots.txt file here: http://dapazze.com/robots.txt Please help me out to solve this problem permanently?
Technical SEO | | rahulchowdhury0 -
How to remove a sub domain from Google Index!
Hello, I have a website having many subdomains having same copy of content i think its harming my SEO for that site since abc and xyz sub domains do have same contents. Thus i require to know i have already deleted required subdomain DNS RECORDS now how to have those pages removed from Google index as well ? The DNS Records no more exists for those subdomains already.
Technical SEO | | anand20100 -
Does Google index XML files?
Does Google or other search engines include XML files in their index? More specifically, I am wondering how Google knows the difference between an xml filetype and an RSS feed.
Technical SEO | | nicole.healthline0 -
How do I get Google to display categories instead of the URL in results?
I've seen that for some domains Google will show a nice clickable site heirarchy in place of the actual URL of a search result. See attached for an example. How do I go about achieving this type of results? categorized.png
Technical SEO | | Carlito-2569610