How long after disallowing Googlebot from crawling a domain do its pages drop out of Google's index?
-
We recently had Google crawl a version of the site that we thought we had already disallowed. We have since fixed the crawling issue, but pages from that version are still appearing in the search results. (The version we don't want indexed or served is our .us domain, which should have been blocked to Googlebot.)
My question is this: How long should I expect that domain (the .us we don't want to appear) to stay in their index after disallowing their bot? Is this a matter of days, weeks, or months?
-
If no URL on the .us domain should exist in the index (and no new URLs are being added), you can remove them fairly swiftly in Webmaster Tools >> Google Index >> Remove URLs: select the root URL and choose to remove all directories under it.
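Before requesting removal, it may be worth confirming that the robots.txt on the .us domain really blocks Googlebot. Below is a minimal sketch using Python's standard `urllib.robotparser`; the robots.txt body and the `example.us` URL are assumptions for illustration, not the poster's actual file or domain.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for the .us domain; the real file may differ.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text disallows user_agent from url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.us is a placeholder host, not the poster's domain.
    print(is_blocked(ROBOTS_TXT, "Googlebot", "https://example.us/any/page"))
```

Note that the standard-library parser only does simple prefix matching, so it is a sanity check rather than a faithful copy of how Googlebot reads the file.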
-
Hi there,
Crawling and indexing are processes which can take some time and which rely on many factors. In general, we cannot make predictions or guarantees about when or if your URLs will be crawled or indexed.
Hope it helps you.
Related Questions
-
Site Crawl -> Duplicate Page Content -> Same pages showing up with duplicates that are not
These, for example:
| https://im.tapclicks.com/signup.php/?utm_campaign=july15&utm_medium=organic&utm_source=blog | 1 | 2 | 29 | 2 | 200 |
| https://im.tapclicks.com/signup.php?_ga=1.145821812.1573134750.1440742418 | 1 | 1 | 25 | 2 | 200 |
| https://im.tapclicks.com/signup.php?utm_source=tapclicks&utm_medium=blog&utm_campaign=brightpod-article | 1 | 119 | 40 | 4 | 200 |
| https://im.tapclicks.com/signup.php?utm_source=tapclicks&utm_medium=marketplace&utm_campaign=homepage | 1 | 119 | 40 | 4 | 200 |
| https://im.tapclicks.com/signup.php?utm_source=blog&utm_campaign=first-3-must-watch-videos | 1 | 119 | 40 | 4 | 200 |
| https://im.tapclicks.com/signup.php?_ga=1.159789566.2132270851.1418408142 | 1 | 5 | 31 | 2 | 200 |
| https://im.tapclicks.com/signup.php/?utm_source=vocus&utm_medium=PR&utm_campaign=52release |
Any suggestions/directions for fixing, or should I just disregard this "High Priority" Moz issue? Thank you!
Technical SEO | | writezach
-
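The URLs in the question above differ only in their tracking parameters (`utm_*`, `_ga`) and in a trailing slash, which is one way a crawler can end up counting a single page several times. As a rough sketch, assuming those parameters never change the page content, the URLs could be collapsed to one form like this. The function name and the decision to strip the trailing slash are illustrative assumptions, not something the Moz crawler is known to do.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed tracking parameters, taken from the URLs in the question.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"_ga"}


def strip_tracking(url: str) -> str:
    """Drop tracking parameters and a trailing slash so variants compare equal."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in TRACKING_KEYS and not key.startswith(TRACKING_PREFIXES)
    ]
    # Assumption: "/signup.php/" and "/signup.php" serve the same page.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(kept), ""))


if __name__ == "__main__":
    print(strip_tracking(
        "https://im.tapclicks.com/signup.php/?utm_campaign=july15"
        "&utm_medium=organic&utm_source=blog"
    ))
```

If all variants reduce to the same string, that string is a reasonable candidate for the page's canonical URL.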
I have 3500 pages crawled by Google - why is SEOMOZ only able to crawl 400 of these?
I added my site to the PRO Dashboard almost two weeks ago, and so far only 404 pages have been crawled, but I know for a fact that there are 3500 pages that should be crawled. Other search engines have no problem crawling and indexing these pages, so what could be wrong here?
Technical SEO | | haybob270 -
Unnecessary pages getting indexed in Google for my blog
I have a blog, dapazze.com, and I have been struggling with a problem for a long time. I found out that Google has indexed hundreds of replytocom links and image attachment pages for my blog. I had to remove these pages manually using the URL removal tool. I had used "Disallow: ?replytocom" in my robots.txt, but Google disobeyed it. After that, I removed the parameter from my blog completely using the SEO by Yoast plugin. But now I see that Google has started indexing these links again, even though they are no longer present on my blog (I use #comment). Google has also indexed many of my admin and plugin pages, even though they are disallowed in my robots.txt file. Have a look at my robots.txt file here: http://dapazze.com/robots.txt Can you please help me solve this problem permanently?
Technical SEO | | rahulchowdhury0 -
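One possible reason the "Disallow: ?replytocom" rule in the question above had no effect is that robots.txt rules are matched from the start of the URL path, so a rule that does not begin with "/" or "*" may never match. The sketch below is a simplified matcher written for illustration only, assuming the commonly documented "*" (any characters) and "$" (end of URL) wildcards; `rule_matches` is a hypothetical helper, not Google's parser.

```python
import re


def rule_matches(pattern: str, url_path: str) -> bool:
    """Simplified robots.txt path matching: prefix match with * and $ support."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal text and let each "*" stand for any run of characters.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, like a prefix rule.
    return re.match(regex, url_path) is not None


if __name__ == "__main__":
    path = "/post/?replytocom=12"  # hypothetical comment-reply URL
    print(rule_matches("?replytocom", path))    # rule as written in the question
    print(rule_matches("/*?replytocom", path))  # rule with a leading wildcard
```

Under these assumptions the first rule does not match the path while the second does.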
Duplicate index.php/webpage pages on website. Help needed!
Hi guys, I'm having a really frustrating problem with our website. It is a Joomla 1.7 site and we have some duplicate page issues. What is happening is that we have a webpage, let's say domain.com/webpage1, and then we also have domain.com/index.php/webpage1. Google is seeing these as duplicate pages, which is causing me some real SEO problems. I have tried setting up a 301 redirect, but it won't let me redirect /index.php/webpage1 to /webpage1. Does anyone have any ideas or plugins that can be used to sort this out? Any help will be really appreciated! Matt.
Technical SEO | | MatthewBarby0 -
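The redirect described in the question above amounts to a simple path mapping: strip the `/index.php` prefix and leave every other path alone so the rule cannot loop. Here is a minimal sketch of that mapping, with a hypothetical `redirect_target` helper. It only illustrates the logic and is not a Joomla plugin or server configuration.

```python
from typing import Optional

PREFIX = "/index.php"


def redirect_target(path: str) -> Optional[str]:
    """Return the clean path to 301 to, or None if no redirect is needed."""
    if path == PREFIX:
        return "/"
    if path.startswith(PREFIX + "/"):
        # "/index.php/webpage1" becomes "/webpage1".
        return path[len(PREFIX):]
    # Already-clean paths are left alone, which avoids redirect loops.
    return None


if __name__ == "__main__":
    print(redirect_target("/index.php/webpage1"))
    print(redirect_target("/webpage1"))
```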
Pages not Indexed after a successful Google Fetch
I am trying to understand why Google isn't indexing key content on my site. www.BeyondTransition.com is indexed and new pages show up in a couple of hours. My key content is 6 pages of information for each of 3000 events (driven by MySQL on a WordPress platform). These pages are reached via a search page, with no direct navigation from the home page. When I link to an event page from an indexed page, it doesn't show up in search results. When I use Fetch in Webmaster Tools, the fetch is successful but the page is then not indexed, or if it does appear in results it's directed to the internal search page. For example, http://www.beyondtransition.com/site/races/course/race110003/ has been fetched and submitted with links, but when I search for BeyondTransition Ironman Cozumel I get these results.... So what have I done wrong and how do I go about fixing it? All thoughts and advice appreciated. Thanks, Denis
Technical SEO | | beyondtransition0 -
IP address URLs being indexed, 301 to domain?
I apologize if this question has been asked before; I couldn't find it in the Q&A. I noticed Google has been indexing our IP address for some pages (i.e. 123.123.123.123/page.html instead of domain.com/page.html). I suspect this is due to a few straggler relative links instead of absolute ones, or possibly something else I'm not thinking of. My less invasive solution a few months back was to ensure canonical tags were on all pages and to replace any relative links with absolute ones. This does not seem to be fixing the problem though: as recently as today, new pages were picked up with the IP address. My next thought is to 301 redirect any IP address URL to the domain, but I was afraid that may be too drastic and that the canonical tag should be sufficient (which it doesn't seem to be). Has anyone dealt with this issue? Do you think the 301 would be a safe move? Any other suggestions? Thanks.
Technical SEO | | KT6840 -
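The 301 idea in the question above can be expressed as a small rule: if the request host is a bare IP address, send the visitor to the same path on the domain. The sketch below uses the placeholder host `domain.com` from the question, and the `canonical_redirect` helper is hypothetical. A real site would implement the equivalent rule in its web server or application.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "domain.com"  # placeholder host taken from the question


def canonical_redirect(url: str) -> Optional[str]:
    """Return the domain URL to 301 to when the host is a raw IP, else None."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Host is a name rather than an IP address, so no redirect is needed.
        return None
    # Any port on the IP URL is dropped; path, query and fragment are kept.
    return urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(canonical_redirect("http://123.123.123.123/page.html"))
    print(canonical_redirect("http://domain.com/page.html"))
```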
Why doesn't this page get indexed?
Hi, I've just taken over development and SEO for a site and we're having difficulty getting some key pages indexed. They are two clicks away from the homepage, but still not getting indexed. They are recently created pages with unique content. The architecture looks like this:
Homepage >> Car page >> Engine specific page
Whenever we add a new car, we link to its 'Car page' and it gets indexed very quickly. However, the 'Engine pages' for that car don't get indexed, even after a couple of weeks. An example of one of these pages is http://www.carbuzz.co.uk/car-reviews/Volkswagen/Beetle-New/2.0-TSI
So, things we've checked:
1. Yes, it's not blocked by robots.txt
2. Yes, it's in the sitemap (http://www.carbuzz.co.uk/sitemap.xml)
3. Yes, it's viewable to search spiders (e.g. the link is present in the HTML source)
This page doesn't have a huge amount of unique content. We're a review aggregator, but it still does have some. Any suggestions as to why it isn't indexed?
Thanks, David
Technical SEO | | soulnafein0 -
Google indexing directory folder listing page
Google somehow managed to find several of our image index folders and decided to include them in their index. Example: websitesite.com/category/images/ is what you'll see when doing a site:website.com search. So, I have a two-part question:
1) Does this hurt our site's ability to rank in any way? All Google sees is a directory listing page with a bunch of links to the images in the folder.
2) If there could be any negative effect, what is the best way to get these folders out of Google's index? I could block them via robots.txt, but I'm afraid that will also block all the images in that folder from being indexed in Google image search. I could also turn off directory listing in cPanel / htaccess, but then that gives a 403 Forbidden. Will this hurt the site in any way, and would it prevent Google from indexing the images in the directory?
Thanks,
Tony
Technical SEO | | invision