How Google Carwler Cached Orphan pages and directory?
-
I have website www.test.com
I have made some changes in live website and upload it to "demo" directory (which is recently created) for client approval.
Now, my demo link will be www.test.com/demo/
I am not doing any type of link building or any activity which pass referral link to www.test.com/demo/
Then how Google crawler find it and cached some pages or entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). Didn't exclude the test site in robots because "we all know we won't link to it so it'll be ok". Site got indexed, and it was because a person at the company was having problems with the implementation of the test site, went to the help forum (which person didn't think would be indexed) and posted the URL to the test site.
I found the above by just putting in the URL of the test site into Google, and I saw the post in the help desk. You might try the same to see if somehow there is a rogue link.
-
Is google crawling our mails?
Is it possible?
-
Yup, correct.
I was certain I'd replied to this
Anyway, you ever notice how the ads in gmail are always relevant to the content of your emails? Google are totally reading them
-
The <conspiracy hat="">side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could have possibly pulled your link to the demo directory from that.</conspiracy>
-
Hi Barry,
Yes, We were used Gmail for reporting.
Is it make any sense??
-
<conspiracy-hat></conspiracy-hat>
Did either you or your client use gmail when you sent him the demo link?
Regardless, Dan's advice to noindex and block the directory from spiders is the future when doing development work.
-
Hi JoelHit,
NO, There is not any single refferal link to "Demo" directory from entire website and also from third party websites.
I am aware about Google Crawling and Indexing Systems.
Thanks.
-
Hi Thetjo,
I know about it.
My question is that how Google Crawl it without any referral link?
Thanks.
-
Hi Dan,
No, i am not exclude "demo" directory from robots.txt for any search engine.
I am not using wordpress its simple stattic HTML website (Not using any type of CMS).
-
Did this actually happen or are we talking about a hypothetical situation here? It could be that there is a link to the demo directory you've overlooked? Has the /demo folder perhaps been used in the past and there were still old links to it?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a .htpasswd login to the area used for client approval.
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try and ensure they don't get crawled. Also, are you using wordpress? If so, wordpress automatically pings search engines when you add a post and if you use the common sitemap plugin, when it creates the sitemap it submits it automatically to Google, so that's another way Google could have found it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My website is my name. Overnight it went from being the number one google search to not showing up at all when you google my name. Why would this happen?
I built my website via square space. It is my name. If you google my name it was the number one hit. Suddenly 2 weeks ago it doesn't show up AT ALL. I went through square spaces SEO check list, secured my site etc. Still doesn't show up. Why would this happen all of the sudden and What can I do? Thank you!
Intermediate & Advanced SEO | | Jbark0 -
Stuck on the 2nd page of google! Help
I run a McAfee Technical Support website. I has been 2.3 months since I have been practicing seo on it. It was slick until it appeared on the second page of google. But now it doesnt rank up as it's frozen. Can i get any advices and suggestions for my website to break the 2nd page cage. My website:-** mcafee.com/activate**
Intermediate & Advanced SEO | | six_figures0 -
Why rankings dropped from 2 page to 8th page and no penalization?
Dear Sirs, a client of mine for more than 7 years used to have his home page (www.egrecia.es) between 1st and 2nd page in the Google Serps and suddenly went down to 8 page. The keyword in question is "Viajes a Grecia". It has a good link profile as we have built links in good newspapers from Spain, and according to Moz it has a 99% on-page optimization for that keyword, why why why ??? What could I do to solve this? PD: It has more than 20 other keywords in 1st position, so why this one went so far down? Thank you in advance !
Intermediate & Advanced SEO | | Tintanus0 -
If a page ranks in the wrong country and is redirected, does that problem pass to the new page?
Hi guys, I'm having a weird problem: A new multilingual site was launched about 2 months ago. It has correct hreflang tags and Geo targetting in GSC for every language version. We redirected some relevant pages (with good PA) from another website of our client's. It turned out that the pages were not ranking in the correct country markets (for example, the en-gb page ranking in the USA). The pages from our site seem to have the same problem. Do you think they inherited it due to the redirects? Is it possible that Google will sort things out over some time, given the fact that the new pages have correct hreflangs? Is there stuff we could do to help ranking in the correct country markets?
Intermediate & Advanced SEO | | ParisChildress1 -
Google Adsbot crawling order confirmation pages?
Hi, We have had roughly 1000+ requests per 24 hours from Google-adsbot to our confirmation pages. This generates an error as the confirmation page cannot be viewed after closing or by anyone who didn't complete the order. How is google-adsbot finding pages to crawl that are not linked to anywhere on the site, in the sitemap or linked to anywhere else? Is there any harm in a google crawler receiving a higher percentage of errors - even though the pages are not supposed to be requested. Is there anything we can do to prevent the errors for the benefit of our network team and what are the possible risks of any measures we can take? This bot seems to be for evaluating the quality of landing pages used in for Adwords so why is it trying to access confirmation pages when they have not been set for any of our adverts? We included "Disallow: /confirmation" in the robots.txt but it has continued to request these pages, generating a 403 page and an error in the log files so it seems Adsbot doesn't follow robots.txt. Thanks in advance for any help, Sam
Intermediate & Advanced SEO | | seoeuroflorist0 -
Google places page related places links to competitor
I noticed on a lot of Google places pages i create for my clients Google seems to put related places links at the bottom of the page which links directly to their competitors. how can i remove control or avoid these links been placed? Also any tips on improving the places page would be greatly appreciated thanks
Intermediate & Advanced SEO | | Bristolweb0 -
How to Optimize Product Page for Better Ranking in Google?
Today, I was searching for product page SEO on Google for better optimization of product pages and category level pages. I want to introduce about my website structure. Root category: http://www.vistastores.com/outdoor Sub category: http://www.vistastores.com/outdoor-umbrellas End level category: http://www.vistastores.com/patio-umbrellas Product page: http://www.vistastores.com/patio-umbrellas-california-umbrella-slpt758-f13-red.html I'm doing link building for sub category & end level category page. But, I'm not able to do inbound marketing for product level URLs due to resource problem. But, I have checked that... There are many competitors who are rank well with long trail keyword... I'm selling same product but not ranking well... What is reason behind it? I don't know... Just look at Olefin Red Patio Umbrella search result. I want to rank in top 3 with this kind of long trail keywords... I have made study of following video and article. http://www.seomoz.org/blog/ecommerce-seo-making-product-pages-into-great-content-whiteboard-friday http://www.distilled.net/blog/seo/in-search-of-the-perfect-seo-product-page/ I have done all things which were described by them... But, I'm still waiting for good ranking...
Intermediate & Advanced SEO | | CommercePundit0 -
Is 404'ing a page enough to remove it from Google's index?
We set some pages to 404 status about 7 months ago, but they are still showing in Google's index (as 404's). Is there anything else I need to do to remove these?
Intermediate & Advanced SEO | | nicole.healthline0