How did Google's crawler find and cache orphan pages and a directory?
-
I have a website, www.test.com.
I made some changes to the live website and uploaded them to a recently created "demo" directory for client approval.
So my demo link is www.test.com/demo/
I am not doing any link building or any other activity that would pass a referral link to www.test.com/demo/
So how did Google's crawler find it and cache some pages, or even the entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). They didn't exclude the test site in robots.txt because "we all know we won't link to it so it'll be ok". The site got indexed, and it was because a person at the company was having problems with the implementation of the test site, went to the help forum (which that person didn't think would be indexed) and posted the URL of the test site.
I found the above by just putting in the URL of the test site into Google, and I saw the post in the help desk. You might try the same to see if somehow there is a rogue link.
-
Is Google crawling our emails?
Is that possible?
-
Yup, correct.
I was certain I'd replied to this.
Anyway, have you ever noticed how the ads in Gmail are always relevant to the content of your emails? Google are totally reading them.
-
The "conspiracy hat" side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could possibly have pulled your link to the demo directory from that.
-
Hi Barry,
Yes, we were using Gmail for reporting.
Does that make any sense?
-
[conspiracy hat on]
Did either you or your client use Gmail when you sent him the demo link?
Regardless, Dan's advice to noindex the directory and block it from spiders is the way to go in future when doing development work.
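One way to noindex a whole directory on a static site is to send an `X-Robots-Tag: noindex` response header for everything under it. Here is a minimal Python sketch of that idea, assuming the staging area lives under `/demo/`; the function and class names are made up for illustration, and on a real host you would more likely set the header in the web server's own configuration.

```python
from http.server import SimpleHTTPRequestHandler

# Assumption: everything under this prefix is staging content.
PRIVATE_PREFIX = "/demo/"


def robots_headers(path, private_prefix=PRIVATE_PREFIX):
    """Return extra response headers for a request path.

    Paths under the private prefix get an X-Robots-Tag header asking
    search engines not to index the page or follow its links.
    """
    if path.startswith(private_prefix):
        return {"X-Robots-Tag": "noindex, nofollow"}
    return {}


class DemoAwareHandler(SimpleHTTPRequestHandler):
    """Static file handler that adds the noindex header for /demo/ URLs."""

    def end_headers(self):
        # Add our headers just before the header block is closed.
        for name, value in robots_headers(self.path).items():
            self.send_header(name, value)
        super().end_headers()
```

To try it locally you could serve the current folder with `http.server.HTTPServer(("", 8000), DemoAwareHandler).serve_forever()`. Keep in mind that a crawler only sees this header on pages it is actually allowed to fetch.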
-
Hi JoelHit,
No, there is not a single referral link to the "demo" directory from anywhere on the website or from third-party websites.
I am aware of how Google's crawling and indexing systems work.
Thanks.
-
Hi Thetjo,
I know about it.
My question is: how did Google crawl it without any referral link?
Thanks.
-
Hi Dan,
No, I have not excluded the "demo" directory in robots.txt for any search engine.
I am not using WordPress; it's a simple static HTML website (not using any type of CMS).
-
Did this actually happen or are we talking about a hypothetical situation here? It could be that there is a link to the demo directory you've overlooked? Has the /demo folder perhaps been used in the past and there were still old links to it?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a .htpasswd login to the area used for client approval.
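To show why a password prompt keeps crawlers out, here is a rough Python sketch of the check a server performs for HTTP Basic authentication, which is what an .htpasswd login normally uses. The function name and the credentials are hypothetical; in practice the web server does this for you and stores hashed passwords rather than plain ones.

```python
import base64
import hmac


def check_basic_auth(header, user, password):
    """Return True if an Authorization header carries the expected login.

    A crawler sends no Authorization header at all, so it fails this
    check and would get a 401 response instead of the demo pages.
    """
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):], validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    expected = f"{user}:{password}"
    # Constant-time comparison to avoid leaking how much matched.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


if __name__ == "__main__":
    # Hypothetical client credentials, for illustration only.
    good = "Basic " + base64.b64encode(b"client:s3cret").decode("ascii")
    print(check_basic_auth(good, "client", "s3cret"))
    print(check_basic_auth(None, "client", "s3cret"))
```

Unlike robots.txt, this does not rely on the crawler behaving well: anything without the login simply never receives the content.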
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try to ensure it doesn't get crawled. Also, are you using WordPress? If so, WordPress automatically pings search engines when you add a post, and if you use the common sitemap plugin, it submits the sitemap to Google automatically when it creates it, so that's another way Google could have found it.
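If you want to double-check that a robots.txt rule really covers the demo directory, Python's standard `urllib.robotparser` can evaluate it. This is only a sketch: the robots.txt content below is a hypothetical example for www.test.com, not the poster's actual file, and the helper name is made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks the demo directory for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /demo/
"""


def is_crawl_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://www.test.com/", "http://www.test.com/demo/index.html"):
        print(url, is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

Note that robots.txt only asks well-behaved crawlers not to fetch the pages; a URL that is linked from elsewhere can still show up in results without its content.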