How Google Carwler Cached Orphan pages and directory?
-
I have website www.test.com
I have made some changes in live website and upload it to "demo" directory (which is recently created) for client approval.
Now, my demo link will be www.test.com/demo/
I am not doing any type of link building or any activity which pass referral link to www.test.com/demo/
Then how Google crawler find it and cached some pages or entire directory?
Thanks
-
Try putting the URL into Google and see if you find any pages linking to it.
I knew a company that created a test site that was a copy of a live site (made with a specific hosted CMS). Didn't exclude the test site in robots because "we all know we won't link to it so it'll be ok". Site got indexed, and it was because a person at the company was having problems with the implementation of the test site, went to the help forum (which person didn't think would be indexed) and posted the URL to the test site.
I found the above by just putting in the URL of the test site into Google, and I saw the post in the help desk. You might try the same to see if somehow there is a rogue link.
-
Is google crawling our mails?
Is it possible?
-
Yup, correct.
I was certain I'd replied to this
Anyway, you ever notice how the ads in gmail are always relevant to the content of your emails? Google are totally reading them
-
The <conspiracy hat="">side of things was him commenting that Google is sometimes accused of processing everything in Gmail and could have possibly pulled your link to the demo directory from that.</conspiracy>
-
Hi Barry,
Yes, We were used Gmail for reporting.
Is it make any sense??
-
<conspiracy-hat></conspiracy-hat>
Did either you or your client use gmail when you sent him the demo link?
Regardless, Dan's advice to noindex and block the directory from spiders is the future when doing development work.
-
Hi JoelHit,
NO, There is not any single refferal link to "Demo" directory from entire website and also from third party websites.
I am aware about Google Crawling and Indexing Systems.
Thanks.
-
Hi Thetjo,
I know about it.
My question is that how Google Crawl it without any referral link?
Thanks.
-
Hi Dan,
No, i am not exclude "demo" directory from robots.txt for any search engine.
I am not using wordpress its simple stattic HTML website (Not using any type of CMS).
-
Did this actually happen or are we talking about a hypothetical situation here? It could be that there is a link to the demo directory you've overlooked? Has the /demo folder perhaps been used in the past and there were still old links to it?
As a meta-solution to this problem: prevent crawlers and nosy people from accessing the content by adding a .htpasswd login to the area used for client approval.
-
Did you block the /demo/ directory in your robots.txt file? This is step number one to try and ensure they don't get crawled. Also, are you using wordpress? If so, wordpress automatically pings search engines when you add a post and if you use the common sitemap plugin, when it creates the sitemap it submits it automatically to Google, so that's another way Google could have found it.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Does redirecting a duplicate page NOT in Google‘s index pass link juice? (External links not showing in search console)
Hello! We have a powerful page that has been selected by Google as a duplicate page of another page on the site. The duplicate is not indexed by Google, and the referring domains pointing towards that page aren’t recognized by Google in the search console (when looking at the links report). My question is - if we 301 redirect the duplicate page towards the one that Google has selected as canonical, will the link juice be passed to the new page? Thanks!
Intermediate & Advanced SEO | | Lewald10 -
Why does Google display the home page rather than a page which is better optimised to answer the query?
I have a page which (I believe) is well optimised for a specific keyword (URL, title tag, meta description, H1, etc). yet Google chooses to display the home page instead of the page more suited to the search query. Why is Google doing this and what can I do to stop it?
Intermediate & Advanced SEO | | muzzmoz0 -
What can cause for a service page to rank in Google's Answer Box?
Hello Everyone, Have recently seen a Google result for "vps hosting" showing service page details in Answer Box. I would really like to know, what can cause a service page to appear in the Answer Box? Have attached a screenshot of result page. CaRiWtQUcAALn9n.png CaRiWtQUcAALn9n.png
Intermediate & Advanced SEO | | eukmark0 -
How many times will Google read a page?
Hello! Do you know if Google reads a page more than once? We want to include a very robust menu that has a lot of links, so we were thinking about coding a very simple page that loads first and immediately loading the other code that has all the links thinking that perhaps Google will only read the first version but won't read it the second time with all the links. Do you know if we will get penalized? I'm not sure if I got the idea across, let me know if I need to expand more. Thanks,
Intermediate & Advanced SEO | | alinaalvarez0 -
Organic Listings showing Google Tag Manager + Google Page Title...?
I'm a bit stumped with this. I optimise all my titles etc for Australia - and now the organic liatings are showing something strange. For example ( we sell health supplements ) Meta title = "My Product , Buy Online Australia" If I type "My Product" - the title in the organic listings says "My Product - My Company Limited" - and the only place I can see it getting that from is a combination of Meta Data used in Google Tag Manager + the Name on my Google places page. This is much more obvious for categories.. but it's a pain in the butt. If I type "My Product Australia" Then the original "My Product , Buy Online Australia" comes up. Any ideas on policy etc? I have taken the "Limited" off the Google business page - so hopefully this will change over time - but I can't find any information on why google would do something like this. If you had shed any light on this - would be much appreciated.
Intermediate & Advanced SEO | | s_EOgi_Bear0 -
Should we show(to google) different city pages on our website which look like home page as one page or different? If yes then how?
On our website, we show events from different cities. We have made different URL's for each city like www.townscript.com/mumbai, www.townscript.com/delhi. But the page of all the cities looks similar, only the events change on those different city pages. Even our home URL www.townscript.com, shows the visitor the city which he visited last time on our website(initially we show everyone Mumbai, visitor needs to choose his city then) For every page visit, we save the last visited page of a particular IP address and next time when he visits our website www.townscript.com, we show him that city only which he visited last time. Now, we feel as the content of home page, and city pages is similar. Should we show these pages as one page i.e. Townscript.com to Google? Can we do that by rel="canonical" ? Please help me! As I think all of these pages are competing with each other.
Intermediate & Advanced SEO | | sanchitmalik0 -
Link Removal Request Sent to Google, Bad Pages Gone from Index But Still Appear in Webmaster Tools
| On June 14th the number of indexed pages for our website on Google Webmaster tools increased from 676 to 851 pages. Our ranking and traffic have taken a big hit since then. The increase in indexed pages is linked to a design upgrade of our website. The upgrade was made June 6th. No new URLS were added. A few forms were changed, the sidebar and header were redesigned. Also, Google Tag Manager was added to the site. My SEO provider, a reputable firm endorsed by MOZ, believes the extra 175 pages indexed by Google, pages that do not offer much content, may be causing the ranking decline. My developer submitted a page removal request to Google via Webmaster tools around June 20th. Now when a Google search is done for site:www.nyc-officespace-leader.com 851 results display. Would these extra pages cause a drop in ranking? My developer issued a link removal request for these pages around June 20th and the number in the Google search results appeared to drop to 451 for a few days, now it is back up to 851. In Google Webmaster Tools it is still listed as 851 pages. My ranking drop more and more everyday. At the end of displayed Google Search Results for site:www.nyc-officespace-leader.comvery strange URSL are displaying like:www.nyc-officespace-leader.com/wp-content/plugins/... If we can get rid of these issues should ranking return to what it was before?I suspect this is an issue with sitemaps and Robot text. Are there any firms or coders who specialize in this? My developer has really dropped the ball. Thanks everyone!! Alan |
Intermediate & Advanced SEO | | Kingalan10 -
How are pages ranked when using Google's "site:" operator?
Hi, If you perform a Google search like site:seomoz.org, how are the pages displayed sorted/ranked? Thanks!
Intermediate & Advanced SEO | | anthematic0