If a page isn't linked to or directly submitted to a search engine, can it get indexed?
-
Hey Guys,
I'm curious if there are ways a page can get indexed even if the page isn't linked to or hasn't been submitted to a search engine.
To my knowledge the following page on our website is not linked to and we definitely didn't submit it to Google - but it's currently indexed:
<cite>takelessons.com/admin.php/adminJobPosition/corp</cite>
Anyone have any ideas as to why or how this could have happened? Hopefully I'm missing something obvious
Thanks,
Jon
-
You're welcome Jon.
That's a good question. I don't know the official answer on that one, though I suspect that Google does check to see if the page exists, mainly because people often type a valid URL into the search box instead of the address bar. If Google doesn't have that page in its index, it would at least like to consider adding it.
http://www.google.com/addurl.html is the URL adding page for Google as you'll already know. As well as Google relying on people to use this form, they will also, I suspect, crawl URLs that are entered into a search box. Makes sense to me that Google would at least visit these pages searched on, though can't be sure.
Regards
Simon
-
Thanks for the responses guys!
Are either of you aware of whether or not Google would ever index a page if someone searched for it? For example, say someone did a Google search for the URL I specified above. Would Google ever get curious and then try and crawl it?
Thanks again for taking the time to help out
-
Hey there Jon.
It seems that it was last indexed almost a month ago on Oct. 21, 2011. I would suggest that you follow Simon's advice on the NoFollow tag.
Additionally, check your GWT (Google Webmaster Tools) to see whether it is still indexed. If it is, then try resubmitting it (I know, I know, it doesn't make sense), since that will send out a message to crawl it again. NoFollow, NoIndex tells the spider NOT HERE. Anywho, good luck with that and let us know how it turned out!
Cheers!
P.S. At least it's not indexed in Bing
-
Hi Jon
This is a strange one. I haven't found a link to your admin page either, though there could have been one at some point in the past. Google's bot is rather clever at finding pages on the so-called 'invisible web', so it's best to request non-indexing of pages that you don't want indexed (covered below).
You're right in thinking that search bots follow links to find and index pages, whether it be an external or internal site link or a sitemap link. They also find pages through actual submissions.
- I'd suggest modifying the robots meta tag on your admin page to include a 'NoIndex' just before the 'NoFollow'.
- Also include a Disallow rule in your robots.txt file for this admin page, and perhaps for all pages within the admin section.
- Then, request the URL be removed from Google's index via Google Webmaster Tools ("Site configuration", "Crawler access", "Remove URL").
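For anyone who wants to double-check the first two steps, here is a rough Python sketch using only the standard library. It reads the directives out of a page's robots meta tag and tests a URL against a set of robots.txt rules. The sample robots.txt, the sample HTML, and the helper names (`meta_robots_directives`, `is_crawl_allowed`) are made up for illustration and are not taken from the actual site. One thing worth keeping in mind: a crawler that is blocked by a Disallow rule generally cannot fetch the page, so it will not get to read a NoIndex tag on it. It's worth confirming the page has dropped out of the index before relying on the Disallow alone.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def meta_robots_directives(html):
    """Return the set of lowercase robots directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_crawl_allowed(robots_txt, url, user_agent="Googlebot"):
    """Return True if the robots.txt text lets user_agent fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


# Hypothetical inputs, for illustration only.
SAMPLE_ROBOTS_TXT = """User-agent: *
Disallow: /admin.php/
"""
SAMPLE_HTML = (
    '<html><head><meta name="robots" content="noindex, nofollow">'
    "</head><body>Admin</body></html>"
)

if __name__ == "__main__":
    admin_url = "http://takelessons.com/admin.php/adminJobPosition/corp"
    print("Meta directives:", sorted(meta_robots_directives(SAMPLE_HTML)))
    print("Crawl allowed:", is_crawl_allowed(SAMPLE_ROBOTS_TXT, admin_url))
```

To check a live page you would fetch the real robots.txt and page HTML yourself and pass the text into the same two helpers.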
Hope that helps,
Regards
Simon