How can Google index a page that it can't crawl completely?
-
I recently posted a question regarding a product page that appeared to have no content. [http://www.seomoz.org/q/why-is-ose-showing-now-data-for-this-url]
What puzzles me is that this page got indexed anyway. Was it indexed based on Google knowing that there was once content on the page? Was it indexed based on the trust level of our root domain?
What are your thoughts? I'm asking not only because I don't know the answer, but because I know the argument is going to be made that if Google indexed the page then it must have been crawlable, and therefore we didn't really have a crawlability problem.
Why would Google index a page it can't crawl?
-
Yep. If you had links to that page from other authoritative pages, the PageRank/authority would transfer over, even with the indexing issue.
-
Awesome explanation, Oleg. We had some other product pages (128, to be exact) that fell victim to the same coding error. I found it interesting that not only were most of them indexed, but some of them actually had Page Authority and/or PageRank.
I am thinking Google may have allocated authority to some of these product pages because they had decent link profiles, despite Googlebot not being able to access the whole page. Is that possible?
-
It has crawled and indexed the page - check out the cached copy.
If you view the source, you can see that there is some HTML code, but it seems to get cut off prematurely (perhaps due to a coding error). That HTML was enough to get the page indexed, but I would be surprised if it ranks for any terms. For example, a search for the page's title does not return the correct URL: "Shure SLX24/SM58 | Wireless Microphone System - CCI Solutions"
So Google recognizes that a page is there but thinks it is blank, which is why it is indexed but won't rank for anything.
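If you want to check a batch of pages for this kind of cut-off markup, here is a rough sketch using only Python's standard library. It is a heuristic, not a statement of how Googlebot evaluates a page: it simply flags source that never closes `<body>` or `<html>`. The names `EndTagCollector` and `looks_truncated` and the two sample strings are made up for illustration. Keep in mind that valid HTML may legally omit those closing tags, so treat a hit as a prompt to look at the page, not as proof of a problem.

```python
from html.parser import HTMLParser


class EndTagCollector(HTMLParser):
    """Records the name of every closing tag seen in the document."""

    def __init__(self):
        super().__init__()
        self.end_tags = set()

    def handle_endtag(self, tag):
        self.end_tags.add(tag)


def looks_truncated(html_source: str) -> bool:
    """Return True if the markup never closes both <body> and <html>.

    Heuristic only: these closing tags are optional in valid HTML.
    """
    parser = EndTagCollector()
    parser.feed(html_source)
    parser.close()
    return not {"body", "html"} <= parser.end_tags


# Hypothetical sample documents, not the real page source.
complete = "<html><head><title>T</title></head><body><p>Hi</p></body></html>"
cut_off = "<html><head><title>T</title></head><body><div><h1>Shure"

print(looks_truncated(complete))
print(looks_truncated(cut_off))
```

You would feed it the raw source of each product URL (fetched however you normally do) and review any page it flags by hand.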