How can Google index a page that it can't crawl completely?
-
I recently posted a question regarding a product page that appeared to have no content. [http://www.seomoz.org/q/why-is-ose-showing-now-data-for-this-url]
What puzzles me is that this page got indexed anyway. Was it indexed based on Google knowing that there was once content on the page? Was it indexed based on the trust level of our root domain?
What are your thoughts? I'm asking not only because I don't know the answer, but because I know the argument is going to be made that if Google indexed the page then it must have been crawlable...therefore we didn't really have a crawlability problem.
Why Google index a page it can't crawl?
-
Yep. If you had links to that page from other authority pages, the pagerank/authority would transfer over, even with the indexing issue.
-
Awesome explanation Oleg. We had some other product pages (128) to be exact, that fell victim to the same coding error. I found it interesting that not only were most of them indexed, some of them actually had PageAuthority and or PageRank.
I am thinking Google may have allocated authority to some of these product pages because they had decent link profiles, despite Googlebot not being able to access the whole page. Is that possible?
-
It has crawled and indexed the page - check out the cached copy.
If you view the source, you can see that there is some HTML code but it seems to get cut off prematurely (perhaps due to a coding error). But that HTML code was enough to get the page indexed, but I would be suprised to see if it ranks for any terms. i.e. a search for the pages title does not return the correct url - "Shure SLX24/SM58 | Wireless Microphone System - CCI Solutions"
So G recognizes a page is there but see's think's it is blank - which is why it is indexed but won't rank for anything.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
I'm looking for a bulk way to take off from the Google search results over 600 old and inexisting pages?
When I search on Google site:alexanders.co.nz still showing over 900 results. There are over 600 inexisting pages and the 404/410 errrors aren't not working. The only way that I can think to do that is doing manually on search console using the "Removing URLs" tool but is going to take ages. Any idea how I can take down all those zombie pages from the search results?
Intermediate & Advanced SEO | | Alexanders1 -
Prioritise a page in Google/why is a well-optimised page not ranking
Hello I'm new to Moz Forums and was wondering if anyone out there could help with a query. My client has an ecommerce site selling a range of pet products, most of which have multiple items in the range for difference size animals i.e. [Product name] for small dog
Intermediate & Advanced SEO | | LauraSorrelle
[Product name] for medium dog
[Product name] for large dog
[Product name] for extra large dog I've got some really great rankings (top 3) for many keyword searches such as
'[product name] for dogs'
'[product name]' But these rankings are for individual product pages, meaning the user is taken to a small dog product page when they might have a large dog or visa versa. I felt it would be better for the users (and for conversions and bounce rates), if there was a group page which showed all products in the range which I could target keywords '[product name]', '[product name] for dogs'. The page would link through the the individual product pages. I created some group pages in autumn last year to trial this and, although they are well-optimised (score of 98 on Moz's optimisation tool), they are not ranking well. They are indexed, but way down the SERPs. The same group page format has been used for the PPC campaign and the difference to the retention/conversion of visitors is significant. Why are my group pages not ranking? Is it because my client's site already has good rankings for the target term and Google does not want to show another page of the site and muddy results?
Is there a way to prioritise the group page in Google's eyes? Or bring it to Google's attention? Any suggestions/advice welcome. Thanks in advance Laura0 -
New website won't rank for branded keywords in Google, but does in Bing
We launched a website in October www.butterfly.com. The branded product name "Butterfly Body Liners" will not rank until page 2 of Google, but it ranks #1 in Bing. Organic traffic never really picked up so it's not easy to tell if it's been "hit" by any penalty. The strange thing is, this website: http://archive.is/PQZdO is ranking #1. This is an archived version of the site. Does anyone have any insight as to why this is happening?
Intermediate & Advanced SEO | | LaughlinConstable0 -
Best way for Google and Bing not to crawl my /en default english pages
Hi Guys, I just transferred my old site to a new one and now have sub folder TLD's. My default pages from the front end and sitemap don't show /en after www.mysite.com. The only translation i have is in spanish where Google will crawl www.mysite.com/es (spanish). 1. On the SERPS of Google and Bing, every url that is crawled, shows the extra "/en" in my TLD. I find that very weird considering there is no physical /en in my urls. When i select the link it automatically redirects to it's default and natural page (no /en). All canonical tags do not show /en either, ONLY the SERPS. Should robots.txt be updated to "disallow /en"? 2. While i did a site transfer, we have altered some of the category url's in our domain. So we've had a lot of 301 redirects, but while searching specific keywords in the SERPS, the #1 ranked url shows up as our old url that redirects to a 404 page, and our newly created url shows up as #2 that goes to the correct page. Is there anyway to tell Google to stop showing our old url's in the SERP's? And would the "Fetch as Google" option in GWT be a great option to upload all of my url's so Google bots can crawl the right pages only? Direct Message me if you want real examples. THank you so much!
Intermediate & Advanced SEO | | Shawn1240 -
Urgent Site Migration Help: 301 redirect from legacy to new if legacy pages are NOT indexed but have links and domain/page authority of 50+?
Sorry for the long title, but that's the whole question. Notes: New site is on same domain but URLs will change because URL structure was horrible Old site has awful SEO. Like real bad. Canonical tags point to dev. subdomain (which is still accessible and has robots.txt, so the end result is old site IS NOT INDEXED by Google) Old site has links and domain/page authority north of 50. I suspect some shady links but there have to be good links as well My guess is that since that are likely incoming links that are legitimate, I should still attempt to use 301s to the versions of the pages on the new site (note: the content on the new site will be different, but in general it'll be about the same thing as the old page, just much improved and more relevant). So yeah, I guess that's it. Even thought the old site's pages are not indexed, if the new site is set up properly, the 301s won't pass along the 'non-indexed' status, correct? Thanks in advance for any quick answers!
Intermediate & Advanced SEO | | JDMcNamara0 -
How Long Does it Take for Rel Canonical to De-Index / Re-Index a Page?
Hi Mozzers, We have 2 e-commerce websites, Website A and Website B, sharing thousands of pages with duplicate product descriptions. Currently only the product pages on Website B are indexing, and we want Website A indexed instead. We added the rel canonical tag on each of Website B's product pages with a link towards the matching product on Page A. How long until Website B gets de-indexed and Website A gets indexed instead? Did we add the rel canonical tag correctly? Thanks!
Intermediate & Advanced SEO | | Travis-W0 -
Robots.txt file - How to block thosands of pages when you don't have a folder path
Hello.
Intermediate & Advanced SEO | | Unity
Just wondering if anyone has come across this and can tell me if it worked or not. Goal:
To block review pages Challenge:
The URLs aren't constructed using folders, they look like this:
www.website.com/default.aspx?z=review&PG1234
www.website.com/default.aspx?z=review&PG1235
www.website.com/default.aspx?z=review&PG1236 So the first part of the URL is the same (i.e. /default.aspx?z=review) and the unique part comes immediately after - so not as a folder. Looking at Google recommendations they show examples for ways to block 'folder directories' and 'individual pages' only. Question:
If I add the following to the Robots.txt file will it block all review pages? User-agent: *
Disallow: /default.aspx?z=review Much thanks,
Davinia0 -
How can I change my website's content on specific pages without affecting ranking for specific keywords?
My client's website (www.nursevillage.com) content has not been touched for 4 years and we are currently ranking #1 for "per diem nursing". They do not want to make any changes to the site in fear that it might decrease our rankings. We want to try to use utilize that keyword ranking on specific pages (www.nursevillage.com/nv/content/careeroptions/perdiem.jsp ) ranking for "per diem nursing" and try redirecting traffic or placing some banners and links on that page to specific pages or other sites related to "per diem nursing" jobs so we can get nurses to apply to our new nursing jobs. Any advice on why "per diem nursing" is ranking so high for us and what we can change on the site without messing up our ranking would be greatly appreciated. Thanks
Intermediate & Advanced SEO | | ryanperea1000