Indexation of content from internal pages (registration) by Google
-
Hello,
we are having quite a big amount of content on internal pages which can only be accessed as a registered member.
What are the different options the get this content indexed by Google?
In certain cases we might be able to show a preview to visitors. In other cases this is not possible for legal reasons.
Somebody told me that there is an option to send the content of pages directly to google for indexation. Unfortunately he couldn't give me more details. I only know that this possible for URLs (sitemap). Is there really a possibility to do this for the entire content of a page without giving google access to crawl this page?
Thanks
Ben
-
The issue is that Google won't and shouldn't index pages that are restricted.
This is best for user experience. Most people won't sign in to view the content.
You basically have to create two sites. One that is visible to all, and Google where you show or preview a bit. then the other that is protected.
-
Thanks, I will check wether this meets the legal requirements (see my reply to Brents answer).
-
As I mentioned we have 2 cases.
In the first case, we can show a preview.
In the second case we can only show the content to a certain audience (which is a legal question). So the free registration is a legal requirement. Still people will be looking for it via Google. Since the content found on those pages is useful for a fairly large audience so why wouldn't we want Google to index the pages. Of course without Google knowing that there is relevant content on those pages, they will neither index nor propperly rank those pages.
-
I found this information for you but you should definitely check that it doesn't break any of Google's guidelines before incorporating it to your website.
This is a simple code to allow bots to bypass the password on password protected pages
$allow_inside = ($is_logged_in) || substr_count($_SERVER['HTTP_USER_AGENT'],'Googlebot');
http://davidwalsh.name/google-password-protected-areas
The reference post is older, so this code could have been updated
-
Fetch as Google Bot is submit to index. This is why I believe it should work with it.
-
I guess my questions is why would you want Google to index something that is only available to registered users?
In order for it to be indexed, it has to be open to everyone.
You will have to figure out what can be shown as a preview and what can't. If you want something to be indexed, then you will have to create a separate section for your preview content (since Google won't index your protected content.)
-
Hi Istvan,
"The Fetch as Googlebot tool lets you see a page as Googlebot sees it."
Since Googlebot has no access to the entire site (login required) it will probably not display anything (just tried it logged in and it would not display any of the content). How could this tool theoretically help us indexing the content of the internal page?
Ben
-
Hi Ben,
Maybe fetch as Google Bot can be a solution to your issue. But not 100% sure of this.
Gr.,
Istvan
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Google does not index image sitemap
Hi, we put an image sitemap in the searchconsole/webmastertools http://www.sillasdepaseo.es/sillasdepaseo/sitemap-images.xml it contains only the indexed products and all images on the pages. We also claimed the CDN in the searchconsole http://media.sillasdepaseo.es/ It has been 2 weeks now, Google indexes the pages, but not the images. What can we do? Thanks in advance. Dieter Lang
Intermediate & Advanced SEO | | Storesco0 -
Old pages STILL indexed...
Our new website has been live for around 3 months and the URL structure has completely changed. We weren't able to dynamically create 301 redirects for over 5,000 of our products because of how different the URL's were so we've been redirecting them as and when. 3 months on and we're still getting hundreds of 404 errors daily in our Webmaster Tools account. I've checked the server logs and it looks like Bing Bot still seems to want to crawl our old /product/ URL's. Also, if I perform a "site:example.co.uk/product" on Google or Bing - lots of results are still returned, indicating the both still haven't dropped them from their index. Should I ignore the 404 errors and continue to wait for them to drop off or should I just block /product/ in my robots.txt? After 3 months I'd have thought they'd have naturally dropped off by now! I'm half-debating this: User-agent: *
Intermediate & Advanced SEO | | LiamMcArthur
Disallow: /some-directory-for-all/* User-agent: Bingbot
User-agent: MSNBot
Disallow: /product/ Sitemap: http://www.example.co.uk/sitemap.xml0 -
Pages are Indexed but not Cached by Google. Why?
Here's an example: I get a 404 error for this: http://webcache.googleusercontent.com/search?q=cache:http://www.qjamba.com/restaurants-coupons/ferguson/mo/all But a search for qjamba restaurant coupons gives a clear result as does this: site:http://www.qjamba.com/restaurants-coupons/ferguson/mo/all What is going on? How can this page be indexed but not in the Google cache? I should make clear that the page is not showing up with any kind of error in webmaster tools, and Google has been crawling pages just fine. This particular page was fetched by Google yesterday with no problems, and even crawled again twice today by Google Yet, no cache.
Intermediate & Advanced SEO | | friendoffood2 -
New Web Page Not Indexed
Quick question with probably a straightforward answer... We created a new page on our site 4 days ago, it was in fact a mini-site page though I don't think that makes a difference... To date, the page is not indexed and when I use 'Fetch as Google' in WT I get a 'Not Found' fetch status... I have also used the'Submit URL' in WT which seemed to work ok... We have even resorted to 'pinging' using Pinglar and Ping-O-Matic though we have done this cautiously! I know social media is probably the answer but we have been trying to hold back on that tactic as the page relates to a product that hasn't quite launched yet and we do not want to cause any issues with the vendor! That said, I think we might have to look at sharing the page socially unless anyone has any other ideas? Many thanks Andy
Intermediate & Advanced SEO | | TomKing0 -
Duplicate content within sections of a page but not full page duplicate content
Hi, I am working on a website redesign and the client offers several services and within those services some elements of the services crossover with one another. For example, they offer a service called Modelling and when you click onto that page several elements that build up that service are featured, so in this case 'mentoring'. Now mentoring is common to other services therefore will feature on other service pages. The page will feature a mixture of unique content to that service and small sections of duplicate content and I'm not sure how to treat this. One thing we have come up with is take the user through to a unique page to host all the content however some features do not warrant a page being created for this. Another idea is to have the feature pop up with inline content. Any thoughts/experience on this would be much appreciated.
Intermediate & Advanced SEO | | J_Sinclair0 -
Google + pages and SEO results...
Hi, Can anyone give me insight into how people are getting away with naming their business by the SEO search term, creating a BS Google + page, then having that page rank high in the search results. I am speaking specifically about the results you get when you Google: "Los Angeles DUI Lawyer". As you can see from my attached screenshot (I'm doing the search in Los Angeles), the FIRST listing is a Google + business. Strangely, the phone number listed doesn't actually take you to a DUI attorney, but rather to some marketing group that never answers the phone. Can anyone give me insight into why Google even allows this? I just find it odd that Google cares so much about the user experience, but have the first result be something completely misleading. I know it sounds like I'm just jealous (which I am, a little), but I find it disheartening that we work so hard on SEO, and someone takes the top spot with an obvious BS page. UupqBU9
Intermediate & Advanced SEO | | mrodriguez14400 -
Duplicate Content/ Indexing Question
I have a real estate Wordpress site that uses an IDX provider to add real estate listings to my site. A new page is created as a new property comes to market and then the page is deleted when the property is sold. I like the functionality of the service but it creates a significant amount of 404's and I'm also concerned about duplicate content because anyone else using the same service here in Las Vegas will have 1000's of the exact same property pages that I do. Any thoughts on this and is there a way that I can have the search engines only index the core 20 pages of my site and ignore future property pages? Your advice is greatly appreciated. See link for example http://www.mylvcondosales.com/mandarin-las-vegas/
Intermediate & Advanced SEO | | AnthonyLasVegas0 -
How long until Sitemap pages index
I recently submitted an XML sitemap on Webmaster tools: http://www.uncommongoods.com/sitemap.xml Once Webmaster tools downloads it, how long do you typically have to wait until the pages index ?
Intermediate & Advanced SEO | | znotes0