How do the Quoras of this world index their content?
-
I am helping a client index a very large number of pages, more than one million. They are comparable to question pages on Quora. In Quora's case, users are usually looking for the answer to one specific question and nothing else.
On Quora there is a structure set up on the homepage to let the spiders in. But I think most of the work is done with a lot of sitemaps and relevance-based internal linking, and nothing else. Correct? Or am I missing something?
I am going to index about a million questions and answers, just like Quora. I am having a hard time structuring these questions in a way that isn't purely for the search engines, because no user cares how the questions are categorised. Users are interested in related questions and/or popular questions, so I want to structure the pages that way too.
This way every question page will be in the sitemap, but not every question will have links from other question pages. These questions are super long-tail, and the idea is that when somebody searches for the exact question we can supply the answer (the on-page content will be optimised for people searching that question). Competition is super low because it is all unique user-generated content.
I think the best approach is just to put them in sitemaps and use an internal linking algorithm to help the popular and related questions rank better. I could even make sure every question has at least one other page linking to it. Thoughts?
Moz, do you think that when publishing one million quality Q&A pages, this strategy is enough to get them indexed and ranking for the question searches? Or do I need to design a structure around them so that everything gets crawled and each question also receives at least one link from a "category" page?
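To make the "at least one other page linking to it" idea concrete, here is a rough Python sketch. It assumes the related-question links are available as a plain dict of page id to outbound link ids; the function names, the `max_links_per_page` cap, and the "pick the page with the fewest links" rule are all made up for illustration. A real system would pick the donor page by topical relevance and would need something faster than this quadratic loop at a million pages.

```python
def find_orphans(links):
    """Return the ids of pages that no other page links to."""
    linked = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source  # a self-link does not count as an inbound link
    }
    return sorted(set(links) - linked)


def add_fallback_links(links, max_links_per_page=10):
    """Give each orphan one inbound link from a page with spare link slots.

    Hypothetical rule: the donor is the page with the fewest outbound links.
    """
    result = {page: list(targets) for page, targets in links.items()}
    for orphan in find_orphans(result):
        donors = [
            page for page in result
            if page != orphan and len(result[page]) < max_links_per_page
        ]
        if not donors:
            continue  # no page has room; leave the orphan to the sitemap
        donor = min(donors, key=lambda page: (len(result[page]), page))
        result[donor].append(orphan)
    return result


# Tiny illustrative graph: q3 has no inbound links.
sample_links = {"q1": ["q2"], "q2": ["q1"], "q3": []}
patched_links = add_fallback_links(sample_links)
```

Running `find_orphans` on the patched graph is a cheap check that nothing is left reachable only through the sitemap.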
-
Wow, that is insane, right?
https://www.quora.com/sitemap/questions?page_id=50
I wonder how long this carries on.
-
Quora doesn't seem to have an XML sitemap, only an HTML one:
[https://www.quora.com/robots.txt](https://www.quora.com/robots.txt) refers to [https://www.quora.com/sitemap](https://www.quora.com/sitemap)
-
Yes, there are many challenges, and external linking is definitely one of them.
What do you think about using sitemaps to get this long tail indexed? I think a lot can be indexed just by submitting the sitemaps.
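On the sitemap side, the sitemaps.org protocol caps a single sitemap file at 50,000 URLs, so a million question pages have to be split across at least 20 files tied together by a sitemap index. A minimal Python sketch of that split follows; the `example.com` URLs and the file naming are placeholders, not anything from this thread, and the output is the bare minimum (`<loc>` only, no `<lastmod>`).

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit in the sitemaps.org protocol


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return the XML for one sitemap file listing the given page URLs."""
    entries = "".join(f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n{entries}</urlset>\n'
    )


def build_sitemap_index(sitemap_urls):
    """Return the XML for a sitemap index pointing at each sitemap file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n" for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}</sitemapindex>\n'
    )


# Placeholder data: 120,000 question URLs on a hypothetical domain.
question_urls = [f"https://example.com/q/{i}" for i in range(120_000)]
sitemap_files = [build_sitemap(part) for part in chunk(question_urls)]
index_xml = build_sitemap_index(
    f"https://example.com/sitemaps/questions-{n}.xml"
    for n in range(1, len(sitemap_files) + 1)
)
```

Submitting the index tells the search engine where the pages are; it does not by itself guarantee they get indexed or stay indexed.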
-
There are many challenges to building a really large site. Most of them are related to building the site itself, but one that often kills a site's success is getting the pages into the index and keeping them there. This requires a steady flow of spiders into the deepest pages of the site. If you don't have a continuous, repeated flow of spiders, the pages will be indexed but then forgotten before the spiders return.
An effective way to get deep spidering is to have powerful links permanently connected to many deep hub pages throughout the site. This produces a flow of spiders into the site and forces them to chew their way out, indexing pages as they go. These links must be powerful or the spiders will index a couple of pages and die. These links must be permanent because if they are removed the flow of spiders will stop and pages in the index will be forgotten.
The goal of the hub pages is to create spider webs through the site that allow spiders to index all of the pages on short link paths, rather than requiring the spiders to crawl through long tunnels of many consecutive links to get everything indexed.
Lots of people can build a big site, but only some of those people have the resources to get the powerful, permanent links that are required to get the pages indexed and keep them in the index. You can't rely on internal links alone for the powerful, permanent links because most spiders that enter any site come from external sources rather than spontaneously springing up deep in the bowels of your website.
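The "short link paths" point about hub pages can be checked on your own link graph before launch. The Python sketch below is only an illustration: it assumes the internal links are available as a dict of page to linked pages, and the page names are invented. It runs a breadth-first search from the pages that receive external links and reports how many clicks away every other page is.

```python
from collections import deque


def click_depths(links, entry_pages):
    """Breadth-first search: minimum number of clicks from any entry page."""
    depths = {page: 0 for page in entry_pages}
    queue = deque(entry_pages)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# A "tunnel": each page links only to the next one.
tunnel = {"home": ["q1"], "q1": ["q2"], "q2": ["q3"], "q3": ["q4"]}

# The same pages reached through one hub page instead.
hubbed = {"home": ["hub"], "hub": ["q1", "q2", "q3", "q4"]}

tunnel_depths = click_depths(tunnel, ["home"])
hubbed_depths = click_depths(hubbed, ["home"])
```

In the tunnel layout the last question sits four clicks from the entry page; with the hub it is two. Pages missing from the returned dict are not reachable by links at all.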