Indexing a new website with several million pages
-
Hello everyone,
I am currently working for a huge classifieds website that will launch in France in September 2013.
The website will have up to 10 million pages. I know that a website of this size should be indexed step by step rather than all at once, to avoid the risk of a long sandbox period and to keep more control over the process.
Do you have any recommendations or good practices for such a task? Maybe some personal experience you could share?
The website will cover about 300 jobs:
- In all regions (= 300 * 22 pages)
- In all departments (= 300 * 101 pages)
- In all cities (= 300 * 37,000 pages)
Do you think it would be wiser to index a few jobs at a time (for instance, 10 jobs every week) or to index by page level (for example, a first step with jobs by region, a second step with jobs by department, etc.)?
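To put numbers on the two options, here is a rough sketch in Python that uses only the round figures given above (300 jobs, 22 regions, 101 departments, about 37,000 cities). The variable names are illustrative, and with these round figures the total comes out a little above 10 million.

```python
# Page counts per geographic level, using the round figures from the question.
JOBS = 300
LEVELS = {"region": 22, "department": 101, "city": 37_000}

pages_per_level = {level: JOBS * count for level, count in LEVELS.items()}
total_pages = sum(pages_per_level.values())

# Option A: release a fixed batch of jobs each week, across every level.
JOBS_PER_WEEK = 10                               # example batch size from the question
pages_per_job = sum(LEVELS.values())             # one page per locality for each job
weeks_needed = -(-JOBS // JOBS_PER_WEEK)         # ceiling division
pages_per_week = JOBS_PER_WEEK * pages_per_job

print(f"Total pages: {total_pages:,}")
print(f"Option A: {pages_per_week:,} pages per week for {weeks_needed} weeks")

# Option B: release one level at a time, smallest level first.
for step, (level, count) in enumerate(pages_per_level.items(), start=1):
    print(f"Option B, step {step}: {count:,} {level} pages")
```

Either way, the city level accounts for almost all of the pages, so that is the level where pacing matters most.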
More generally, how would you go about it in order to avoid penalties from Google and get the whole site indexed as fast as possible?
One more detail: we'll rely on (hopefully substantial) press coverage and on a link-building effort that has yet to be defined.
Thanks for your help!
Best Regards,
Raphael
-
Hello everyone,
Thanks for sharing your experience and your answers; it's greatly appreciated.
The website is built to avoid cookie-cutter pages: each page will have unique content drawn from the classifieds (unique because the classifieds themselves won't be indexed at first, to avoid having too many pages).
The internal linking is also designed so that each page has permanent internal links arranged in a logical way.
I understand from your answers that it is better to take our time and index the site step by step, mostly according to the number and quality of classifieds (and thus the content) for each job/locality. It's not worth indexing pages without any classifieds (and thus without unique content), as Google will drop them from the index in the near future.
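One way to apply that rule is to decide per page whether it may be indexed, based on how many classifieds it currently shows. The sketch below is only an illustration: the function name and the threshold are hypothetical and would need tuning, while the robots meta tag values themselves are the standard ones.

```python
# Hypothetical threshold: how many classifieds a job/locality page needs
# before it is allowed into the index. Tune this for the real site.
MIN_CLASSIFIEDS = 3


def robots_meta(classified_count: int, min_classifieds: int = MIN_CLASSIFIEDS) -> str:
    """Return the robots meta tag for a job/locality page.

    Pages with too few classifieds stay out of the index, but their links
    are still followed so crawlers can reach the rest of the site.
    """
    if classified_count >= min_classifieds:
        return '<meta name="robots" content="index, follow">'
    return '<meta name="robots" content="noindex, follow">'


print(robots_meta(0))
print(robots_meta(12))
```

Because the tag is computed from the live classified count, a page moves into the index on its own once it has enough content.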
-
I really don't think Google likes it when you release a website that big. It would much rather you build it slowly. I would urge you to have main pages and noindex the sub categories.
-
We worked in partnership with a similar large-scale site last year and found exactly the same thing. Google simply cut 60% of our pages from the index because they were cookie-cutter.
You have to ensure that pages have relevant, unique and worthwhile content. If all you're doing is replacing the odd word here and there for the locality and job name, it's not going to work.
Focus on having an ongoing SEO campaign for each target audience, for example by job type, locality, etc.
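A quick way to spot pages that only differ by the locality and job name is to mask those two values and compare what is left. This is a minimal sketch using Python's standard difflib module; the helper names and sample texts are made up for illustration, and it says nothing about how Google itself measures duplication.

```python
from difflib import SequenceMatcher


def mask(text: str, job: str, locality: str) -> str:
    """Replace the job and locality names with placeholders."""
    return text.replace(job, "{job}").replace(locality, "{locality}")


def template_similarity(page_a: tuple, page_b: tuple) -> float:
    """Compare two pages given as (text, job, locality) tuples.

    Returns a ratio between 0 and 1. A value close to 1 means the pages
    are the same template with only the job and locality swapped.
    """
    masked_a = mask(*page_a)
    masked_b = mask(*page_b)
    return SequenceMatcher(None, masked_a, masked_b, autojunk=False).ratio()


paris = ("Find a plumber in Paris today.", "plumber", "Paris")
lyon = ("Find a baker in Lyon today.", "baker", "Lyon")
rich = ("Twelve bakers in Lyon are hiring this week, mostly for night shifts.", "baker", "Lyon")

print(template_similarity(paris, lyon))  # identical once masked
print(template_similarity(paris, rich))  # noticeably lower
```

Running this over a sample of generated pages before launch gives a rough idea of how many would look like the same page to a reader.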
-
If you plan to get a website that big indexed you will need to have a few things in order...
First, you will need thousands of deep links that connect to hub pages deep within the site. These will force spiders down there and make them chew their way out through the unindexed pages. These must be permanent links. If you remove them, spiders will stop visiting and Google will forget your pages. For a 10 million page site you will need thousands of links hitting thousands of hub pages.
Second, for a site this big.... are you going to have substantive amounts of unique content? If your pages are made from a cookie cutter and look like this....
"yada yada yada yada yada yada yada yada SEO job in Paris yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada send application to Joseph Blowe, 11 Anystreet, Paris, France yada yada yada yada yada yada yada yadayada yada yada yada yada yada yada yada yada yada yada yada yada yada yada yada"
.... then Google will index these pages, and a few weeks to a few months later your entire site might receive a Panda penalty and drop from Google.
Finally... all of those links needed to get the site in the index... they need to be Penguin proof.
It is not easy to get a big site in the index. Google is tired of big cookie cutter sites with no information or yada yada content. They are quickly toasted these days.