Getting More Pages Indexed
-
We have a large E-commerce site (magento based) and have submitted sitemap files for several million pages within Webmaster tools. The number of indexed pages seems to fluctuate, but currently there is less than 300,000 pages indexed out of 4 million submitted. How can we get the number of indexed pages to be higher? Changing the settings on the crawl rate and resubmitting site maps doesn't seem to have an effect on the number of pages indexed.
Am I correct in assuming that most individual product pages just don't carry enough link juice to be considered important enough yet by Google to be indexed? Let me know if there are any suggestions or tips for getting more pages indexed.
-
I think that is what did it! lol
-
Yes, you will need internal links to establish your site navigation. Then, external links if you don't have enough PR flow from within your site.
Some powerful sites can support these millions of pages with internal links. If you have a site like that congratulations!
-
Thanks this is helpful. I will work on funneling spiders. I assume I'll need a healthy dose of both internal and external links pointing deep into the site in order to get the spider to start chewing in there? Thanks.
-
Thanks! You enjoyed the chewing spiders?
-
Wow...that was a powerful answer EGOL. Thanks for adding the long answer. The way you worded it created a visualization for me that was very helpful to my understanding as a novice. Thanks.
-
The short answer...
Link deep into the site at multiple points with heavy PR.
The long answer...
If you have a really big site you need a lot of linkjuice to get the entire site indexed and keep it in the index. You also need a good site structure so that spiders can crawl through the site and find every page.
If you have several million pages, my guess is that you will need hundreds of links of at least PR5 or PR6 linking into the site. I would direct each of those links to a deep category page. That will funnel the spiders deep into your site and force them to chew their way out while indexing your pages.
All of those links must be held permanently in place. Because if you pull the links the flow of spiders will stop and google will slowly forget about pages that are not visited by spiders on a regular basis.
If you have weak links or not enough links your site will not be thoroughly crawled and google will forget about your pages as fast as they are discovered.
Big sites require a PR resource.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Product pages getting no internal links in Magento
Hello I think i have a serious problem. Most of my products are not getting internal links.
Technical SEO | | macrovet
I discoverd this when i was running a Crawl Test Tool Report | Moz Here an example of one product.
This product can be navigate to a normal way true the navigation structure on my website. The navigation is http://www.macrovet.nl/scheermachine/scheerapparaat-paard-paardenscheermachine.html
On this page is the product URL: http://www.macrovet.nl/aesculap-econom-equipe-gt674.html
Time Crawled 2014
Title tag: Aesculap Econom Equipe GT674 | Macrovet.nl
Meta Description: Bekijk en bestel een Aesculap Econom Equipe GT674 paardenscheermachine voor de scherpste prijs Macrovet.nl
HTTP Status Code: 200
Referrer http://www.macrovet.nl/sitemap.xml
Link Count: 550
Content-Type Header: text/html; charset=UTF-8
4XX (Client Error): NO
5XX (Server Error): NO
Title Missing or Empty: No
Duplicate Page Content: NO
URLs with Duplicate Page Content (up to 5)
Duplicate Page Title:No
Long URL NO
Overly-Dynamic URL NO
301 (Permanent Redirect) NO
302 (Temporary Redirect) NO
301/302 Target
Meta Refresh NO
Meta Refresh Target
Title Element Too Short NO
Title Element Too Long No
Too Many On-Page Links YES
Missing Meta Description Tag No
Search Engine blocked by robots.txt No
Meta-robots Nofollow No
Meta Robots Tag INDEX,FOLLOW
Rel Canonical Yes
Rel-Canonical Target http://www.macrovet.nl/aesculap-econom-equipe-gt674.html
Blocking All User Agents No
Blocking Google No
Internal Links 0
Linking Root Domains 0
External Links 0
Page Authority 1 Domain Autority 30 Do you have an answer what is wrong, thanks for your answers Regards,
Willem-Johan0 -
41.000 pages indexed two years after it was redirected to a new domain
Hi!Two years ago, we changed the domain elmundodportivo.es to mundodeportivo.com. Apparently, everything was OK, but more than two years later, there are still 41.000 pages indexed in Google (https://www.google.com/search?q=site%3Aelmundodeportivo.es) even though all the domains have been redirected with a 301 redirect. I detected some problems with redirections that were 303 instead of 301, but we fixed that one month ago.A secondary problem is that the pagerank for elmundodportivo.es is 7 yet and mundodeportivo.com is 3.What I'm doing wrong?Thank you all,Oriol
Technical SEO | | MundoDeportivo0 -
Home page indexed but not ranking...interior pages with thin content outrank home page??
I have a Joomla site with a home page that I can't get to rank for anything beyond the company name @ Google - the site works fine @ Bing and Yahoo. The interior pages will rank all day long but the home page never shows up in the results. I have checked the page code out in every tool that I know about and have had no luck....by all account it should be good to go...any thoughts/comments/help would be greatly appreciated. The site is http://www.selectivedesigns.com Thanks! Greg
Technical SEO | | DougHosmer0 -
WordPress - How to stop both http:// and https:// pages being indexed?
Just published a static page 2 days ago on WordPress site but noticed that Google has indexed both http:// and https:// url's. Usually I only get http:// indexed though. Could anyone please explain why this may have happened and how I can fix? Thanks!
Technical SEO | | Clicksjim1 -
Testimonial pages
Is it better to have one long testimonial page on your site, or break it down into several smaller pages with testimonials? First time I've posted on the forum. But I'm excited! Ron
Technical SEO | | yatesandcojewelers0 -
Page not being indexed
Hi all, On our site we have a lot of bookmaker reviews, and we are ranking pretty good for most bookmaker names as keywords, however a single bookmaker seems to have been shunned by Google. For a search "betsafe" in Denmark, this page does not appear among the top 50: http://www.betxpert.com/bookmakere/betsafe All of our other review pages rank in top 10-20 for the bookmaker name as keyword. What to do if Google has "banned" a page? Best regards, Rasmus
Technical SEO | | rasmusbang0 -
Why am i still getting duplicate page title warnings after implementing canonical URLS?
Hi there, i'm having some trouble understanding why I'm still getting duplicate page title warnings on pages that have the rel=canonical attribute. For example: this page is the relative url http://www.resnet.us/directory/auditor/az/89/home-energy-raters-hers-raters/1 and http://www.resnet.us/directory/auditor/az/89/home-energy-raters-hers-raters/2 is the second page of this parsed list which is linking back to the first page using rel=canonical. i have over 300 pages like this!! what should i do SEOmoz GURUS? how do i remedy this problem? is it a problem?
Technical SEO | | fourthdimensioninc0