I have 15,000 pages. How do I have the Google bot crawl all the pages?
-
I have 15,000 pages. How do I get Googlebot to crawl all of them? My site is 7 years old, but only about 3,500 pages are being crawled.
-
Can you tell us the URL of the site in question? That can help us to help you, because we can look at the site and maybe spot something like an improper robots.txt or site architecture that makes it hard for a robot to crawl.
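For anyone who wants to rule out the robots.txt possibility before posting a URL, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URLs are made up for illustration, and the `blocked_urls` helper is a hypothetical name, not part of any tool mentioned in this thread. Note that the standard-library parser does simple prefix matching and may not mirror every rule Googlebot applies, so treat the result as a first check only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that accidentally blocks a whole section of the site.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /products/
"""

if __name__ == "__main__":
    sample = [
        "https://www.example.com/",
        "https://www.example.com/products/widget-1",
    ]
    for url in blocked_urls(EXAMPLE_ROBOTS, sample):
        print("blocked:", url)
```

Running a sample of the pages that are missing from the index through a check like this shows quickly whether a stray Disallow rule is the cause.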
-
This one is interesting. I work with a currency exchange/transfer site where I have 20,000+ pages in English alone. What I did is pretty basic, but it worked. I made one sitemap for all the main pages: service pages, the homepage, and pages that won't change until the next redesign. I made a second XML sitemap file for my first set of money transfer pairs, grouped country to country. My third and largest XML file lists 16,512 currency combinations, each combination being its own web page. For my English version, 16,058 of them are indexed. The pages are quite similar in content and in function, and they differ by 4 variables, which seems to do the trick. For other language versions, my success ranges from 3,000 to 12,000 indexed pages from this particular sitemap.
I guess it depends on the market you are targeting. If the pages have similar content, it won't hurt to alter them so they provide custom information where possible.
Hope that helps!
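The grouping described above could be sketched roughly like this in Python. This is not the poster's actual setup: the group names, file names, and example.com URLs are assumptions for illustration. The 50,000-URL cap per file comes from the sitemaps.org protocol; the sketch splits any group that exceeds it and writes a sitemap index that points at every file.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # per-file limit in the sitemaps.org protocol


def build_urlset(urls):
    """Serialize one sitemap file (a <urlset>) for the given page URLs."""
    root = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(root, "url"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_index(sitemap_urls):
    """Serialize a <sitemapindex> that lists the given sitemap file URLs."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        ET.SubElement(ET.SubElement(root, "sitemap"), "loc").text = url
    return ET.tostring(root, encoding="unicode")


def build_sitemaps(groups, base_url, max_urls=MAX_URLS_PER_FILE):
    """Map file name -> XML text: one or more files per group, plus an index."""
    files = {}
    for name, urls in groups.items():
        for start in range(0, len(urls), max_urls):
            part = start // max_urls + 1
            files[f"sitemap-{name}-{part}.xml"] = build_urlset(
                urls[start:start + max_urls]
            )
    # The index is built from the sitemap files collected so far.
    files["sitemap-index.xml"] = build_index(
        [f"{base_url}/{filename}" for filename in files]
    )
    return files


if __name__ == "__main__":
    # Hypothetical groups mirroring the split described above.
    groups = {
        "main": ["https://www.example.com/", "https://www.example.com/services"],
        "currency-pairs": [
            f"https://www.example.com/convert/pair-{i}" for i in range(5)
        ],
    }
    for filename in build_sitemaps(groups, "https://www.example.com"):
        print(filename)
```

Keeping each page type in its own file has a practical benefit: the indexed count reported per sitemap shows which group of pages is being skipped.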
-
I did set up an XML sitemap and submit it via Google Webmaster Tools. It did not help. Is it because my PR is 2?
-
Google will only index your pages if it deems them "worthy". However, you can certainly give Googlebot some encouragement. A good way to do this is to set up an XML sitemap and submit it via Google Webmaster Tools.
Related Questions
-
Google's ability to crawl AJAX rendered content
I would like to change the way our main navigation is rendered on our e-commerce site. Currently, all of the content that appears when you click a navigation category is rendered on page load. This accounts for a large portion of every page visit's bandwidth, and the images are downloaded even if a user doesn't use the navigation. I'd like to change it so the content appears and is downloaded only IF the user clicks on it, and I'm planning to use AJAX. That means the content would not automatically be on the page (which may or may not mean Google would crawl it). As we already provide a sitemap.xml for Google, I want to make sure this change would not adversely affect our SEO. As of October this year, the Webmaster AJAX crawling recommendations have been deprecated. While the new version does say that Google's crawlers are smart enough to render AJAX content, which I've tested, I'm not sure whether that only applies to content injected on page load, as opposed to on click as I'm planning to do.
Technical SEO | | znotes0 -
What's going on with google index - javascript and google bot
Hi all, Weird issue with one of my websites. The website URL: http://www.athletictrainers.myindustrytracker.com/ Let's take 2 different article pages from this website: 1st: http://www.athletictrainers.myindustrytracker.com/en/article/71232/ As you can see, the page is indexed correctly on Google: http://webcache.googleusercontent.com/search?q=cache:dfbzhHkl5K4J:www.athletictrainers.myindustrytracker.com/en/article/71232/10-minute-core-and-cardio&hl=en&strip=1 (that's the "text only" version, indexed on May 19th) 2nd: http://www.athletictrainers.myindustrytracker.com/en/article/69811 As you can see, the page isn't indexed correctly on Google: http://webcache.googleusercontent.com/search?q=cache:KeU6-oViFkgJ:www.athletictrainers.myindustrytracker.com/en/article/69811&hl=en&strip=1 (that's the "text only" version, indexed on May 21st) They both have the same code. As for the dates, there are pages that were indexed before the 19th that are also problematic. Sometimes Google can't read the content, and sometimes it can. Can you think what the problem is? I know that Google can read JS and crawl our pages correctly, but this happens with only a few pages and not all of them (as you can see above).
Technical SEO | | cobano0 -
How to stop crawls for product review pages? Volusion site
Hi guys, I have a new Volusion website. The template we are using has its own product review page for EVERY product I sell (1500+). When a customer purchases a product, a week later they receive a link back to review the product. This link sends them to my site, but to its own individual page strictly for reviewing the product. (As opposed to a page like Amazon, where you review the product on the same page as the actual listing.) This is creating countless "duplicate content" and missing "title" errors. What is the most effective way to block a bot from crawling all these pages? Via robots.txt? A meta tag? Here's the catch: I do not have access to every individual review page, so I think it will need to be blocked by a robots.txt file? What code will I need to implement? Do I need to do this on the admin side of the site? Do I also have to do something on the Google Analytics side to tell Google about the crawl block? Note: the individual URLs for these pages end with: *****.com/ReviewNew.asp?ProductCode=458VB Can I create a block for all URLs that end with /ReviewNew.asp etc.? Thanks! Pardon my ignorance. Learning slowly, loving the Moz community 😃
Technical SEO | | Jerrion0 -
Why is the Page Authority of my product pages so low?
My domain authority is 35 (homepage Page Authority = 45) and my website has been up for years: www.rainchainsdirect.com Most random pages on my site (like this one) have a Page Authority of around 20. However, as a whole, my individual product pages rank exceptionally low. Like these: http://www.rainchainsdirect.com/products/copper-channel-link-rain-chain (Page Authority = 1) http://www.rainchainsdirect.com/collections/todays-deals/products/contempo-chain (Page Authority = 1) I was thinking that, whatever the reason for such low authority, it may explain why these pages rank lower in Google for specific searches using my exact product name (in other words, other sites that are piggybacking off my unique products rank higher in a search for the specific product name than the original product page on my site). In any event, I'm trying to get some perspective on why these pages remain at the same non-existent Page Authority. Can anyone help shed some light on why, and on what can be done about it? Thanks!
Technical SEO | | csblev0 -
X-cart page crawling question.
I have an x-cart site and it is showing only 1 page being crawled. I'm a newbie, is this common? Can it be changed? If so, how? Thanks.
Technical SEO | | SteveLMCG0 -
How to stop Search Bot from crawling through a submit button
On our website http://www.thefutureminders.com/, we have three form fields that have three pull downs for Month, Day, and year. This is creating duplicate pages while indexing. How do we tell the search Bot to index the page but not crawl through the submit button? Thanks Naren
Technical SEO | | NarenBansal0 -
Will Google Continue to Index the Page with NoIndex Tag Upon Google +1 Button Impression or Click?
The FAQ for the Google +1 button states the following: "+1 is a public action, so you should add the button only to public, crawlable pages on your site. Once you add the button, Google may crawl or recrawl the page, and store the page title and other content, in response to a +1 button impression or click." If my page has a NoIndex tag and also has the Google +1 button on it, will Google respect the NoIndex tag (and not index the page) even though the +1 button's impressions or clicks send signals to Google's spiders?
Technical SEO | | globalsources.com0 -
What is the largest page size a searchbot will crawl?
When setting up pagination, what should we limit the page size to? When will a searchbot stop crawling a particular page?
Technical SEO | | nicole.healthline0