What is the largest page size a searchbot will crawl?
-
When setting up pagination, what should we limit the page size to? When will a searchbot stop crawling a particular page?
-
I think the ideal size is below 500k, yet I feel Google will crawl even larger sized pages if the content provides value for the users.
Back in 2005 I remember Google had much tighter figures on these types of numbers yet in today's market it is a bit different, they seem to allow larger file sizes.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
My pages are being crawled, but not indexed according to Search Console
According to Google Search Console, my pages are being crawled by not indexed. We use Shopify and about two weeks ago I selected that Traffic from all our domains redirects to our primary domain. So everything from www.url.com and https://url.com and so on, would all redirect to one url. Have added an attached image from Search Console. 6fzEQg8
Technical SEO | | HariOmHemp0 -
Search Console Indexed Page Count vs Site:Search Operator page count
We launched a new site and Google Search Console is showing 39 pages have been indexed. When I perform a Site:myurl.com search I see over 100 pages that appear to be indexed. Which is correct and why is there a discrepancy? Also, Search Console Page Index count started at 39 pages on 5/21 and has not increased even though we have hundreds of pages to index. But I do see more results each week from Site:psglearning.com My site is https://wwww.psglearning.com
Technical SEO | | pdowling0 -
Indexed pages
Just started a site audit and trying to determine the number of pages on a client site and whether there are more pages being indexed than actually exist. I've used four tools and got four very different answers... Google Search Console: 237 indexed pages Google search using site command: 468 results MOZ site crawl: 1013 unique URLs Screaming Frog: 183 page titles, 187 URIs (note this is a free licence, but should cut off at 500) Can anyone shed any light on why they differ so much? And where lies the truth?
Technical SEO | | muzzmoz1 -
How long after disallowing Googlebot from crawling a domain until those pages drop out of their index?
We recently had Google crawl a version of the site we that we had thought we had disallowed already. We have corrected the issue of them crawling the site, but pages from that version are still appearing in the search results (the version we want them to not index and serve up is our .us domain which should have been blocked to them). My question is this: How long should I expect that domain (the .us we don't want to appear) to stay in their index after disallowing their bot? Is this a matter of days, weeks, or months?
Technical SEO | | TLM0 -
X-cart page crawling question.
I have an x-cart site and it is showing only 1 page being crawled. I'm a newbie, is this common? Can it be changed? If so, how? Thanks.
Technical SEO | | SteveLMCG0 -
What has happened to my page rank
hi my page rank for the site www.in2town.co.uk was page rank four and then last week it went down to page rank 2 and now my page rank is 0. i really do not understand what has happened. can anyone please give me advice on what is happening
Technical SEO | | ClaireH-1848860 -
I have 15,000 pages. How do I have the Google bot crawl all the pages?
I have 15,000 pages. How do I have the Google bot crawl all the pages? My site is 7 years old. But there are only about 3,500 pages being crawled.
Technical SEO | | Ishimoto0 -
We have been hit with the "Doorway Page" Penalty - fixed the issue - Got MSG that will still do not meet guidelines.
I have read the FAQs and checked for similar issues: YES / NO
Technical SEO | | LVH
My site's URL (web address) is:www.recoveryconnection.org
Description (including timeline of any changes made): We were hit with the Doorway Pages penalty on 5/26/11. We have a team of copywriters, and a fast-working dev dept., so we were able to correct what we thought the problem was, "targeting one-keyword per page" and thin content. (according to Google) Plan of action: To consolidate "like" keywords/content onto pages that were getting the most traffic and 404d the pages with the thin content and that were targeting singular keywords per page. We submitted a board approved reconsideration request on 6/8/11 and received the 2nd message (below) on 6/16/11. ***NOTE:The site was originally designed by the OLD marketing team who was let go, and we are the NEW team trying to clean up their mess. We are now resorting to going through Google's general guidelines page. Help would be appreciated. Below is the message we received back. Dear site owner or webmaster of http://www.recoveryconnection.org/, We received a request from a site owner to reconsider http://www.recoveryconnection.org/ for compliance with Google's Webmaster Guidelines. We've reviewed your site and we believe that some or all of your pages still violate our quality guidelines. In order to preserve the quality of our search engine, pages from http://www.recoveryconnection.org/ may not appear or may not rank as highly in Google's search results, or may otherwise be considered to be less trustworthy than sites which follow the quality guidelines. If you wish to be reconsidered again, please correct or remove all pages that are outside our quality guidelines. When such changes have been made, please visit https://www.google.com/webmasters/tools/reconsideration?hl=en and resubmit your site for reconsideration. If you have additional questions about how to resolve this issue, please see our Webmaster Help Forum for support. Sincerely, Google Search Quality Team Any help is welcome. Thanks0