Howcome Google is indexing one day 2500 pages and the other day only 150 then 2000 again ect?
-
This is about an big affiliate website of an customer of us, running with datafeeds...
Bad things about datafeeds:
- Duplicate Content (product descriptions)
- Verrryyyy Much (thin) product pages (sometimes better to noindex, i know, but this customer doesn't want to do that)
-
Hi Dana,
Thanks for your detailed explanation. Appreciate it Off course I understand that site speed is a factor for crawling (+ ranking) and that the Google bots only want to spend a certain period of time on a website. It's more like, when servers are performing almost equal every day so page loads are igual to, what could it be?
I agree with your two points of considering, but I'm the type of guy that always wants to know why something is happening
@Nakul: Thanks for your responds!
The pages that are in and out of the index are mostly product pages. So the thing about "frequently updates" can be something. The website is pretty young so authority is not yet build as it should be for a big site. This can also be a factor cause the more authority the more time Google will spend indexing a website rightAnyway, great thanks for both of your answers!
Gr. Wesley
-
I agree with everything Nakul has said. Just to piggyback on that with additional information, try to think about it this way. Remember when someone gave you $1.00 when you were little and said "Don't spend it all in one place?" Well, someone at Google must have grown up with the same grandparents I did.
Okay, now, the analogy-free explanation
Google has a "crawl budget" every day. Every day that budget is allocated to millions of different sites. Now, by "sites" I mean "pages." Some pages change really frequently (i.e. the Yahoo New homepage). Some pages change hardly ever (i.e. an archived blog post). Also, some pages have very high PR and others, not so much. Also, some pages load extremely fast (consuming less of Google's bandwidth when the page is crawled) which leaves more Google resources available to Google to crawl more pages. Google likes it, and so should we all because people with fast sites are making it possible for everyone to get crawled more often (in essence, making them very considerate, well-behaved members of the Internet community).
So, based on all these, Google is going to apportion a part of its crawl budget to your site on any given day. Some days, it may have more room in its budget for you than others. Part of this might be effected by how fast pages, on any given day, load from your site. A ton of parameters can come into play here, including whether or not the pages on that day are heavier, or whether or not your servers are performing really fast on one day versus another.
I'd say the two things to be really concerned with after considering all of these things are:
- Is Google indexing all of the pages you want indexed?
- Is Google's cache date of your important pages recent enough? (i.e. 3 weeks or less)
If the answer is "no" to either one of those, then it's time to do some investigation to find out if there are technical issues or penalties that have been put in place that are hurting Google's ability or desire (not the right word to use about a bot, but I'm using it anyway) to crawl your pages.
Does that help?
-
Domain Authority / Pagerank is what Google looks to see how deep and how frequently Google will crawl a particular website. They also typically look into how frequently the content is being updated.
Think about it from Google's perspective. Why should they index that website, 2500 pages every day. What's changing ? Does the site have enough domain authority to warrant that kind of indexing ?
In my opinion, this is not a concern. Just submit XML Sitemaps and see what percentage of your submitted pages are indexed.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Pages getting into Google Index, blocked by Robots.txt??
Hi all, So yesterday we set up to Remove URL's that got into the Google index that were not supposed to be there, due to faceted navigation... We searched for the URL's by using this in Google Search.
Intermediate & Advanced SEO | | bjs2010
site:www.sekretza.com inurl:price=
site:www.sekretza.com inurl:artists= So it brings up a list of "duplicate" pages, and they have the usual: "A description for this result is not available because of this site's robots.txt – learn more." So we removed them all, and google removed them all, every single one. This morning I do a check, and I find that more are creeping in - If i take one of the suspecting dupes to the Robots.txt tester, Google tells me it's Blocked. - and yet it's appearing in their index?? I'm confused as to why a path that is blocked is able to get into the index?? I'm thinking of lifting the Robots block so that Google can see that these pages also have a Meta NOINDEX,FOLLOW tag on - but surely that will waste my crawl budget on unnecessary pages? Any ideas? thanks.0 -
Google Places Landing Page: Homepage or City-Specific?
What should you put in the “Website” field of your Google Places page: the URL of your homepage, or of one of your location pages?
Intermediate & Advanced SEO | | AlexanderWhite0 -
Does Google still don't index Hashtag Links ? No chance to get a Search Result that leads directly to a section of a page? or to one of numeras Hashtag Pages in a single HTML page?
Does Google still don't index Hashtag Links ? No chance to get a Search Result that leads directly to a section of a page? or to one of numeras Hashtag Pages in a single HTML page? If I have 4 or 5 different hashtag link section pages , consolidated into one HTML Page, no chance to get one of the Hashtag Pages to appear as a search result? like, if under one Single Page Travel Guide I have two essential sections: #Attractions #Visa no chance to direct search queries for Visa directly to the Hashtag Link Section of #Visa? Thanks for any help
Intermediate & Advanced SEO | | Muhammad_Jabali0 -
How to remove wrong crawled domain from Google index
Hello, I'm running a Wordpress multisite. When I create a new site for a client, we do the preparation using the multisite domain address (ex: cameleor.cobea.be). To keep the site protected we use the "multisite privacy" plugin which allows us to restrict the site to admin only. When site is ready we a domain mapping plugin to redirect the client domain to the multisite (ex: cameleor.com). Unfortunately, recently we switched our domain mappin plugin by another one and 2 sites got crawled by Google on their multsite address as well. So now when you type "cameleor" in Google you get the 2 domains in SERPS (see here http://screencast.com/t/0wzdrYSR). It's been 2 weeks or so that we fixed the plugin issue and now cameleor.cobea.be is redirected to the correct address cameleor.com. My question: how can I get rid of those wrong urls ? I can't remove it in Google Webmaster Tools as they belong to another domain (cf. cameleor.cobea.be for which I can't get authenticated) and I wonder if will ever get removed from index as they still redirect to something (no error to the eyes of Google)..? Does anybody has an idea or a solution for me please ? Thank you very much for your help Regards Jean-Louis
Intermediate & Advanced SEO | | JeanlouisSEO0 -
How to get content to index faster in Google.....pubsubhubbub?
I'm curious to know what tools others are using to get their content to index faster (other than html sitmap and pingomatic, twitter, etc) Would installing the wordpress pubsubhubbub plugin help even though it uses pingomatic? http://wordpress.org/extend/plugins/pubsubhubbub/
Intermediate & Advanced SEO | | webestate0 -
Page Indexed but not Cached
A section of pages on my site are indexed (I know because they appear in SERPs if I copy and paste a sentence from the content), however according to the text-only cached version of the page they are not being read by Google.Why are they indexed event hough it seems like Google is not reading them..... or is Google in fact reading this text even though it seems like they should not be?Thanks for your assistance.
Intermediate & Advanced SEO | | theLotter0 -
Why my own page is not indexed for that keyword?
hi, I recently recreated the page www.zenucchi.it /ITA/poltrona-frau-brescia.html on the third level domain poltronafraubrescia.zenucchi.it by putting it on the home page. The first page is still indexed for the keyword poltrona frau brescia . But the new page is no indexed for that keyword and i don't know why ( even if the page is indexed in google ) .. I state that the new domain has the same autorithy and that i put a 301 redirect to pass his authority to the new one that has many more incoming links that did not have previous .. i hope you'll help me thanks a lot
Intermediate & Advanced SEO | | guidoboem0 -
Removing a Page From Google index
We accidentally generated some pages on our site that ended up getting indexed by google. We have corrected the issue on the site and we 404 all of those pages. Should we manually delete the extra pages from Google's index or should we just let Google figure out that they are 404'd? What the best practice here?
Intermediate & Advanced SEO | | dbuckles0