Improving Crawl Efficiency
-
Hi
I'm reading about crawl efficiency & have looked in WMT at the current crawl rate - letting Google optimise this as recommended.
What it's set to is 0.5 requests every 2 seconds, which is 15 URLs every minute.
To me this doesn't sound very good, especially for a site with over 20,000 pages.
I'm reading about improving this, but if anyone has advice that would be great.
-
Great, thank you for this! I'll take it on board.
Becky
-
You may be overthinking this, Becky. Once the bot has crawled a page, there's no reason (or benefit to you) for it to crawl the page again unless its content has changed. The usual way for it to detect this is through your XML sitemap. If it's properly coded, it will have a <lastmod> date for Googlebot to reference.
Googlebot does continue to recrawl pages it already knows about "just in case", but your biggest focus should be on ensuring that your most recently added content is crawled quickly upon publishing. This is where making sure your sitemap is updating quickly and accurately, making sure it is pinging search engines on update, and making sure you have links from solid existing pages to the new content will help. If you have blog content, many folks don't know that you can submit the blog's RSS feed as an additional sitemap! That's one of the quickest ways to get it noticed.
The other thing you can do to assist the crawling effectiveness is to make certain you're not forcing the crawler to waste its time crawling superfluous, duplicate, thin, or otherwise useless URLs.
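To make the <lastmod> point concrete, here is a minimal sketch of generating a sitemap that carries a last-modified date for each URL. The function name build_sitemap, the example.com URLs, and the dates are made up for illustration; only the sitemaps.org namespace and the urlset/url/loc/lastmod element names come from the sitemap protocol itself.

```python
from datetime import date
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs.

    Hypothetical helper for illustration only.
    """
    # Serialize the sitemap namespace as the default xmlns (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, last_modified in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        # <lastmod> uses an ISO 8601 date such as 2024-01-15.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


# Made-up pages and dates, purely as sample input.
example_pages = [
    ("https://www.example.com/", date(2024, 1, 15)),
    ("https://www.example.com/blog/new-post", date(2024, 2, 1)),
]
sitemap_xml = build_sitemap(example_pages)
print(sitemap_xml)
```

The key thing is that the date only helps if it changes when the page content actually changes; stamping every URL with today's date on each rebuild gives the crawler nothing useful to go on.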
Hope that helps?
Paul
-
There are actually several aspects to your question.
1. Google will make its own decision as to how important a page is and therefore how often it should be crawled.
2. Site speed is a ranking factor.
3. Most SEOs believe that Google has a maximum timeframe in which to crawl each page/site. However, I have seen some chronically slow sites which have still been crawled and indexed.
I forgot to mention that using an XML sitemap can help search engines find pages.
Again, be very careful not to confuse crawling and indexing. Crawling only updates the index; once a page is indexed, if it doesn't rank you have another SEO problem, not a technical crawling problem.
Anything a user can access, a crawler should be able to find with no problem; however, if you have hidden pages the crawler may not find them.
-
Hi
Yes working on that
I just read something which said - A “scheduler” directs Googlebot to crawl the URLs in the priority order, under the constraints of the crawl budget. URLs are being added to the list and prioritized.
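As a way of picturing what that quote describes, here is a toy sketch of a priority-ordered queue with a fixed budget. It is only an illustration of the general idea, not how Googlebot actually works; the function name schedule_crawl, the priority scores, and the URLs are all invented.

```python
import heapq


def schedule_crawl(url_priorities, crawl_budget):
    """Return the URLs that fit in the budget, highest priority first.

    Toy model only: url_priorities maps URL -> made-up priority score.
    """
    # heapq is a min-heap, so negate the score to pop the highest first.
    heap = [(-priority, url) for url, priority in url_priorities.items()]
    heapq.heapify(heap)

    crawled = []
    while heap and len(crawled) < crawl_budget:
        _, url = heapq.heappop(heap)
        crawled.append(url)
    return crawled


# Invented scores: higher means "more worth crawling" in this toy model.
example_priorities = {
    "/": 1.0,
    "/new-post": 0.9,
    "/category": 0.6,
    "/old-archive": 0.1,
}
todays_crawl = schedule_crawl(example_priorities, crawl_budget=2)
print(todays_crawl)
```

In this picture, a low-priority page such as /old-archive simply never reaches the front of the queue while the budget is being spent on higher-scoring URLs.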
So, if you have pages which haven't been crawled/indexed as they're seen as a low priority for crawling - how can I improve or change this if need be?
Can I even impact it at all? Can I help crawlers be more efficient at finding/crawling pages I want to rank or not?
Does any of this even help SEO?
-
As a general rule pages will be indexed unless there is a technical issue or a penalty involved.
What you need to be more concerned with is the position of those pages within the index. That obviously comes back to the whole SEO game.
You can use the site: operator followed by a search term that is present on the page you want to check to make sure the page is indexed, like: site:domain.com "page name"
-
Ok thank you, so there must be ways to improve on the number of pages Google indexes?
-
You can obviously do a fetch and submit through search console, but that is designed for one-off changes. Even if you submit pages and make all sorts of signals Google will still make up its own mind what it's going to do and when.
If your content isn't changing much it is probably a disadvantage to have the Google crawler coming back too often as it will slow the site down. If a page is changing regularly the Google bot will normally gobble it pretty quick.
If it was me I would let it make its own decision, unless it is causing you a problem.
Also keep in mind that crawl and index are two separate kettles of fish. The Google crawler will crawl every site and every page that it can find, but doesn't necessarily index them.
-
Hi - yes it's the default.
I know we can't figure out exactly what Google is doing, but we can improve crawl efficiency.
If those pages aren't being crawled for weeks, isn't there a way to improve this? How have you found out they haven't been crawled for weeks?
-
P.S. I think the crawl rate setting you are referring to is the Google default if you move the radio button to manual
-
Google is very clever at working out how often it needs to crawl your site; pages that get updated more often will get crawled more often. There is no way of influencing exactly what the Google bot does. Mostly it will make its own decisions.
If you are talking about other web crawlers, you may need to put guidelines in place in terms of robots.txt or settings on the specific control panel.
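For the robots.txt route, here is a small sketch of how a set of rules reads from a crawler's point of view, using Python's standard urllib.robotparser. The bot name ExampleBot, the paths, and the delay value are made up for illustration. Note that Crawl-delay is a non-standard directive that some crawlers honour and others (Googlebot, as far as I know) ignore.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: slow down one named crawler and keep it out of
# internal search results; keep every other crawler out of /tmp/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Crawl-delay: 10
Disallow: /search/

User-agent: *
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Ask the parser what the hypothetical ExampleBot is allowed to do.
search_allowed = parser.can_fetch("ExampleBot", "https://www.example.com/search/?q=shoes")
product_allowed = parser.can_fetch("ExampleBot", "https://www.example.com/products/")
delay_seconds = parser.crawl_delay("ExampleBot")

print(search_allowed, product_allowed, delay_seconds)
```

Checking rules this way before deploying them is a cheap way to make sure you are not blocking pages you actually want crawled.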
20,000 pages to Google isn't a problem! Yes, it may take time. You say it is crawling at '0.5 requests every 2 seconds' - if I've got my calculation right in theory Google will have crawled 20,000 URLs in less than a day!
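Spelling that calculation out, as a rough sketch that assumes the crawler really does sustain the stated rate around the clock (which it may not):

```python
# Figures taken from the thread: 0.5 requests every 2 seconds, 20,000 pages.
requests_per_interval = 0.5
interval_seconds = 2
site_pages = 20_000

urls_per_minute = requests_per_interval / interval_seconds * 60
urls_per_hour = urls_per_minute * 60
urls_per_day = urls_per_hour * 24
hours_for_whole_site = site_pages / urls_per_hour

print(f"{urls_per_minute:.0f} URLs per minute")
print(f"{urls_per_day:.0f} URLs per day")
print(f"{hours_for_whole_site:.1f} hours to request {site_pages} URLs")
```

That works out to 15 URLs a minute, 21,600 a day, and a little over 22 hours for 20,000 URLs, so the "less than a day" estimate holds under that assumption.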
On my site I have a page which I updated about 2 hours ago, and the change has already replicated to Google, and yet other pages I know for a fact haven't been crawled for weeks.