How to Safely Scrape Google Results?
-
I've built a couple of small tools that I use personally, maybe 2 or 3 times per day.
Both tools scrape the top 10 results from Google and provide more details about each domain (like the SEOMoz Keyword Difficulty Tool).
Google seems to have banned my IP address for automated searches. Can anyone tell me a safe way of scraping the Google results? Is there a suitable API for this?
How does SEOMoz do this on such a huge scale?
-
I doubt the APIs have improved considerably since this blog post: http://www.seomoz.org/blog/the-nasty-problem-with-scraping-results-from-the-engines. Scraping Google is still a big issue, and still necessary for our daily SEO work.
Scraping safely can only work if you succeed in convincing Google that you're a "natural" user and not a scraping robot. How can you do that?
- Search with alternating IPs from different locations, using proxies in the countries you'd like to scrape from
- Don't send too many requests at once from the same source
Keep in mind that, when requesting a URL, the browser sends various pieces of information to the server, such as your operating system, browser version, and referrer. Each of these can and should be varied so that every new search appears to come from a different identity.
- Change browsers, browser versions, operating system information, etc.
- Take care when changing browser localization values (en-GB and en-US probably don't return the same results)
- Have a good network of proxy servers ready, so you can send the different requests with your different identities through them
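To make the points above a bit more concrete, here is a minimal Python sketch of how the rotation could be planned. It is only an illustration under my own assumptions: the proxy addresses, user-agent strings, function names, and delay range are all hypothetical and not taken from any real tool. It sends nothing over the network. It only pairs each query with a proxy, a set of headers, and a randomized wait, while holding the locale fixed so the results stay comparable.

```python
import itertools
import random

# Hypothetical example values; replace with your own proxies and browser strings.
PROXIES = [
    "http://proxy-uk.example.net:8080",
    "http://proxy-us.example.net:8080",
]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
]


def build_identities(user_agents, locale):
    """Return one header set per user agent, with the locale held fixed."""
    return [{"User-Agent": ua, "Accept-Language": locale} for ua in user_agents]


def plan_requests(queries, proxies, identities,
                  min_delay=30.0, max_delay=90.0, rng=None):
    """Pair every query with the next proxy, the next identity, and a random delay.

    Proxies and identities are cycled independently, so consecutive
    queries never reuse the same source or the same browser signature
    (as long as each list has at least two entries).
    """
    rng = rng or random.Random()
    proxy_cycle = itertools.cycle(proxies)
    identity_cycle = itertools.cycle(identities)
    plan = []
    for query in queries:
        plan.append({
            "query": query,
            "proxy": next(proxy_cycle),
            "headers": dict(next(identity_cycle)),  # copy, so entries stay independent
            "delay_seconds": rng.uniform(min_delay, max_delay),
        })
    return plan


if __name__ == "__main__":
    identities = build_identities(USER_AGENTS, "en-GB")
    for step in plan_requests(["company formation", "guitar lessons"],
                              PROXIES, identities):
        print(step["proxy"], step["headers"]["User-Agent"],
              round(step["delay_seconds"], 1))
```

Each entry in the plan could then be handed to whatever HTTP client you use, waiting `delay_seconds` between requests. The 30 to 90 second range is an arbitrary guess on my part, not a known safe threshold.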