How to Safely Scrape Google Results?
-
I've built a couple of small tools that I use personally, maybe 2 or 3 times per day.
Both tools scrape the top 10 results from Google and provide more details about each domain (like the SEOMoz Keyword Difficulty Tool).
Google seems to have banned my IP address for automated searches. Can anyone tell me a safe way of scraping the Google results? Is there a suitable API for this?
How does SEOMoz do this on such a huge scale?
-
I doubt that the APIs have improved considerably since this blog post: http://www.seomoz.org/blog/the-nasty-problem-with-scraping-results-from-the-engines. Scraping Google is therefore still a big issue, and it remains necessary for our daily SEO work.
Scraping safely can only work if you succeed in convincing Google that you're a "natural" user and not a scraping robot. How can you do that?
- Search with alternating IPs from different locations, using proxies in the countries you'd like to scrape from.
- Don't send too many requests at once from the same source.
Keep in mind that, when requesting a URL, the browser sends various pieces of information to the server, such as your operating system, browser version, and referer. Every one of these can and should be varied, so that each new search appears to come from a different identity.
- Change browsers, browser versions, operating system information, etc.
- Take care when changing browser localization values (en-GB and en-US probably don't return the same results).
- Have a good network of proxy servers ready, so that you can send the different requests with your different identities through them.
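To make the points above a bit more concrete, here is a minimal Python sketch of how such a rotation could be planned. It only builds a plan (URL, headers, proxy, and waiting time per query) and does not send anything. The names `Identity`, `plan_requests`, the user agent strings, the proxy addresses, and the delay range are all my own assumptions for illustration, not something Google or SEOmoz documents.

```python
import itertools
import random
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class Identity:
    """One browser identity: the header values sent with a request."""
    user_agent: str
    accept_language: str


# Hypothetical example values; substitute real, current browser strings.
# The locale is kept identical on purpose, since en-GB and en-US
# probably don't return the same results.
IDENTITIES = [
    Identity("Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/1.0", "en-US,en;q=0.9"),
    Identity("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ExampleBrowser/2.0", "en-US,en;q=0.9"),
    Identity("Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/3.0", "en-US,en;q=0.9"),
]

# Placeholder proxy addresses from a documentation IP range, not real servers.
PROXIES = [
    "http://192.0.2.10:8080",
    "http://192.0.2.11:8080",
    "http://192.0.2.12:8080",
]


def plan_requests(queries, identities, proxies, min_delay=20.0, max_delay=60.0, seed=None):
    """Return one planned request per query.

    Proxies are used round-robin, so consecutive requests never share a
    source as long as at least two distinct proxies are given. The identity
    and the waiting time are picked at random for every request.
    """
    rng = random.Random(seed)
    proxy_cycle = itertools.cycle(proxies)
    plan = []
    for query in queries:
        identity = rng.choice(identities)
        plan.append({
            "url": "https://www.google.com/search?" + urlencode({"q": query}),
            "headers": {
                "User-Agent": identity.user_agent,
                "Accept-Language": identity.accept_language,
            },
            "proxy": next(proxy_cycle),
            # Seconds to wait before sending this request (assumed range).
            "delay_seconds": rng.uniform(min_delay, max_delay),
        })
    return plan
```

Each entry of the returned list could then be handed to whatever HTTP client you use, waiting `delay_seconds` before each request. The delay range of 20 to 60 seconds is only a guess at what "not too many requests at once" might mean for a tool that runs 2 or 3 times per day.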