How to Safely Scrape Google Results?
-
I've built a couple of small tools that I use personally, maybe 2 or 3 times per day.
Both tools scrape the top 10 results from Google and provide more details about each domain (like the SEOMoz Keyword Difficulty Tool).
Google seem to have banned my IP address for automated searches. Can anyone tell me a safe way of scraping the Google results? Is there a suitable API for this?
How does SEOmoz do this on such a huge scale?
-
I doubt that the APIs have improved considerably since this blog post: http://www.seomoz.org/blog/the-nasty-problem-with-scraping-results-from-the-engines. Scraping Google is still a big issue, and still necessary for our daily SEO work.
Scraping safely can only work if you succeed in convincing Google that you're a "natural" user and not a scraping robot. How can you do that?
- Search with alternating IPs from different locations, using proxies in the countries whose results you'd like to scrape.
- Don't send too many requests at once from the same source.
Consider that, when requesting a URL, the browser sends various pieces of information to the server, such as your operating system, browser version, and referer. Every one of these can and should be changed, so that each new search appears to come from a different identity.
- Change browsers, browser versions, operating system information, etc.
- Take care when changing browser localization values (en-GB and en-US probably don't return the same results).
- Have a good network of proxy servers ready, so that each request can be sent with a different identity.
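The points above can be sketched roughly as follows in Python. This is only an illustration of the idea, not a tested scraper: the user-agent strings, the proxy addresses (`proxy-uk.example`, `proxy-us.example`), the delay range and the `plan_requests` helper are all made-up assumptions, and the code only plans the requests without sending anything.

```python
import itertools
import random
from dataclasses import dataclass

# Hypothetical example values; none of these come from the original post.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/16.5 Safari/605.1.15",
]
PROXIES = [
    "http://proxy-uk.example:8080",
    "http://proxy-us.example:8080",
]


@dataclass
class PlannedRequest:
    query: str
    headers: dict
    proxy: str
    delay_seconds: float  # how long to wait before sending this request


def plan_requests(queries, language="en-GB", min_delay=20.0, max_delay=60.0, seed=None):
    """Give each query its own browser identity, proxy and waiting time."""
    rng = random.Random(seed)
    proxy_cycle = itertools.cycle(PROXIES)  # alternate the source IP per request
    plans = []
    for query in queries:
        headers = {
            # Vary the browser / operating system information per search.
            "User-Agent": rng.choice(USER_AGENTS),
            # Keep the localization fixed: en-GB and en-US may rank differently.
            "Accept-Language": language,
        }
        # A randomized pause avoids sending many requests at once from one source.
        delay = rng.uniform(min_delay, max_delay)
        plans.append(PlannedRequest(query, headers, next(proxy_cycle), delay))
    return plans


if __name__ == "__main__":
    for plan in plan_requests(["kayak fishing", "driving test cancellations"], seed=1):
        print(plan.query, plan.proxy, round(plan.delay_seconds, 1))
```

Each planned request could then be handed to whatever HTTP client you use, waiting `delay_seconds` before sending it with the given headers through the given proxy. Note that the language is deliberately kept constant while everything else varies, so the rankings you collect stay comparable.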