How to Safely Scrape Google Results?
-
I've built a couple of small tools that I use personally, maybe 2 or 3 times per day.
Both tools scrape the top 10 results from Google and provide more details about each domain (like the SEOMoz Keyword Difficulty Tool).
Google seems to have banned my IP address for automated searches... can anyone tell me a safe way of scraping the Google results? Is there a suitable API for this?
How does SEOmoz do this on such a huge scale?
-
I doubt that the APIs have improved considerably since this blog post (http://www.seomoz.org/blog/the-nasty-problem-with-scraping-results-from-the-engines), so scraping Google is still a big issue, and still necessary for our daily SEO work.
Scraping safely can only work if you succeed in convincing Google that you're a "natural" user and not a scraping robot. How can you do that?
- Search with alternating IPs from different locations, using proxies in the countries you'd like to scrape from.
- Don't send too many requests at once from the same source.
Keep in mind that when it requests a URL, the browser sends the server various pieces of information, such as your operating system, browser version, and referer. Every one of these can and should be changed, so that each new search appears to come from a different identity.
- Change browsers, browser versions, operating system information, etc.
- Take care when changing browser localization values (en-GB and en-US probably don't return the same results).
- Have a good network of proxy servers ready, so that each request can be sent with a different identity.
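To make the points above concrete, here is a minimal Python sketch of how one search could be prepared with a rotated identity. It is only an illustration under assumptions, not a recipe that is guaranteed to avoid a block: the proxy addresses are hypothetical placeholders, the User-Agent strings are just examples, and the function names (`pick_identity`, `build_search`, `random_delay`) and the 20 to 90 second delay range are my own choices. The localization value is deliberately held constant so that results stay comparable between searches.

```python
import random
import urllib.parse
import urllib.request

# Hypothetical placeholders: substitute your own proxies and current browser strings.
PROXIES = [
    "http://proxy-us.example.com:8080",
    "http://proxy-gb.example.com:8080",
]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:121.0) "
    "Gecko/20100101 Firefox/121.0",
]
# Kept fixed on purpose: en-GB and en-US probably return different results.
ACCEPT_LANGUAGE = "en-US,en;q=0.9"


def pick_identity(rng):
    """Choose one proxy and one User-Agent for the next search."""
    return {"proxy": rng.choice(PROXIES), "user_agent": rng.choice(USER_AGENTS)}


def build_search(query, identity):
    """Return (opener, request) for one search; nothing is sent yet."""
    url = "https://www.google.com/search?" + urllib.parse.urlencode({"q": query})
    request = urllib.request.Request(
        url,
        headers={
            "User-Agent": identity["user_agent"],
            "Accept-Language": ACCEPT_LANGUAGE,
        },
    )
    # Route both http and https traffic through the chosen proxy.
    proxy = urllib.request.ProxyHandler(
        {"http": identity["proxy"], "https": identity["proxy"]}
    )
    return urllib.request.build_opener(proxy), request


def random_delay(rng, low=20.0, high=90.0):
    """Seconds to wait before the next search, so requests are not sent in bursts."""
    return rng.uniform(low, high)
```

To actually run a search you would call `opener.open(request, timeout=30)` and then `time.sleep(random_delay(rng))` before building the next one with a fresh identity.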