Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to effectively block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking and such.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize user agent strings, URLs, and IPs to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
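As a rough illustration of what those three rules look like in a crawler, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The bot name `ExampleLinkBot`, its info URL, the sample robots.txt and the one-second fallback delay are all made up for the example; this is not how any of the tools mentioned in this thread are actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical bot identity: a clear name plus a URL where site owners can learn more.
USER_AGENT = "ExampleLinkBot/1.0 (+https://example.com/bot)"
REQUEST_HEADERS = {"User-Agent": USER_AGENT}  # sent unchanged with every request

# Sample robots.txt (assumed for illustration) from a site that blocks our bot.
ROBOTS_TXT = """\
User-agent: ExampleLinkBot
Disallow: /

User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """Obey robots.txt: only fetch what the site allows for our real user agent."""
    return parser.can_fetch(user_agent, url)


def delay_for(parser: RobotFileParser, user_agent: str = USER_AGENT, default: float = 1.0) -> float:
    """Crawl politely: honor Crawl-delay if given, else use a fallback pause."""
    return parser.crawl_delay(user_agent) or default


parser = build_parser(ROBOTS_TXT)
print(may_fetch(parser, "https://example.com/some-page"))  # the site blocked this bot
print(delay_for(parser, "OtherBot/2.0"))                   # delay requested for everyone else
```

A bot that follows these rules simply skips the site above, which is exactly the blocking behavior the question describes.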
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs, because those sites often get penalized by Google, but 3rd party crawlers have no way to know this, so blocking effectively keeps them out of the indexes.
-
Well, I think bot blocking is an obvious problem even now, and it will matter more tomorrow as private networks grow, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor the desired solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
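To make the "out of sync" point concrete, here is a minimal Python sketch of how a site owner might catch a crawler that claims to be Googlebot from an IP that does not belong to Google. It uses the reverse-then-forward DNS check that Google describes for verifying its crawlers. The hostname suffixes and function names are assumptions for illustration, so check Google's current documentation before relying on them.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Google crawlers (verify against Google's docs).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Forward DNS: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_spoofed_googlebot(
    user_agent: str,
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Return True when the user agent claims Googlebot but the IP does not check out."""
    if "googlebot" not in user_agent.lower():
        return False  # not claiming to be Googlebot, so nothing to verify
    try:
        hostname = reverse(ip)
        if not hostname.lower().endswith(GOOGLE_SUFFIXES):
            return True  # reverse DNS points somewhere other than Google
        # Forward-confirm: the hostname must resolve back to the same IP.
        return ip not in forward(hostname)
    except OSError:
        return True  # lookup failed, so the claim cannot be confirmed
```

The resolvers are passed in as parameters so the check can be tried with stand-in functions instead of live DNS. Any server log that records both the user agent and the client IP is enough to run this kind of test, which is why spoofing would be noticed quickly.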