Why don't Moz OSE, Ahrefs, Majestic, and similar tools change their user agent while crawling?
-
Some blackhat websites, PBNs, and other "cheaters" use various methods (robots.txt rules, IP blocking, and so on) to effectively block third-party backlink checker bots (OSE, Ahrefs, Majestic...).
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize user agent strings, URLs, and IPs to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
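As a rough illustration of the three points above, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The bot name `ExampleLinkBot`, its info URL, and the robots.txt content are all hypothetical and only there for the example; this is not how any particular crawler is actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler identity: a real bot would publish its own token and info URL.
USER_AGENT = "ExampleLinkBot/1.0 (+https://example.com/bot)"

# Hypothetical robots.txt: blocks one named bot, asks everyone else to slow down.
ROBOTS_TXT = """\
User-agent: ExampleLinkBot
Disallow: /

User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True only if robots.txt allows this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # A bot that identifies itself honestly accepts being blocked by name.
    print(may_crawl(parser, USER_AGENT, "https://example.com/page"))
    # Other bots fall under the "*" rules and should honor the crawl delay.
    print(may_crawl(parser, "SomeOtherBot/2.0", "https://example.com/page"))
    print(parser.crawl_delay("SomeOtherBot/2.0"))
```

The block-by-name rule only works because the bot sends a stable, honest user agent, which is exactly what randomizing or spoofing the string would defeat.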
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs: those sites often get penalized by Google, but third-party crawlers have no way to know this, so the blocking effectively keeps them out of the indexes.
-
Well, I think bot blocking is an obvious problem even now, and it will become more important tomorrow with all the private networks, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor the desired solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever), that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
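
To show how the "out of sync" mismatch can be caught, here is a hedged Python sketch of the reverse-then-forward DNS check that Google documents for verifying Googlebot. The function names, the hostname suffix list, and the fake resolvers in the demo are assumptions made for this example; a real site would call the default `socket`-based lookups and should re-check the suffix list against Google's current documentation.

```python
import socket
from typing import Callable, List

# Hostname suffixes assumed for Google's crawlers; verify against current Google docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """Forward DNS: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(host)[2]


def is_genuine_googlebot(
    user_agent: str,
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True only if the UA claims Googlebot and the IP round-trips through Google DNS."""
    if "googlebot" not in user_agent.lower():
        return False  # not claiming to be Googlebot at all
    try:
        host = reverse(ip)
        if not host.lower().endswith(GOOGLE_SUFFIXES):
            return False  # claimed Googlebot, but the IP belongs to someone else
        return ip in forward(host)  # hostname must resolve back to the same IP
    except OSError:
        return False  # lookup failed: treat the claim as unverified


if __name__ == "__main__":
    # Fake resolvers (illustrative data only) so the demo runs without network access.
    def fake_reverse(ip: str) -> str:
        known = {"66.249.66.1": "crawl-66-249-66-1.googlebot.com"}
        return known.get(ip, "host.example.net")

    def fake_forward(host: str) -> List[str]:
        if host == "crawl-66-249-66-1.googlebot.com":
            return ["66.249.66.1"]
        return ["203.0.113.9"]

    ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    print(is_genuine_googlebot(ua, "66.249.66.1", fake_reverse, fake_forward))
    print(is_genuine_googlebot(ua, "203.0.113.7", fake_reverse, fake_forward))
```

A third-party crawler sending a Googlebot user agent from its own IP range would fail the suffix check, so the spoofing would be both ineffective against careful sites and easy to expose.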