Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to effectively block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking and so on.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize their user agent strings, URLs and IPs in order to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so their crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs: those sites often get penalized by Google, and 3rd party crawlers have no way of knowing this, so the blocking effectively keeps them out of the indexes.
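To make the "obey robots.txt and properly identify yourself" points concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `allowed` helper and the example URL are hypothetical, and the product tokens (`rogerbot`, `AhrefsBot`, `MJ12bot`) are assumed to be the names those crawlers announce. It only illustrates that a polite crawler checks the rules under its own name, which is exactly what makes it easy to block.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a link network might publish: it shuts out
# third-party link crawlers by name and lets every other bot in.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url.

    user_agent is the crawler's product token (e.g. "rogerbot"),
    not a full browser-style User-Agent header.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

With this file, `allowed("rogerbot", "https://example.com/page")` is False while `allowed("Googlebot", "https://example.com/page")` is True. The rules only bind a crawler that announces its real name and chooses to honor them.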
-
Well, I think bot blocking is an obvious problem even now, and it will matter even more tomorrow as private networks multiply, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor the desired solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever), that could be detected, and they would be castigated. The crawler's IP address and its user agent ID would be out of sync...
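For anyone wondering how the "out of sync" mismatch gets detected in practice, a common approach is a reverse DNS lookup on the visiting IP followed by a forward lookup on the returned hostname. The sketch below is only an illustration: the function name `is_genuine_googlebot` and the injectable `reverse` / `forward` parameters are my own, the accepted hostname suffixes are assumed from Google's published guidance on verifying Googlebot, and `socket.gethostbyname_ex` only covers IPv4.

```python
import socket

# Assumed hostname suffixes for genuine Google crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_genuine_googlebot(ip, reverse=socket.gethostbyaddr,
                         forward=socket.gethostbyname_ex):
    """Check a visitor that claims to be Googlebot.

    1. Reverse lookup: the IP must resolve to a Google hostname.
    2. Forward lookup: that hostname must resolve back to the same IP.
    The resolvers are parameters so the check can be tried offline.
    """
    try:
        hostname = reverse(ip)[0]
    except OSError:  # no PTR record or lookup failure
        return False
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False
    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False
    return ip in addresses
```

A crawler that sends Googlebot's user agent string from its own servers fails step 1, because its IP does not resolve to a Google hostname. Faking the reverse record does not help either, since step 2 would then fail.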