Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking and such.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could use some kind of randomness in user agent strings, URLs, and IPs in order to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs. Those sites often get penalized by Google, and third-party crawlers have no way to know this, so the blocking effectively keeps penalized sites out of the third-party indexes.
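For anyone wondering what "obey robots.txt and properly identify yourself" looks like in practice, here is a minimal sketch using Python's standard `urllib.robotparser`. The bot name `examplebot`, the info URL and the sample robots.txt are all made up for illustration; this is not how Moz or any other vendor actually implements its crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical crawler name, used only for illustration.
USER_AGENT = "examplebot"

# Hypothetical robots.txt that a site might serve.
ROBOTS_TXT = """\
User-agent: examplebot
Disallow: /private/
Crawl-delay: 5

User-agent: *
Disallow:
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True only if robots.txt allows this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


def request_headers(user_agent: str = USER_AGENT) -> dict:
    """Identify the bot honestly, with a (hypothetical) page describing it."""
    return {"User-Agent": f"{user_agent}/1.0 (+https://example.com/bot-info)"}


parser = build_parser(ROBOTS_TXT)
```

A polite crawler would call `may_crawl(parser, url)` before every request, wait `parser.crawl_delay(USER_AGENT)` seconds between requests when the site sets a delay, and send `request_headers()` so site owners can see who is visiting and block the bot if they choose.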
-
Well, I think bot blocking is an obvious problem even now, and it will become more important as private networks spread.
Moz (and others) should find and implement the best possible solution. I see no conflict with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor the desired solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
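To illustrate the "out of sync" point: a site owner can check a claimed Googlebot by doing a reverse DNS lookup on the visiting IP and then a forward lookup on the returned hostname, which is the kind of check Google describes for verifying its crawler. Below is a rough sketch of that idea. The function names are my own, the hostname suffixes are an assumption based on my understanding of that guidance, and the resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Googlebot hosts.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Reverse DNS: IP address -> primary hostname."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """Forward DNS: hostname -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def is_genuine_googlebot(
    user_agent: str,
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Check that a visitor claiming to be Googlebot comes from a Google host."""
    if "googlebot" not in user_agent.lower():
        return False  # no Googlebot claim, so nothing to verify
    try:
        hostname = reverse(ip)
        if not hostname.endswith(GOOGLE_SUFFIXES):
            return False  # user agent says Google, reverse DNS says otherwise
        # The hostname must resolve back to the same IP address.
        return ip in forward(hostname)
    except OSError:
        # Covers socket.herror and socket.gaierror (failed lookups).
        return False
```

A third-party crawler that sent Googlebot's user agent string from its own servers would fail the hostname check, which is why spoofing is easy to detect.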