Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking and so on.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize their user agent strings, URLs and IPs in order to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, they make up only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so their crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs. Those sites often get penalized by Google, but third-party crawlers have no way to know this, so the blocking effectively keeps penalized sites out of the indexes.
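For anyone wondering what those three rules look like in practice, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The bot name `examplebot`, the info URL and the sample robots.txt are all made up for illustration; this is not how any of the crawlers named in this thread are actually implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical bot name and info URL, used only for illustration.
BOT_NAME = "examplebot"
# Rule 3: identify yourself honestly in the User-Agent header.
USER_AGENT = f"{BOT_NAME}/1.0 (+https://example.com/bot-info)"

# Made-up robots.txt of a site that blocks this bot entirely.
SAMPLE_ROBOTS_TXT = """\
User-agent: examplebot
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /private/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str, bot: str = BOT_NAME) -> bool:
    """Rule 2: return True only if robots.txt allows this bot to fetch the URL."""
    return parser.can_fetch(bot, url)


rules = load_rules(SAMPLE_ROBOTS_TXT)
# The site opted out of examplebot, so a polite crawler skips it.
blocked = may_fetch(rules, "https://example.com/page")
# A different (hypothetical) bot falls under the "*" group instead.
others_ok = may_fetch(rules, "https://example.com/page", "otherbot")
# Rule 1: crawl politely by honoring the requested delay (in seconds), if any.
delay = rules.crawl_delay("otherbot")
```

The point of the sketch is that a blocked bot simply gets `False` back and moves on. Changing `BOT_NAME` to dodge the first rule group is exactly the behavior the original question proposes.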
-
Well, I think bot blocking is already an obvious problem, and it will only grow as private networks multiply.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may not be the best or the most welcome solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
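The "out of sync" check is usually done with a reverse DNS lookup on the visiting IP, followed by a forward lookup on the returned hostname, which is the procedure Google documents for verifying Googlebot. A rough Python sketch, assuming IPv4 only; the function names are hypothetical and the hostname suffix list is an assumption that should be checked against Google's current documentation:

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Googlebot; verify against Google's docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """Return the primary hostname that the IP's PTR record points to."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(host: str) -> List[str]:
    """Return the IPv4 addresses that the hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def is_genuine_googlebot(
    ip: str,
    user_agent: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """Check that a request claiming to be Googlebot comes from a Google host."""
    if "googlebot" not in user_agent.lower():
        return False  # not claiming to be Googlebot, nothing to verify
    try:
        host = reverse(ip)
        if not host.lower().endswith(GOOGLE_SUFFIXES):
            return False  # user agent says Google, reverse DNS says otherwise
        # Forward-confirm: the hostname must resolve back to the same IP.
        return ip in forward(host)
    except OSError:
        # socket.herror and socket.gaierror are both OSError subclasses.
        return False
```

A site owner could call `is_genuine_googlebot(request_ip, request_user_agent)` from a log-analysis script. A third-party crawler sending Google's user agent string from its own servers would fail the suffix check, which is how the spoofing would be caught.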