Why don't Moz OSE, Ahrefs, Majestic, and others change their user agent while crawling?
-
Some blackhat websites, PBNs, and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking, and so on.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize user agent strings, URLs, and IPs to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, they make up only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so their crawl priority would be very low anyway.
Actually, it can be seen as beneficial when blackhat sites block OSE and Ahrefs: those sites often get penalized by Google, and third-party crawlers have no way of knowing this, so the blocking effectively keeps penalized sites out of their indexes.
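As a rough illustration of the three rules above, here is a minimal Python sketch of how a well-behaved crawler might check a site's robots.txt under its own declared name before fetching anything. The bot name `ExampleLinkBot`, the robots.txt contents, and the helper names are all made up for the example; this is not how any particular vendor's crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that shuts out one link crawler
# (the made-up "ExampleLinkBot") while letting everyone else in.
ROBOTS_TXT = """\
User-agent: ExampleLinkBot
Disallow: /

User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this self-identified bot to fetch url."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(ROBOTS_TXT)
    for agent in ("ExampleLinkBot/1.0", "OtherBot/2.0"):
        allowed = may_fetch(rules, agent, "https://example.com/")
        delay = rules.crawl_delay(agent)  # None when no Crawl-delay applies
        print(f"{agent}: allowed={allowed}, crawl_delay={delay}")
```

A polite crawler that gets `False` here simply skips the site, which is why a blocked network never shows up in the index.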
-
Well, I think bot blocking is an obvious problem even now, and, as you can imagine, it will matter even more in the future as private networks spread.
MOZ (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor the desired solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about regularly using different user agents and IPs in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
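To make the "out of sync" point concrete: a site owner can compare the claimed user agent against the DNS records of the requesting IP. Below is a minimal Python sketch of that reverse-then-forward DNS check. The host suffixes and the function names are assumptions for illustration (check Google's own documentation for the current list), and the resolvers are passed in as parameters so the logic can be tried without network access.

```python
import socket
from typing import Callable, List

# Assumption: host names of genuine Googlebot IPs end in one of these suffixes.
GOOGLE_HOST_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip: str) -> str:
    """PTR lookup: IP address -> host name."""
    return socket.gethostbyaddr(ip)[0]


def forward_lookup(hostname: str) -> List[str]:
    """A lookup: host name -> list of IPv4 addresses."""
    return socket.gethostbyname_ex(hostname)[2]


def ip_matches_google(
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True if the IP reverse-resolves to a Google host that resolves back to it."""
    try:
        host = reverse(ip).rstrip(".").lower()
        if not host.endswith(GOOGLE_HOST_SUFFIXES):
            return False
        return ip in forward(host)
    except OSError:  # covers socket.herror / socket.gaierror lookup failures
        return False


def is_spoofed_googlebot(
    user_agent: str,
    ip: str,
    reverse: Callable[[str], str] = reverse_lookup,
    forward: Callable[[str], List[str]] = forward_lookup,
) -> bool:
    """True if the request claims to be Googlebot but its IP does not check out."""
    claims_googlebot = "googlebot" in user_agent.lower()
    return claims_googlebot and not ip_matches_google(ip, reverse, forward)
```

A crawler that only swaps its user agent string fails this check on the first request, because it cannot control the DNS records for its own IP range.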