Why don't Moz OSE, Ahrefs, Majestic, and others change their user agent while crawling?
-
Some blackhat websites, PBNs, and other "cheaters" use various methods to effectively block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking, and so on.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could use some kind of randomness in user agent strings, URLs, and IPs in order to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
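As a rough illustration of the last two points, here is a minimal sketch using Python's standard `urllib.robotparser`. The user agent token `ExampleLinkBot` and the robots.txt content are made up for the example; they are not the real tokens or rules of any crawler or site mentioned here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user agent token for an honest third-party link crawler.
CRAWLER_UA = "ExampleLinkBot"

# Hypothetical robots.txt of a site that blocks this crawler by name
# while leaving most of the site open to everyone else.
ROBOTS_TXT = """\
User-agent: ExampleLinkBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # A polite crawler asks under its own name and skips the URL if refused.
    print(may_fetch(ROBOTS_TXT, CRAWLER_UA, "http://example.com/page"))
```

A crawler that identifies itself honestly gets refused by a site like this one, which is exactly the situation the question describes. Asking under a different name would get through, but only by breaking the third rule.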
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs: those sites often get penalized by Google, but third-party crawlers have no way to know this, so the blocking effectively keeps them out of the indexes.
-
Well, I think bot blocking is an obvious problem even now, and it will become more important tomorrow with all the private networks, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor the wanted solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever), that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
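To show how that mismatch gets caught, here is a hedged sketch of the reverse-then-forward DNS check that Google documents for verifying Googlebot. The function names are my own, the hostname suffix list is a simplification, and `socket.gethostbyname_ex` only covers IPv4, so treat it as an outline rather than a complete verifier.

```python
import socket
from typing import Callable, List

# Simplified list of hostname suffixes used by genuine Googlebot hosts.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_dns(ip: str) -> str:
    """Look up the PTR hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def forward_dns(host: str) -> List[str]:
    """Look up the IPv4 addresses a hostname resolves to."""
    return socket.gethostbyname_ex(host)[2]


def verify_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_dns,
    forward: Callable[[str], List[str]] = forward_dns,
) -> bool:
    """Return True only if ip reverse-resolves to a Google host that resolves back to ip."""
    try:
        host = reverse(ip).rstrip(".").lower()
        return host.endswith(GOOGLE_SUFFIXES) and ip in forward(host)
    except OSError:
        # No PTR record or failed lookup: the claim cannot be confirmed.
        return False


def looks_spoofed(
    ip: str,
    user_agent: str,
    reverse: Callable[[str], str] = reverse_dns,
    forward: Callable[[str], List[str]] = forward_dns,
) -> bool:
    """Return True if the user agent claims Googlebot but the IP does not check out."""
    claims_googlebot = "googlebot" in user_agent.lower()
    return claims_googlebot and not verify_googlebot(ip, reverse, forward)
```

The resolver functions are passed in as parameters so the logic can be exercised without real DNS lookups. Any site owner can run a check along these lines against their access logs, which is why a third-party crawler sending Googlebot's user agent from its own IP range would stand out quickly.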