Why don't Moz OSE, Ahrefs, Majestic and so on change their user agent while crawling?
-
Some blackhat websites, PBNs and other "cheaters" use various methods to block third-party backlink checker bots (OSE, Ahrefs, Majestic...): robots.txt rules, IP blocking and such.
A simple solution for those bots would be to mimic Google, for example by using its user agent string.
Or, if that is not legally permitted (which I doubt), they could randomize their user agent strings, URLs and IPs to prevent blocking. This should not be a big deal IMHO. Am I missing something obvious?
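For readers unfamiliar with how user-agent based blocking works, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content and the example.com URL are made up for illustration, and `rogerbot`, `AhrefsBot` and `MJ12bot` are used here as the tokens commonly associated with Moz, Ahrefs and Majestic; treat them as assumptions rather than an authoritative list.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of a site that shuts out link-index crawlers
# but leaves every other crawler alone.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: *
Disallow:
"""


def allowed_agents(robots_txt, agents, url):
    """Map each user agent token to whether robots.txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["rogerbot", "AhrefsBot", "MJ12bot", "Googlebot"],
    "https://example.com/some-page",  # placeholder URL
)
for agent, allowed in result.items():
    print(agent, "allowed" if allowed else "blocked")
```

The rules are matched purely on the name the crawler announces, which is why the question of announcing a different name comes up at all. Note that robots.txt is only a request: it stops crawlers that choose to honor it.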
-
The ethics of the Internet dictate that you
- crawl politely,
- obey robots.txt and
- properly identify yourself
This isn't a new issue. Link networks and sites have blocked crawlers and manipulated Google for years. Fortunately, it's only a small fraction of the web. Also, it's unlikely that links from those networks have much value, so their crawl priority would be super low anyway.
Actually, it could be viewed as beneficial when blackhat sites block OSE and Ahrefs: those sites often get penalized by Google, but third-party crawlers have no way to know this, so the blocking effectively keeps them out of the indexes.
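The three rules listed above (crawl politely, obey robots.txt, identify yourself) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any of the crawlers discussed here is actually implemented: `examplebot`, its info URL, the default delay and the `plan_fetch` helper are all hypothetical names.

```python
import urllib.request
from urllib.robotparser import RobotFileParser

# Hypothetical bot name and info URL: an honest crawler says who it is
# and where site owners can read about it.
USER_AGENT = "examplebot/1.0 (+https://example.com/bot-info)"
DEFAULT_DELAY_SECONDS = 5.0  # assumed fallback when no Crawl-delay is given


def plan_fetch(robots_txt, url, user_agent=USER_AGENT):
    """Return (request, delay_seconds) if url may be crawled, else None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # Rule: obey robots.txt.
    if not parser.can_fetch(user_agent, url):
        return None

    # Rule: crawl politely by honoring Crawl-delay, or a default pause.
    delay = parser.crawl_delay(user_agent)
    if delay is None:
        delay = DEFAULT_DELAY_SECONDS

    # Rule: properly identify yourself in the User-Agent header.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    return request, float(delay)


# Hypothetical rules a site might publish for this bot.
SITE_RULES = """\
User-agent: examplebot
Crawl-delay: 10
Disallow: /private/
"""

blocked = plan_fetch(SITE_RULES, "https://example.org/private/report")
permitted = plan_fetch(SITE_RULES, "https://example.org/blog/post")
```

A caller would sleep for the returned delay between requests to the same host before passing the request to `urllib.request.urlopen`. The sketch deliberately does no network I/O itself.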
-
Well, I think bot blocking is an obvious problem even now, and it will become more important as private networks grow, as you can imagine.
Moz (and others) should find and implement the best possible solution. I see no problem with TAGFEE as long as you are transparent about the fact that your bots are undetectable.
I understand that what I'm proposing may be neither the best nor a welcome solution, but the problem must be addressed or OSE will soon have no value at all.
What do you propose?
-
I agree with George here -- we'd hear a huge outcry if we pretended to be Googlebot or a different bot. We'd also likely get blocked, as sometimes people only let in a certain few known bots/IPs to crawl their site. If we changed user agents and IPs regularly, it would not be cool or TAGFEE.
-
What about using different user agents and IPs regularly in order to avoid detection?
Is there any other acceptable solution?
-
The reputation and integrity of the major players would be at stake here. If they changed their user agent identification (to spoof Googlebot or Bing or whatever) that could be detected, and they would be castigated. The crawler IP address and its user agent ID would be out of sync...
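To make the "out of sync" point concrete: a site can check whether a visitor claiming to be Googlebot really comes from Google by doing a reverse DNS lookup on the IP, checking the hostname's domain, and then confirming with a forward lookup. The sketch below assumes the trusted hostname suffixes `.googlebot.com` and `.google.com`; check the search engine's current documentation before relying on them. The function names are hypothetical, and the lookups are injectable so the logic can be tried without network access.

```python
import socket

# Assumed hostname suffixes for genuine Googlebot hosts (verify against
# the search engine's own documentation before using this for real).
TRUSTED_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_dns(ip):
    """Hostname that the IP address resolves back to."""
    return socket.gethostbyaddr(ip)[0]


def forward_dns(hostname):
    """IPv4 addresses that the hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def claim_matches_origin(user_agent, ip,
                         reverse_lookup=reverse_dns,
                         forward_lookup=forward_dns):
    """Return False when a Googlebot user agent comes from a non-Google IP."""
    if "googlebot" not in user_agent.lower():
        return True  # no Googlebot claim, so nothing to verify here

    try:
        hostname = reverse_lookup(ip)
        if not hostname.lower().endswith(TRUSTED_SUFFIXES):
            return False  # user agent and IP are out of sync
        # The forward lookup must lead back to the same IP, otherwise
        # the reverse DNS record itself could have been faked.
        return ip in forward_lookup(hostname)
    except OSError:
        return False  # lookup failed: treat the claim as unverified
```

A crawler that borrowed Google's user agent string while crawling from its own servers would fail this check on any site that runs it, which is the detection risk described above.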