Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on a regular expression posted in an article, and it appears to eliminate most of them.
However, there are other bots I would like to filter, but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can add them to the filter?
I read an Analytics "how to" but it's over my head, and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx-related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \. is escaping the . at the end of stumbleupon inc.
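To see what the escape changes, here is a minimal Python sketch. It uses Python's `re` module rather than the Analytics filter engine, so treat it as an approximation. The name `stumbleupon incx` and the helper `matches` are made up for the illustration.

```python
import re

# Without the backslash, "." is a wildcard that matches any single character.
unescaped = re.compile(r"^(stumbleupon inc.)$")
# With the backslash, "\." matches only a literal period.
escaped = re.compile(r"^(stumbleupon inc\.)$")


def matches(pattern, name):
    """Return True if the pattern is found in the provider name."""
    return pattern.search(name) is not None


# "stumbleupon incx" is a made-up name used only to show the difference.
for name in ["stumbleupon inc.", "stumbleupon incx"]:
    print(name, "| unescaped:", matches(unescaped, name), "| escaped:", matches(escaped, name))
```

Both versions catch the real name. Only the unescaped one also catches the look-alike, which is why a name with no period in it (like rackspace cloud servers) needs no backslash at all.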
-
Does it need the \. before the )
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
Just added rackspace as another match, it should work if the name is exactly right.
Hope this helps,
Chris
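If you want to check a pattern before pasting it into the filter, a quick test outside Analytics can help. The sketch below runs the pattern from the post above (with the periods escaped) through Python's `re` module. The function name `is_filtered`, the case-insensitive flag, and the two sample names marked hypothetical are assumptions for illustration, and Python's engine may not behave identically to the Analytics one.

```python
import re

# Pattern from the post above, with the periods written as "\.".
BOT_ISP_PATTERN = re.compile(
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|"
    r"stumbleupon inc\.|rackspace cloud servers)$|gomez",
    re.IGNORECASE,  # assumption: the filter is not set to be case sensitive
)


def is_filtered(service_provider):
    """Return True if this service provider name would match the pattern."""
    return BOT_ISP_PATTERN.search(service_provider) is not None


samples = [
    "rackspace cloud servers",      # name as reported in this thread
    "rackspace cloud servers inc",  # hypothetical variant with extra text
    "gomez networks",               # hypothetical name containing "gomez"
]

for name in samples:
    print(name, "->", "filtered" if is_filtered(name) else "kept")
```

The names inside ^( ... )$ only match when the whole service provider name is exactly one of them, so the variant with extra text is kept. The gomez part sits outside the anchors, so it matches anywhere in the name.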
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in Audience > Technology > Network and the column shows "Service Provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page and bounced 100% of the time.
What is the regular expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
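As a rough sketch of combining signals rather than relying on time on page alone, the Python below flags a service provider only when several signs line up. The field names, the `looks_like_bot` helper, the 100-visit threshold, and the second sample row are all assumptions for illustration. Only the rackspace numbers come from this thread, and operating system and location are not modeled here.

```python
from dataclasses import dataclass


@dataclass
class ProviderRow:
    service_provider: str
    visits: int
    pages_per_visit: float
    avg_time_on_page_seconds: float
    bounce_rate: float  # 0.0 to 1.0


def looks_like_bot(row, min_visits=100):
    """Flag a provider only when every signal points the same way.

    min_visits is an assumed threshold: a handful of zero-second visits
    can be real people, so low-volume rows are never flagged.
    """
    return (
        row.visits >= min_visits
        and row.pages_per_visit <= 1.0
        and row.avg_time_on_page_seconds == 0
        and row.bounce_rate >= 1.0
    )


rows = [
    # Numbers reported in this thread.
    ProviderRow("rackspace cloud servers", 848, 1.0, 0, 1.0),
    # Hypothetical low-volume provider with the same zero time on page.
    ProviderRow("example isp", 3, 1.0, 0, 1.0),
]

for row in rows:
    print(row.service_provider, "->", "suspect" if looks_like_bot(row) else "leave alone")
```

A flagged provider is a candidate to review by hand before adding it to the filter, not proof that it is a bot.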
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through and that you want to filter out? If they can be filtered by ISP organization, like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc\.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by putting a \ before them, or else your RegEx may not match what you expect.
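If escaping by hand feels error prone, the list can be assembled with a small script. This is a sketch in Python, not something Analytics provides. The helpers `escape_name` and `build_filter` are hypothetical, and `ispnamefromyourbots` is the same placeholder used above.

```python
import re

# Characters that get a "\" in front, following the advice above.
SPECIAL_CHARS = set(".\\/^$*+?()[]{}|")


def escape_name(name):
    """Put a backslash before every special character in an ISP name."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in name)


def build_filter(exact_names, loose_terms):
    """Build ^(name1|name2|...)$|term from plain, unescaped names."""
    exact = "|".join(escape_name(n) for n in exact_names)
    loose = "|".join(escape_name(t) for t in loose_terms)
    return f"^({exact})$|{loose}"


pattern = build_filter(
    [
        "microsoft corp",
        "inktomi corporation",
        "yahoo! inc.",
        "google inc.",
        "stumbleupon inc.",
        "ispnamefromyourbots",  # placeholder: replace with the real ISP name
    ],
    ["gomez"],
)

re.compile(pattern)  # raises re.error if the pattern is malformed
print(pattern)
```

Type each name exactly as it appears in the Service Provider column and let the script add the backslashes, then paste the printed pattern into the filter.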
-
Sure. Here's the post for filtering the bots.
Here's the RegEx posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris