Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on regular expressions posted in an article, and they eliminate what appears to be most of the bots.
However, there are other bots I would like to filter but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can apply them to the filter?
I read an Analytics "how to" but it's over my head and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \ is escaping the . at the end of stumbleupon inc.
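To make the escaping point concrete, here is a minimal sketch in Python. It assumes Python's `re` module behaves like the Analytics filter for this simple case, and the lookalike name is made up for illustration. It shows that an unescaped `.` matches any single character, while `\.` matches only a literal period.

```python
import re

# Unescaped: the final "." is a wildcard that matches any single character.
unescaped = re.compile(r"^stumbleupon inc.$")
# Escaped: "\." matches only a literal period.
escaped = re.compile(r"^stumbleupon inc\.$")


def matches(pattern: re.Pattern, name: str) -> bool:
    """Return True if the pattern matches somewhere in the name."""
    return pattern.search(name) is not None


literal_name = "stumbleupon inc."
lookalike = "stumbleupon inc!"  # hypothetical name with no period at the end

print(matches(escaped, literal_name))    # True
print(matches(escaped, lookalike))       # False
print(matches(unescaped, literal_name))  # True
print(matches(unescaped, lookalike))     # True: the wildcard accepts "!"
```

So "rackspace cloud servers" needs no `\.` because the name contains no period.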
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
Just added rackspace as another match, it should work if the name is exactly right.
Hope this helps,
Chris
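A quick way to sanity-check a pattern like this before saving the filter is to run it against a few provider names. The sketch below uses Python's `re` module as a stand-in for the Analytics filter, which is an assumption, as is the case-insensitive matching. The sample names other than "rackspace cloud servers" are hypothetical, and the dots are written escaped as `\.`.

```python
import re

# The filter pattern from this thread, with the trailing dots escaped.
BOT_ISP_PATTERN = re.compile(
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\."
    r"|stumbleupon inc\.|rackspace cloud servers)$|gomez",
    re.IGNORECASE,  # assumption: the filter ignores case
)


def is_bot_isp(service_provider: str) -> bool:
    """Return True if the service provider name matches the filter pattern."""
    return BOT_ISP_PATTERN.search(service_provider) is not None


# Hypothetical service provider values used only to exercise the pattern.
samples = [
    "rackspace cloud servers",      # exact name inside ^(...)$
    "rackspace cloud servers llc",  # extra text after the name
    "gomez networks",               # contains "gomez"
    "comcast cable",                # not in the list
]

for name in samples:
    print(f"{name!r}: {is_bot_isp(name)}")
```

Names inside `^( ... )$` must match the service provider value exactly from start to end, which is why the variant with extra text is not caught. The `gomez` alternative sits outside the anchors, so it matches anywhere in the name.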
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page and bounced 100% of the time.
What is the reg expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through and that you want to filter out? If they can be filtered by ISP organization, like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc\.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by putting a \ before them. Otherwise the character is read as a RegEx operator (an unescaped . matches any character) and your filter may not behave the way you intend.
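If typing the backslashes by hand feels error-prone, the list can be assembled programmatically. This is only a sketch: `escape_name` and `build_filter_pattern` are hypothetical helpers, not part of Analytics or any library, and the set of characters treated as special is an assumption based on common RegEx syntax.

```python
# Characters assumed to have special meaning in a RegEx.
SPECIAL_CHARS = set(r"\.^$*+?()[]{}|")


def escape_name(name: str) -> str:
    """Put a backslash before every special character in an ISP name."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in name)


def build_filter_pattern(exact_names, contains=()):
    """Build ^(name1|name2|...)$ plus optional unanchored 'contains' terms."""
    alternatives = "|".join(escape_name(name) for name in exact_names)
    pattern = f"^({alternatives})$"
    for fragment in contains:
        pattern += "|" + escape_name(fragment)
    return pattern


pattern = build_filter_pattern(
    ["microsoft corp", "yahoo! inc.", "rackspace cloud servers"],
    contains=["gomez"],
)
print(pattern)
# ^(microsoft corp|yahoo! inc\.|rackspace cloud servers)$|gomez
```

The names go in exactly as they appear in the service provider column. The helper adds the backslashes, and the printed string is what would be pasted into the filter field.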
-
Sure. Here's the post for filtering the bots.
Here's the RegEx posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris