Regular Expressions for Filtering Bot Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I used a regular expression posted in an article, and it appears to eliminate most of the bots.
However, there are other bots I would like to filter, and I'm having a hard time working out the regular expressions for them.
How do I determine the regular expression for each additional bot so I can add it to the filter?
I read an Analytics "how to" but it's over my head, and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the Rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word servers in the name. The \ is escaping the . at the end of stumbleupon inc.
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
I just added rackspace as another match; it should work if the name is exactly right.
Hope this helps,
Chris
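A quick way to check a pattern like this before pasting it into a filter is to run it against a few service provider names. The sketch below is only an illustration: it uses Python's `re` module rather than Analytics itself, writes the literal dots escaped, and assumes the filter matches anywhere in the field and ignores case. The sample names other than "rackspace cloud servers" are made up for the test.

```python
import re

# Pattern from the reply, with the literal dots escaped.
# Assumption: the filter is case-insensitive and matches anywhere in the
# field, so re.IGNORECASE and re.search are used to imitate that here.
BOT_ISP_PATTERN = re.compile(
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|"
    r"stumbleupon inc\.|rackspace cloud servers)$|gomez",
    re.IGNORECASE,
)


def is_filtered(service_provider: str) -> bool:
    """Return True if the service provider name matches the bot pattern."""
    return BOT_ISP_PATTERN.search(service_provider) is not None


if __name__ == "__main__":
    # Hypothetical sample values; replace with names copied from the report.
    samples = [
        "rackspace cloud servers",
        "rackspace cloud servers inc",
        "gomez networks",
        "comcast cable communications",
    ]
    for name in samples:
        print(f"{name!r}: {is_filtered(name)}")
```

Names inside the parentheses must match the whole field because of the ^ and $ anchors, while gomez matches anywhere in the name.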
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example,
Since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page and bounced 100% of the time.
What is the regular expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
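As a rough illustration of combining several factors rather than relying on time on page alone, the sketch below flags a service provider only when every signal agrees and the visit count is high enough to rule out a handful of real visitors. The field names, the `min_sessions` threshold of 100, and the 99% bounce cutoff are assumptions made for this example, not Analytics settings.

```python
from dataclasses import dataclass


@dataclass
class ProviderRow:
    """One row copied by hand from the service provider report (hypothetical shape)."""
    service_provider: str
    sessions: int
    pages_per_session: float
    avg_time_on_page_s: float
    bounce_rate: float  # fraction between 0.0 and 1.0


def looks_like_bot(row: ProviderRow, min_sessions: int = 100) -> bool:
    """Flag a provider only when all signals agree and volume is high.

    Zero time on page alone is not enough, since real single-page visits
    can also record 00:00:00.
    """
    signals = [
        row.pages_per_session <= 1.0,
        row.avg_time_on_page_s == 0,
        row.bounce_rate >= 0.99,
    ]
    return row.sessions >= min_sessions and all(signals)


if __name__ == "__main__":
    # Figures taken from the question in this thread.
    rackspace = ProviderRow("rackspace cloud servers", 848, 1.0, 0.0, 1.0)
    print(rackspace.service_provider, looks_like_bot(rackspace))
```

Anything flagged this way is still only a candidate; it is worth checking operating system and location before adding the name to the filter.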
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through that you want to sort out? If they can be filtered by ISP organization, like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by putting a \ before them, or else your RegEx may not match what you intend.
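The list-building and escaping advice can be sketched in a few lines of Python. This is a hypothetical helper, not part of Analytics: `escape_literal` and `build_isp_filter` are names invented here, and the set of characters treated as special is the usual regex metacharacter set, which is an assumption about the filter's regex flavor.

```python
import re

# Characters assumed to have special meaning in the filter's regex flavor.
SPECIAL_CHARS = ".^$*+?()[]{}|\\"


def escape_literal(text: str) -> str:
    """Put a backslash before each special character so it matches literally."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def build_isp_filter(exact_names, substrings=()):
    """Build ^(name1|name2|...)$|substring from plain ISP names.

    exact_names must match the whole field; substrings may match anywhere.
    """
    pattern = "^(" + "|".join(escape_literal(n) for n in exact_names) + ")$"
    for fragment in substrings:
        pattern += "|" + escape_literal(fragment)
    return pattern


if __name__ == "__main__":
    pattern = build_isp_filter(
        ["google inc.", "stumbleupon inc.", "rackspace cloud servers"],
        substrings=["gomez"],
    )
    print(pattern)
    # Sanity check that the generated pattern compiles and matches a listed name.
    print(bool(re.search(pattern, "rackspace cloud servers")))
```

Building the pattern from a plain list keeps the escaping in one place, so adding another ISP is a matter of copying its name exactly as the report shows it.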
-
Sure. Here's the post for filtering the bots.
Here's the regex posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris