Regular Expressions for Filtering BOT Traffic?
-
I've set up a filter to remove bot traffic from Analytics. I relied on a regular expression posted in an article, and it eliminates what appears to be most of the bots.
However, there are other bots I would like to filter but I'm having a hard time determining the regular expressions for them.
How do I determine what the regular expression is for additional bots so I can apply them to the filter?
I read an Analytics "how to" but it's over my head, and I'm hoping for some "dumbed down" guidance.
-
No problem, feel free to reach out if you have any other RegEx related questions.
Regards,
Chris
-
I will definitely do that for Rackspace bots, Chris.
Thank you for taking the time to walk me through this and tweak my filter.
I'll give the site you posted a visit.
-
If you copy and paste my RegEx, it will filter out the rackspace bots. If you want to learn more about Regular Expressions, here is a site that explains them very well, though it may not be quite kindergarten speak.
-
Crap.
Well, I guess the vernacular is what I need to know.
Knowing what to put where is the trick, isn't it? Is there a dummies guide somewhere that spells this out in kindergarten speak?
I could really see myself botching this filtering business.
-
Not unless there's a . after the word "servers" in the name. The \ is escaping the . at the end of "stumbleupon inc." so that it is matched as a literal period.
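To see what the escape does, here is a minimal sketch in Python. Python's `re` module is only a stand-in for the Analytics filter engine (an assumption; the two may differ in details), and the helper name `matches` is made up for this example.

```python
import re

# Unescaped: "." matches ANY single character, not just a period.
loose = re.compile(r"^stumbleupon inc.$")

# Escaped: "\." matches only a literal period.
strict = re.compile(r"^stumbleupon inc\.$")


def matches(pattern, name):
    """Return True if the pattern matches somewhere in the name."""
    return pattern.search(name) is not None


print(matches(loose, "stumbleupon inc."))   # True
print(matches(loose, "stumbleupon incx"))   # True (the dot matched "x")
print(matches(strict, "stumbleupon inc."))  # True
print(matches(strict, "stumbleupon incx"))  # False
```

A name with no period at the end, such as "rackspace cloud servers", needs no escape at all.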
-
Does it need the \. before the )?
-
Ok, try this:
^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.|rackspace cloud servers)$|gomez
Just added rackspace as another match, it should work if the name is exactly right.
Hope this helps,
Chris
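If you want to check a pattern like this before saving the filter, a rough sketch in Python follows. It assumes Python's `re` behaves closely enough to the Analytics engine, writes the periods after "inc" with a backslash escape, and uses case-insensitive matching as an assumption. The function name `is_filtered` and every sample name other than "rackspace cloud servers" are hypothetical.

```python
import re

# The filter pattern from this thread, with the periods after "inc" escaped.
# Everything inside ^( ... )$ must match the whole name; "gomez" may appear anywhere.
BOT_ISP_PATTERN = re.compile(
    r"^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|"
    r"stumbleupon inc\.|rackspace cloud servers)$|gomez",
    re.IGNORECASE,
)


def is_filtered(service_provider):
    """Return True if the service provider name would be caught by the pattern."""
    return BOT_ISP_PATTERN.search(service_provider) is not None


samples = [
    "rackspace cloud servers",      # exact name from the report
    "rackspace cloud servers inc",  # hypothetical: extra text breaks the exact match
    "gomez networks",               # hypothetical: contains "gomez"
    "comcast cable",                # hypothetical ordinary ISP
]

for name in samples:
    print(name, "->", is_filtered(name))
```

The second sample shows why the name has to be exactly right: the `^` and `$` anchors require the whole service provider string to match one of the listed names.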
-
Agreed! That's why I suggest using it in combination with the variables you mentioned above.
-
rackspace cloud servers
Maybe my problem is I'm not looking in the right place.
I'm in audience>technology>network and the column shows "service provider."
-
How is it titled in the ISP report exactly?
-
For example, since I implemented the filter four days ago, rackspace cloud servers have visited my site 848 times, visited 1 page each time, spent 0 seconds on the page, and bounced 100% of the time.
What is the reg expression for rackspace?
-
Time on page can be a tricky one because sometimes actual visits can record 00:00:00 due to the way it is measured. I'd recommend using other factors like the ones I mentioned above.
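Because a zero time on page can also come from real visits, one way to picture "combining factors" is to flag a service provider only when several signals line up. The sketch below is an illustration, not an Analytics feature: the `ProviderStats` fields, the `looks_like_bot` helper, and the 100-session threshold are all assumptions, and only the rackspace row uses numbers from this thread.

```python
from dataclasses import dataclass


@dataclass
class ProviderStats:
    service_provider: str
    sessions: int
    pages_per_session: float
    avg_time_on_page_s: float
    bounce_rate: float  # 0.0 to 1.0


def looks_like_bot(row, min_sessions=100):
    """Flag a provider only when every signal agrees (threshold is hypothetical)."""
    return (
        row.sessions >= min_sessions
        and row.pages_per_session <= 1.0
        and row.avg_time_on_page_s == 0
        and row.bounce_rate >= 1.0
    )


rows = [
    # Numbers reported in this thread.
    ProviderStats("rackspace cloud servers", 848, 1.0, 0.0, 1.0),
    # Hypothetical rows for contrast.
    ProviderStats("example isp a", 3, 1.0, 0.0, 1.0),     # too few sessions to judge
    ProviderStats("example isp b", 500, 3.2, 95.0, 0.4),  # looks like real visitors
]

suspects = [r.service_provider for r in rows if looks_like_bot(r)]
print(suspects)  # ['rackspace cloud servers']
```

Providers flagged this way are only candidates; it is still worth checking them by hand before adding them to the filter.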
-
"...a combination of operating system, location, and some other factors can do the trick."
Yep, combined with those, look for "Avg. Time on Page = 00:00:00"
-
Ok, can you provide some information on the bots that are getting through and that you want to sort out? If they can be filtered by ISP organization, like the ones in your current RegEx, you can simply add them to the list: (microsoft corp| ... ... |stumbleupon inc.|ispnamefromyourbots|ispname2|etc.)$|gomez
Otherwise, you might need to get creative and find another way to isolate them (a combination of operating system, location, and some other factors can do the trick). When adding to the list, make sure to escape special characters like . or / by using a \ before them, or else your RegEx may not match the way you expect.
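If escaping by hand feels error-prone, the pattern can be assembled from a plain list of names. This is a minimal sketch in Python; `escape_name` and `build_isp_filter` are hypothetical helpers written for this example, and the set of characters treated as special is an assumption based on common regex syntax.

```python
import re

# Characters given a backslash before being added to the pattern (assumed set).
SPECIAL = set(r".^$*+?()[]{}|\/")


def escape_name(name):
    """Put a backslash in front of each special character in an ISP name."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in name)


def build_isp_filter(exact_names, partial_names=()):
    """Build ^(name1|name2|...)$ plus optional unanchored partial matches."""
    exact = "|".join(escape_name(n) for n in exact_names)
    pattern = "^(" + exact + ")$"
    for partial in partial_names:
        pattern += "|" + escape_name(partial)
    return pattern


pattern = build_isp_filter(
    ["microsoft corp", "yahoo! inc.", "rackspace cloud servers"],
    partial_names=["gomez"],
)
print(pattern)
# ^(microsoft corp|yahoo! inc\.|rackspace cloud servers)$|gomez
```

The printed string is what you would paste into the filter field. New ISP names go into the first list exactly as they appear in the report.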
-
Sure. Here's the post for filtering the bots.
Here's the RegEx posted: ^(microsoft corp|inktomi corporation|yahoo! inc\.|google inc\.|stumbleupon inc\.)$|gomez
-
If you give me an idea of how you are isolating the bots I might be able to help come up with a RegEx for you. What is the RegEx you have in place to sort out the other bots?
Regards,
Chris