Help Blocking Crawlers. Huge Spike in "Direct Visits" with 96% Bounce Rate & Low Pages/Visit.
-
Hello,
I'm hoping one of you search geniuses can help me.
We have a successful client who started seeing a HUGE spike in direct visits as reported by Google Analytics. This traffic now represents approximately 70% of all website traffic. These "direct visits" have a bounce rate of 96%+ and only 1-2 pages/visit. This is skewing our analytics in a big way and rendering them pretty much useless. I suspect this is some sort of crawler activity but we have no access to the server log files to verify this or identify the culprit. The client's site is on a GoDaddy Managed WordPress hosting account.
The way I see it, there are a couple of possibilities.
1.) Our client's competitors are scraping the site on a regular basis to stay on top of site modifications, keyword emphasis, etc. It seems like whenever we make meaningful changes to the site, one of their competitors does a knock-off a few days later. Hmmm.
2.) Our client's competitors have this crawler hitting the site thousands of times a day to raise bounce rates and decrease the average time on site, which could likely have a negative impact on SEO. Correct me if I'm wrong, but I don't believe Google is going to reward sites with 90% bounce rates, 1-2 pages/visit and an 18-second average time on site.
The bottom line is that we need to identify these bogus "direct visits" and find a way to block them. I've seen several WordPress plugins that claim to help with this but I certainly don't want to block valid crawlers, especially Google, from accessing the site.
If someone out there could please weigh in on this and help us resolve the issue, I'd really appreciate it. Heck, I'll even name my third-born after you.
Thanks for your help.
Eric
-
Hi SirMax,
Thanks for your input. I appreciate it. We'll add Wordfence to our WordPress toolbox and see if that addresses the issue.
In response to previous posts, thanks to everyone for your input. We were able to apply some filters to remove the bogus bot traffic from the analytics and normalize the data. However, this did not actually resolve the issue and in my eyes is more of a Band-Aid fix. The evil crawlers are still there; we just can't see them.
Thanks again for all of your input.
Eric
-
Hostname filtering does not work anymore. Unfortunately, most of the spammers have adapted and are now using your own website's domain as the hostname.
For WordPress, I use the Wordfence plugin (the paid version; I'm not affiliated with them in any way beyond paying for their services). In the advanced blocking settings you can set limits on how fast and how many pages crawlers can request. You can also block by country or IP range. It can also show you live traffic with a lot of detail (a lot more than Google Analytics, more like a server log). It might not be a complete remedy, but it can help.
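To make the rate-limiting idea concrete, here is a rough Python sketch of a per-IP sliding-window limit. This is not how Wordfence is implemented; the class name, the limits and the example IP addresses are all made up for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per window_seconds for each client IP.

    Illustrative only: the name and parameters are hypothetical and do not
    correspond to any Wordfence setting.
    """

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps each IP to the timestamps of its recently allowed requests.
        self._hits = defaultdict(deque)

    def allow(self, ip, now):
        """Return True if a request from ip at time now (seconds) is allowed."""
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; the rejected request is not recorded
        hits.append(now)
        return True
```

With `SlidingWindowLimiter(max_requests=3, window_seconds=60)`, a fourth request from the same IP within a minute is refused, while other IPs are unaffected. A real setup would also need an allowlist for verified search engine crawlers so they are never throttled.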
-
I wish I had an answer for how to stop the bots from hitting your site at all - I don't think a good one exists, as any solutions that wouldn't also block real human traffic to your site are going to be easy for spam bots to get around. I think your best bet is just to do everything you can to keep your data as clean as possible.
-
Hi Ruth,
Thanks a bunch for taking the time to respond to my post. Great advice. This is reassuring on a number of levels; however, it doesn't address the underlying issue of how to stop these spam bots in the first place.
We've already started the process of filtering out some of this bogus data. We'll also be integrating some WordPress plugins to see if that helps. That said, if the spam bots are hitting Analytics directly, as opposed to the actual website, WP plugins won't do anything.
Anyway, I appreciate your input and advice. Thanks so much.
Eric
-
Hi Eric,
A few things to reassure you off the bat:
- For what it's worth, there is a huge, HUGE amount of crawler spam happening on the web today. Every site I work on is being hit hard with false referrals and direct visits. I know Google Analytics is working on a solution to better filter these visits out. So I wouldn't be too concerned that it is something a competitor is doing to your site specifically; it's more likely that your site has been caught up in the general wave of spam crawlers.
- It's important to note that when we talk about Google looking at bounce rate and dwell time as part of ranking your site, those numbers are specifically from clicks through from search - that's data that Google can get without using your private web analytics data as a ranking factor, which they've said repeatedly that they don't and won't do. So a bunch of direct visits with high bounce rates will NOT affect your rankings.
So, it's not dangerous, just annoying. On to how to get that data out of your reports:
- Make sure you're not filtering out spam referrers at a View level - this can cause those visits to incorrectly appear as direct traffic.
- You could set up an Advanced Segment in Google Analytics to filter out direct visits with visit times of, say, under 5 seconds. Some real traffic may get caught in that, but it will get the noise levels down.
- The best way to filter out spam bot traffic, in my opinion, is to set up hostname filtering. Here's a post on Megalytic on how to do that: https://megalytic.com/blog/how-to-filter-out-fake-referrals-and-other-google-analytics-spam. Make sure you've also got an "Unfiltered Data" View so you'll still have historic raw data if you need it.
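If it helps to see the logic spelled out, the two filtering rules above (a hostname include filter and a cutoff for very short direct visits) could be sketched like this in Python, applied to exported session rows. The field names (`hostname`, `source`, `duration_seconds`) and the `example.com` domain are assumptions for illustration, not real Google Analytics export names.

```python
import re

# Hypothetical: the only hostnames that legitimately serve the tracking code.
VALID_HOSTNAME = re.compile(r"(www\.)?example\.com")


def keep_session(session, min_direct_seconds=5):
    """Return True if an exported session row looks like real traffic.

    session is a dict with the assumed keys 'hostname', 'source'
    and 'duration_seconds'.
    """
    # Rule 1: the hit must report a hostname we actually serve.
    if not VALID_HOSTNAME.fullmatch(session["hostname"]):
        return False
    # Rule 2: drop direct visits shorter than the cutoff.
    is_direct = session["source"] == "(direct)"
    if is_direct and session["duration_seconds"] < min_direct_seconds:
        return False
    return True
```

As noted, the duration rule will also drop some real visitors, so it is best treated as a way to cut noise in a segment rather than a permanent filter on your only View.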
Hope that helps! Good luck.
-
Check the web server log files, or log visits yourself (IP address, user agent, __utma, __utmz, possibly a browser fingerprint, etc.).
By analyzing those you can easily find out whether the traffic comes from a scraping bot or from humans.
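As a rough starting point, and assuming you can get logs in the common "combined" access log format, a small Python sketch like the one below counts requests per IP and user agent so the heaviest clients stand out. The sample lines, IP addresses and the `ExampleBot` user agent are invented for illustration.

```python
import re
from collections import Counter

# Pattern for the "combined" access log format (an assumption about your logs).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# Made-up sample data using documentation-only IP ranges.
SAMPLE_LINES = [
    '203.0.113.7 - - [10/Mar/2015:10:00:01 +0000] "GET / HTTP/1.1" 200 5120 "-" "ExampleBot/1.0"',
    '203.0.113.7 - - [10/Mar/2015:10:00:02 +0000] "GET /about/ HTTP/1.1" 200 4096 "-" "ExampleBot/1.0"',
    '203.0.113.7 - - [10/Mar/2015:10:00:03 +0000] "GET /contact/ HTTP/1.1" 200 2048 "-" "ExampleBot/1.0"',
    '198.51.100.23 - - [10/Mar/2015:10:05:00 +0000] "GET / HTTP/1.1" 200 5120 "https://www.google.com/" "Mozilla/5.0"',
]


def count_clients(lines):
    """Count requests per (ip, user agent); lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:
            counts[(match["ip"], match["agent"])] += 1
    return counts


def heavy_hitters(lines, threshold):
    """Return ((ip, agent), hits) pairs with at least threshold requests, busiest first."""
    return [
        (client, hits)
        for client, hits in count_clients(lines).most_common()
        if hits >= threshold
    ]


if __name__ == "__main__":
    for (ip, agent), hits in heavy_hitters(SAMPLE_LINES, threshold=3):
        print(ip, agent, hits)
```

A client that requests many pages within seconds, always with an empty referrer and the same user agent, is a strong hint of a bot. The threshold is arbitrary and would need tuning against your real traffic.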