Stripping Out Referral Spam From Past Reports
-
Hi,
I'm looking to confirm the best approach for retroactively stripping away referral spam (free buttons, SEMalt, etc.). To be clear, I already have filters in place to exclude them from current stats, so moving forward I'm fine. However, I'd love to go back and check untainted stats.
I've set up segments using a regex to strip the root words away and it seems to be working. The regex strips out things like: social-buttons|seoanalyses|copyrightclaims|classifiedads|jobsense|free-share-buttons|e-buyeasy|acrobats.hol|cheap-online|amezon|search-help|qut-smoking and so forth.
I've been going through my referral data, noticing obvious spam, and adding their domains to my segment. Is this the optimal way for me to get a clear, untainted view of my past stats?
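A regex like the one described above can be checked outside of Analytics before it goes into a segment. The following is only a rough sketch: the root words are copied from the post, while the function name, the sample referral sources, and the choice of case-insensitive matching are assumptions made for illustration.

```python
import re

# Root words copied from the post. The dot in "acrobats.hol" is escaped here
# so that it only matches a literal dot (an assumption about the intent).
SPAM_ROOTS = [
    "social-buttons", "seoanalyses", "copyrightclaims", "classifiedads",
    "jobsense", "free-share-buttons", "e-buyeasy", r"acrobats\.hol",
    "cheap-online", "amezon", "search-help", "qut-smoking",
]

# Join the roots with "|" exactly as in the segment regex.
# IGNORECASE is an assumption, not something stated in the post.
SPAM_PATTERN = re.compile("|".join(SPAM_ROOTS), re.IGNORECASE)


def is_spam_referrer(source: str) -> bool:
    """Return True if the referral source contains any spam root word."""
    return SPAM_PATTERN.search(source) is not None


# Hypothetical referral sources, for illustration only.
samples = ["free-share-buttons.com", "cheap-online.example", "example.org"]
clean = [s for s in samples if not is_spam_referrer(s)]
```

Running the list of historical referral sources through a check like this makes it easier to see which spam domains the segment still misses.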
-
Sweet, glad to hear our filters will suffice. Thanks for the input, Daniel.
-
Hey, no worries and you're right that your filters should block them as well. Using .htaccess would be just an additional defense mechanism but may not be necessary.
-
Hi Daniel,
Thanks again for the response. What would be the difference in Analytics data between my filters and going straight to .htaccess? If the data is the same, is there an additional benefit to .htaccess?
For regular users, I'd suspect less bandwidth since they can't load my domain, but I don't think these bots actually load the page or visit.
-
I would use your .htaccess file to block them with the following code (this would for example block referrals from semalt.com and semalt.com subdomains):
RewriteEngine On
Options +FollowSymlinks
# Return 403 Forbidden when the referrer is semalt.com or any of its subdomains
RewriteCond %{HTTP_REFERER} ^https?://([^.]+\.)*semalt\.com [NC]
RewriteRule .* - [F]
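The referrer pattern in the rule above can be sanity-checked offline before it is deployed. This is a minimal Python sketch, assuming the dots in the pattern are meant as literal dots (so they are escaped here) and that the [NC] flag corresponds to case-insensitive matching. The function name and the sample URLs are hypothetical.

```python
import re

# Same pattern as the RewriteCond, with literal dots escaped.
# re.IGNORECASE stands in for the [NC] flag.
REFERER_PATTERN = re.compile(r"^https?://([^.]+\.)*semalt\.com", re.IGNORECASE)


def would_block(referer: str) -> bool:
    """Return True if the referrer would match the RewriteCond pattern."""
    return REFERER_PATTERN.match(referer) is not None


# Hypothetical referrer values, for illustration only.
checks = {
    "http://semalt.com/": would_block("http://semalt.com/"),
    "https://crawler.semalt.com/page": would_block("https://crawler.semalt.com/page"),
    "https://www.example.com/": would_block("https://www.example.com/"),
    "http://notsemalt.com/": would_block("http://notsemalt.com/"),
}
```

The pattern has no end anchor, so it also matches hosts that merely begin with semalt.com; that is worth keeping in mind when adapting it to other domains.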
You can also use .htaccess to block IP addresses associated with the spammy sources.
edit: just saw your edit but hope this helps nevertheless!
-
Hi Daniel,
Thanks for the additional tips. I do have the bot filtering feature enabled as another point of protection. I checked my referral exclusion list and apparently set this up about a year ago for the initial wave of referral bots I noticed. I didn't know it added them to direct.
The majority of my spam referral hosts have been added to regular filters. I think with the combination of my retroactive approach and new filters, I should have reliable data going forward.
-
Hi there,
You’re on the right track and the best way to retroactively remove spammy sources is through report filters and advanced segments.
A couple other notes:
- A good way to spot spammy referrers is to sort by bounce rate and eliminate any with 100% bounce and over 10 sessions.
- Avoid using the “referral exclusion list” since this will just count spam traffic as direct traffic instead.
- You should also enable the GA ‘bot filtering’ feature under ‘Reporting view settings’.
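The bounce-rate heuristic from the first note can be applied to exported referral data. The sketch below assumes the report has been exported into rows with a source, a session count, and a bounce rate in percent. The row type, function name, and sample data are all hypothetical, and anything it flags is only a candidate that still needs a manual look.

```python
from dataclasses import dataclass


@dataclass
class ReferralRow:
    source: str
    sessions: int
    bounce_rate: float  # percentage, 0 to 100


def suspected_spam(rows, min_sessions=10):
    """Flag sources with a 100% bounce rate and more than min_sessions sessions."""
    flagged = [
        r for r in rows
        if r.bounce_rate >= 100.0 and r.sessions > min_sessions
    ]
    # Largest offenders first, so they can be reviewed in order.
    return sorted(flagged, key=lambda r: r.sessions, reverse=True)


# Hypothetical export rows, for illustration only.
rows = [
    ReferralRow("spam-example.test", 42, 100.0),
    ReferralRow("partner-example.test", 120, 38.5),
    ReferralRow("tiny-example.test", 3, 100.0),
]
flagged_sources = [r.source for r in suspected_spam(rows)]
```

The session threshold keeps low-volume sources, which can legitimately show a 100% bounce rate, out of the candidate list.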