PDF web traffic hitting our site
-
Hi there,
Over the last few months our traffic has spiked due to irrelevant pdf documents sending us crap traffic, our bounce rate is sky high as well as other metrics. I don't want to just filter out this traffic in GA rather try and stop our site from being attacked.
Any advice on a way forward would be great.
Thanks
-
Based on this I don't think you have anything to worry about. It doesn't appear to be an attack, as you described in your original post. An actual attack on your website would have much higher volume. The worst this could possibly be is spam, which is mainly just annoying.
Easy solution: you don't want to filter out this traffic from GA because it may be useful at some point. So just create another view in GA, and name it "unfiltered". This view will have no filters and you can see all traffic in its raw glory. In your main view, name it something like "master" or "the one view to view them all" or whatever you want and set filters to remove that traffic from view.
Personally it looks more to me like these are old pdfs that other websites are linking to, which is what your hosting provider has also said. Your best move here is actually to setup redirects to relevant pages to recapture some of those links that are probably ending in 404s and get some link equity to important pages.
-
HI Alick, seems to be coming from an external source, I've included a screen grab for you too.
I've also discussed this with our hosting provider who gave the following response:
Thanks for the info from Webmaster Tools. That screenshot that shows the HTTP response is just showing that a request to http://www.icmp.co.uk/lulu-the-lioness-a-heroines-story.pdf throws a 301 redirect over to https://www.icmp.ac.uk/lulu-the-lioness-a-heroines-story.pdf — this runs because of the standard HTTPS/primary domain redirect code in settings.php and unfortunately doesn’t tell us much here.
I pulled down the database again and ran a search for a few of these filenames, and those came up empty. Looks like these don’t touch Drupal at all. When we saw them in the database before, in the sessions table, that was likely just because that filter module was storing browser history in user session data for some reason.
I did a little research here, and I think that leaves a few potential causes:
Another site is linking to these files (even though they don’t exist), and this is where Google is picking up/indexing the URLs from. This should be checkable in Google Analytics if you look at Referrals to those files.
These were listed on the sitemap at some point (but not any longer: https://www.icmp.ac.uk/sitemap.xml).
These files existed at some point in the past, but have since been deleted.
There was a DNS misconfiguration at some point, and that domain name was pointing to a different server where these files did exist.
While these are a little annoying to see in Analytics, from what I’ve read, 404s don’t negatively impact the site from an SEO standpoint, and there’s no evidence that the site itself is compromised at all, so unless we see evidence otherwise, I wouldn’t worry about these.
-
Hi,
Pdf trafic from your own site or other sites?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
If website users don't accept GDPR cookie consent, does that prevent GA-GTM from tracking pageviews and any traffic from that user that would cause significant traffic decreases?
I've been doing a lot research on GDPR impact and implementation with GTM-GA for clients, but it's been 12 months since GDPR has gone live I haven't found anything on how GA traffic has been impacted if users don't accept cookie consent. However, I'm personally seeing GA accounts taking huge losses in traffic since implementing GDPR cookie solutions (because GTM/GA tags aren't firing until cookies are accepted). Is it common for websites to see significant decreases in traffic due to too many users not accepting cookie consent? Are there alternative solutions to avoid traffic loss like that and still maintain GDPR compliance? It seems to me that the industry underestimated how many people won't accept cookie consent. Most of the documentation and articles around GDPR's start (May 2018) didn't foresee or cover that aspect properly, everything seems to be technically focused with the assumption that if implemented properly most people would accept cookie consent, but I'm personally not seeing that trend and it's destroying GA data (lost traffic, minimal source attribution, inaccurate behavior data, etc). Thanks.
Reporting & Analytics | | Kickboard2 -
How is this site being tracked in GA?
Hi, Bit of an unusual one this, so please bear with me. This site http://www.hayesandhurst.com/ is being tracked in Google Analytics but we're unsure as to how. To the best of our knowledge we haven't inserted the tracking ID into the theme options or the tracking code into the source - and yet it appears to be tracking successfully! I did send some instructions to the client to set up the Analytics account in their name - I fear they have added the code somewhere to the site, but we cannot see where! Perhaps via Google tag Manager?? As I say, an odd one this, but if anyone can shed any light on this mystery i'd be hugely grateful. The tracking ID is UA-64505394-1 for the record. Regards,
Reporting & Analytics | | nathangdavidson
Nathan0 -
Looking for a GA developer to block internal traffic and integrate with Snipcart JS API - paid
Hi There, We're looking for someone to help us setup: Blocking internal traffic using a custom dimension/query URL as described here: http://www.simoahava.com/analytics/block-internal-traffic-gtm/ Integrate Google Analytics ecommerce tracking with the Snipcart API as described here: https://snipcart.com/blog/integrating-snipcart-with-google-analytics-ecommerce-tracking Discuss Kiss Metrics and possibly integrate with Snipcart We're a small, 2 person startup so are ideally looking for a freelancer or small firm to help - Any recommendations or pointers in the right direction would be much appreciated. Cheers Ben
Reporting & Analytics | | cmscss0 -
Direct Traffic Spike
In February, I transferred an HTML site to a WordPress Platform. Since then, Direct traffic has spiked to nearly 400% since the WordPress transition. The Direct traffic spike took roughly 2 months before it started to kick in. Does anyone know what this could be attributed to?
Reporting & Analytics | | SteveZero120 -
Traffic Discrepancy between google and moz
I checked my traffic data this morning for this past week and there seems to be what I hope is a huge mistake. https://www.diigo.com/item/image/3vpdp/mckp My traffic seems to have dropped from 176 organic visits to 0 visits. There is no indication of this at all in Google analytics. What's going on?
Reporting & Analytics | | EcomLkwd0 -
Google Analytics Site Search to new sub-domain
Hi Mozzers, I'm setting up Google's Site Search on a website. However this isn't for search terms, this will be for people filling in a form and using the POST action to land on a results page. This is similar to what is outlined at http://support.google.com/analytics/bin/answer.py?hl=en&answer=1012264 ('<a class="zippy zippy-collapse">Setting Up Site Search for POST-Based Search Engines').</a> However my approach is different as my results appear on a sub-domain of the top level domain. Eg.. user is on www.domain.com/page.php user fills in form submits user gets taken to results.domain.com/results.php The issue is with the suggested code provided by Google as copied below.. Firstly, I don't use query strings on my results page so I would have to create an artificial page which shouldn't be a problem. But what I don't know is how the tracking will work across a sub-domain without the _gaq.push(['_setDomainName', '.domain.com']); code. Can this be added in? Can I also add Custom Variables? Does anyone have experience of using Site Search across a sub-domain perhaps to track quote form values? Many thanks!
Reporting & Analytics | | panini0 -
What is the impact of a panda refresh on a Pandalized site?
When a panda refresh hits and you have a pandalized site, If the site were to de-pandalized, would you see traffic back to pre-panda levels right away? Or any type of movement right away?
Reporting & Analytics | | jessefriedman0 -
Setting up Google Analytic Goals to a 3rd Party Site
I recently received help on a question I asked on SEOmoz but need additional clarification. I am trying to set up goals in Google Analytics for people who click on a “purchase botton” which sends them to PayPal. I created a Thank You page and tried to get PayPal to redirect to it, however, our customers only get to our site’s 404 page. Here is what I’ve done so far: Went into my PayPal account and turned the “Auto Return” to ‘on’ Under website payment preferences, I added the following URL http://www.teecycle.org/thank-youutm_nooverride1. (I formatted the URL this way because the person who provided me with help recommended using the format ?UTM_nooverride=1. However, our CMS system won’t allow “?” or “=”)
Reporting & Analytics | | EricVallee340