PDF web traffic hitting our site
-
Hi there,
Over the last few months our traffic has spiked because irrelevant PDF documents are sending us crap traffic, and our bounce rate and other metrics have gone sky high as a result. I don't want to just filter this traffic out in GA; I'd rather try to stop our site from being attacked.
Any advice on a way forward would be great.
Thanks
-
Based on this I don't think you have anything to worry about. It doesn't appear to be an attack, as you described in your original post. An actual attack on your website would have much higher volume. The worst this could possibly be is spam, which is mainly just annoying.
Easy solution: you don't want to throw this traffic away in GA for good, because it may be useful at some point. So create a second view in GA and name it "unfiltered". This view has no filters, so you can see all traffic in its raw glory. Name your main view something like "master" or "the one view to view them all" or whatever you want, and set filters on it to remove that traffic from view.
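If it helps, here is a rough sketch of the kind of pattern an exclude filter on the request URI could use. It is written in Python only so the pattern can be tried out before it goes into a filter. The pattern itself and the second and third sample paths are my own assumptions, not something taken from your account.

```python
import re

# Hypothetical pattern for an "exclude" filter on the request URI:
# matches paths ending in .pdf, with an optional query string.
PDF_URI = re.compile(r"\.pdf(\?.*)?$", re.IGNORECASE)


def is_pdf_hit(request_uri: str) -> bool:
    """Return True if the request URI points at a PDF file."""
    return PDF_URI.search(request_uri) is not None


# The first path comes from this thread; the other two are made up.
sample_uris = [
    "/lulu-the-lioness-a-heroines-story.pdf",
    "/courses/music-production",
    "/OLD-BROCHURE.PDF?utm_source=x",
]

# Simulate what the filtered view would keep.
kept = [uri for uri in sample_uris if not is_pdf_hit(uri)]
print(kept)
```

Testing the pattern against a handful of real URLs from the unfiltered view first is a cheap way to make sure it does not also hide legitimate pages.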
Personally, it looks to me more like these are old PDFs that other websites are linking to, which is what your hosting provider has also said. Your best move here is actually to set up redirects to relevant pages, to recapture some of those links that are probably ending in 404s and pass some link equity to important pages.
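To make the redirect idea concrete, here is a minimal sketch of the lookup logic, assuming you keep a hand-maintained map from dead PDF paths to live pages. The map, the target page and the `resolve` helper are all hypothetical. The real rules would live wherever your server or CMS handles redirects.

```python
from typing import Optional, Tuple

# Hypothetical mapping from dead PDF paths to live pages. The target "/"
# is a placeholder; pick the most relevant page for each old file.
REDIRECTS = {
    "/lulu-the-lioness-a-heroines-story.pdf": "/",
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, location) for a requested path."""
    target = REDIRECTS.get(path.lower())
    if target is not None:
        # Permanent redirect, so the inbound link points at a live page.
        return 301, target
    # Anything not in the map keeps returning a 404.
    return 404, None


print(resolve("/lulu-the-lioness-a-heroines-story.pdf"))
print(resolve("/some-other-file.pdf"))
```

Only map a path when a genuinely relevant page exists for it. Leaving the rest as 404s is fine.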
-
Hi Alick, it seems to be coming from an external source. I've included a screen grab for you too.
I've also discussed this with our hosting provider who gave the following response:
Thanks for the info from Webmaster Tools. That screenshot that shows the HTTP response is just showing that a request to http://www.icmp.co.uk/lulu-the-lioness-a-heroines-story.pdf throws a 301 redirect over to https://www.icmp.ac.uk/lulu-the-lioness-a-heroines-story.pdf — this runs because of the standard HTTPS/primary domain redirect code in settings.php and unfortunately doesn’t tell us much here.
I pulled down the database again and ran a search for a few of these filenames, and those came up empty. Looks like these don’t touch Drupal at all. When we saw them in the database before, in the sessions table, that was likely just because that filter module was storing browser history in user session data for some reason.
I did a little research here, and I think that leaves a few potential causes:
Another site is linking to these files (even though they don’t exist), and this is where Google is picking up/indexing the URLs from. This should be checkable in Google Analytics if you look at Referrals to those files.
These were listed on the sitemap at some point (but not any longer: https://www.icmp.ac.uk/sitemap.xml).
These files existed at some point in the past, but have since been deleted.
There was a DNS misconfiguration at some point, and that domain name was pointing to a different server where these files did exist.
While these are a little annoying to see in Analytics, from what I’ve read, 404s don’t negatively impact the site from an SEO standpoint, and there’s no evidence that the site itself is compromised at all, so unless we see evidence otherwise, I wouldn’t worry about these.
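The first cause in that list (another site linking to the files) can also be checked from the server side. Below is a minimal sketch that tallies referrers for PDF requests, assuming the access log is in the common "combined" format. The sample log lines, IP addresses and the example.com referrer are invented for illustration, and `pdf_referrers` is a hypothetical helper.

```python
import re
from collections import Counter

# Pulls the request path, status and referrer out of a combined-format line.
LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referer>[^"]*)"'
)


def pdf_referrers(lines):
    """Count referrers for requests whose path ends in .pdf."""
    counts = Counter()
    for line in lines:
        match = LINE.search(line)
        if not match:
            continue
        # Ignore any query string when checking the extension.
        path = match.group("path").lower().split("?")[0]
        if path.endswith(".pdf"):
            counts[match.group("referer") or "-"] += 1
    return counts


# Invented sample lines; in practice, read these from the access log file.
sample = [
    '203.0.113.5 - - [01/Jan/2020:00:00:01 +0000] "GET /lulu-the-lioness-a-heroines-story.pdf HTTP/1.1" 404 512 "http://example.com/links.html" "Mozilla/5.0"',
    '203.0.113.6 - - [01/Jan/2020:00:00:02 +0000] "GET /lulu-the-lioness-a-heroines-story.pdf HTTP/1.1" 404 512 "http://example.com/links.html" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Jan/2020:00:00:03 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

referrers = pdf_referrers(sample)
print(referrers.most_common())
```

If one or two external domains dominate the tally, that points to inbound links rather than anything happening on the site itself.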
-
Hi,
Is the PDF traffic coming from your own site or from other sites?