PDF web traffic hitting our site
-
Hi there,
Over the last few months our traffic has spiked because irrelevant PDF documents are sending us junk traffic. Our bounce rate is sky high, and other metrics have suffered as well. I don't want to just filter this traffic out in GA; I'd rather try to stop our site from being attacked.
Any advice on a way forward would be great.
Thanks
-
Based on this I don't think you have anything to worry about. It doesn't appear to be an attack, as you described in your original post. An actual attack on your website would have much higher volume. The worst this could possibly be is spam, which is mainly just annoying.
Easy solution: you don't want to discard this traffic from GA entirely, because it may be useful at some point. So create a second view in GA and name it "unfiltered". This view has no filters, so you can see all traffic in its raw glory. Name your main view something like "master" or "the one view to view them all" or whatever you want, and set filters on it to remove that traffic.
Personally, it looks to me like these are old PDFs that other websites are linking to, which is what your hosting provider has also said. Your best move here is to set up redirects to relevant pages, so you recapture some of those links that are probably ending in 404s and pass some link equity to important pages.
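As a rough illustration of the redirect idea above, here is a minimal Python sketch that turns a list of dead PDF URLs into 301 redirect rules. It assumes the site sits behind Apache with mod_alias (the thread doesn't say which web server is in use), and the function name, the `target_for` mapping and the fallback page are all hypothetical. Treat it as a starting point, not a finished setup.

```python
from urllib.parse import urlsplit


def build_redirect_rules(pdf_urls, target_for, fallback="/"):
    """Return one Apache-style 'Redirect 301' line per unique PDF path.

    pdf_urls   -- full URLs of the dead PDFs (e.g. exported from analytics)
    target_for -- hypothetical mapping of PDF path -> relevant page path
    fallback   -- page to use when no specific target is known (assumption)
    """
    rules = []
    seen = set()
    for url in pdf_urls:
        path = urlsplit(url).path
        # Skip anything that is not a PDF, and skip repeated paths.
        if not path.lower().endswith(".pdf") or path in seen:
            continue
        seen.add(path)
        target = target_for.get(path, fallback)
        # Note: paths containing spaces would need quoting in real config.
        rules.append(f"Redirect 301 {path} {target}")
    return rules


if __name__ == "__main__":
    urls = ["http://www.icmp.co.uk/lulu-the-lioness-a-heroines-story.pdf"]
    for rule in build_redirect_rules(urls, target_for={}):
        print(rule)
```

Pointing each PDF at the most relevant page you have is better than sending everything to the homepage; the fallback is only there for paths you haven't mapped yet.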
-
Hi Alick, it seems to be coming from an external source. I've included a screen grab for you too.
I've also discussed this with our hosting provider who gave the following response:
Thanks for the info from Webmaster Tools. That screenshot that shows the HTTP response is just showing that a request to http://www.icmp.co.uk/lulu-the-lioness-a-heroines-story.pdf throws a 301 redirect over to https://www.icmp.ac.uk/lulu-the-lioness-a-heroines-story.pdf — this runs because of the standard HTTPS/primary domain redirect code in settings.php and unfortunately doesn’t tell us much here.
I pulled down the database again and ran a search for a few of these filenames, and those came up empty. Looks like these don’t touch Drupal at all. When we saw them in the database before, in the sessions table, that was likely just because that filter module was storing browser history in user session data for some reason.
I did a little research here, and I think that leaves a few potential causes:
Another site is linking to these files (even though they don’t exist), and this is where Google is picking up/indexing the URLs from. This should be checkable in Google Analytics if you look at Referrals to those files.
These were listed on the sitemap at some point (but not any longer: https://www.icmp.ac.uk/sitemap.xml).
These files existed at some point in the past, but have since been deleted.
There was a DNS misconfiguration at some point, and that domain name was pointing to a different server where these files did exist.
While these are a little annoying to see in Analytics, from what I’ve read, 404s don’t negatively impact the site from an SEO standpoint, and there’s no evidence that the site itself is compromised at all, so unless we see evidence otherwise, I wouldn’t worry about these.
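If server access logs are available, the "another site is linking to these files" theory can also be checked outside of Analytics. The sketch below counts referrers for requests to `.pdf` paths. It assumes the logs are in the common "combined" log format, which the thread doesn't confirm, and the function name is made up for illustration.

```python
import re
from collections import Counter

# Matches the request, status, size and referrer fields of a
# combined-format access log line (an assumption about the log layout).
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def pdf_referrers(log_lines):
    """Count referrers for requests whose path ends in .pdf."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match:
            continue  # line is not in the expected format
        # Ignore any query string when checking the file extension.
        path = match.group("path").split("?", 1)[0]
        if path.lower().endswith(".pdf"):
            counts[match.group("referrer") or "-"] += 1
    return counts
```

A referrer of `-` means the request arrived with no referrer at all, which is typical of crawlers and direct hits rather than clicks on links from other sites.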
-
Hi,
Is the PDF traffic coming from your own site or from other sites?