PDF web traffic hitting our site
-
Hi there,
Over the last few months our traffic has spiked because irrelevant PDF documents are sending us crap traffic. Our bounce rate is sky high, and other metrics have suffered as well. I don't want to just filter this traffic out in GA; I'd rather try to stop our site from being attacked.
Any advice on a way forward would be great.
Thanks
-
Based on this I don't think you have anything to worry about. It doesn't appear to be an attack, as you described in your original post. An actual attack on your website would have much higher volume. The worst this could possibly be is spam, which is mainly just annoying.
Easy solution: you don't want to permanently discard this traffic in GA, because it may be useful at some point. So create another view in GA and name it "unfiltered". This view will have no filters, so you can see all traffic in its raw glory. Name your main view something like "master" or "the one view to view them all" or whatever you want, and set filters on it to remove that traffic from view.
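If it helps, here is a rough way to sanity-check an exclude pattern before putting it into a view filter. This is only a sketch: the regex is my assumption about what "that traffic" looks like (request URIs ending in .pdf), and the second and third sample URIs are made up for illustration.

```python
import re

# Assumed pattern: any request URI ending in .pdf, optionally followed by
# a query string. Adjust it to whatever your unwanted hits really look like.
PDF_REQUEST = re.compile(r"\.pdf(\?.*)?$", re.IGNORECASE)


def is_pdf_request(request_uri: str) -> bool:
    """Return True if the request URI matches the exclude pattern."""
    return PDF_REQUEST.search(request_uri) is not None


# The first URI is the one quoted in this thread; the others are hypothetical.
sample_uris = [
    "/lulu-the-lioness-a-heroines-story.pdf",
    "/courses/music-production",
    "/some-file.PDF?utm_source=x",
]

# What the filtered view would keep if the pattern were used as an exclude rule.
kept = [uri for uri in sample_uris if not is_pdf_request(uri)]
print(kept)
```

Running a handful of real URIs from your reports through something like this first makes it less likely that the filter also hides legitimate pages.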
Personally, it looks more to me like these are old PDFs that other websites are linking to, which is what your hosting provider has also said. Your best move here is actually to set up redirects to relevant pages, to recapture some of those links that are probably ending in 404s and pass some link equity to important pages.
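To make the redirect idea concrete, here is a minimal sketch of the lookup logic. It is not tied to any CMS or server: the mapping, the function name, and the "/" target are all placeholders I made up, and in practice you would express the same map in your server or CMS redirect settings.

```python
from typing import Dict, Optional

# Hypothetical map from dead PDF paths to the most relevant live page.
# The "/" target is a placeholder; choose a page that matches the old content.
REDIRECTS: Dict[str, str] = {
    "/lulu-the-lioness-a-heroines-story.pdf": "/",
}


def redirect_target(path: str, redirects: Dict[str, str] = REDIRECTS) -> Optional[str]:
    """Return the page an old PDF path should 301 to, or None if unmapped."""
    # Ignore any query string and compare case-insensitively.
    normalized = path.split("?", 1)[0].lower()
    return redirects.get(normalized)


print(redirect_target("/lulu-the-lioness-a-heroines-story.pdf?ref=old-link"))
print(redirect_target("/never-existed.pdf"))
```

Anything that comes back as None has no sensible target and can be left to return a 404.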
-
Hi Alick, it seems to be coming from an external source. I've included a screen grab for you too.
I've also discussed this with our hosting provider who gave the following response:
Thanks for the info from Webmaster Tools. That screenshot that shows the HTTP response is just showing that a request to http://www.icmp.co.uk/lulu-the-lioness-a-heroines-story.pdf throws a 301 redirect over to https://www.icmp.ac.uk/lulu-the-lioness-a-heroines-story.pdf — this runs because of the standard HTTPS/primary domain redirect code in settings.php and unfortunately doesn’t tell us much here.
I pulled down the database again and ran a search for a few of these filenames, and those came up empty. Looks like these don’t touch Drupal at all. When we saw them in the database before, in the sessions table, that was likely just because that filter module was storing browser history in user session data for some reason.
I did a little research here, and I think that leaves a few potential causes:
Another site is linking to these files (even though they don’t exist), and this is where Google is picking up/indexing the URLs from. This should be checkable in Google Analytics if you look at Referrals to those files.
These were listed on the sitemap at some point (but not any longer: https://www.icmp.ac.uk/sitemap.xml).
These files existed at some point in the past, but have since been deleted.
There was a DNS misconfiguration at some point, and that domain name was pointing to a different server where these files did exist.
While these are a little annoying to see in Analytics, from what I’ve read, 404s don’t negatively impact the site from an SEO standpoint, and there’s no evidence that the site itself is compromised at all, so unless we see evidence otherwise, I wouldn’t worry about these.
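As a complement to checking Referrals in Google Analytics, the first cause above could also be checked from the web server's access log. The sketch below assumes the log is in Apache/Nginx "combined" format, which may not be how this server is configured; the log lines, IP addresses, and example.com referrer are made-up sample data, not real traffic.

```python
import re
from collections import Counter

# Pulls the request path, status code, and referrer out of a
# combined-format access log line (assumed format).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)"'
)


def pdf_referrers(lines):
    """Count referrers for requests whose path ends in .pdf."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        path = match.group("path").split("?", 1)[0].lower()
        if path.endswith(".pdf"):
            # An empty referrer is recorded as "-", the log's own convention.
            counts[match.group("referrer") or "-"] += 1
    return counts


# Made-up sample lines for illustration only.
sample = [
    '203.0.113.5 - - [01/Jan/2020:00:00:01 +0000] "GET /lulu-the-lioness-a-heroines-story.pdf HTTP/1.1" 404 512 "http://example.com/links.html" "Mozilla/5.0"',
    '203.0.113.6 - - [01/Jan/2020:00:00:02 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Jan/2020:00:00:03 +0000] "GET /lulu-the-lioness-a-heroines-story.pdf HTTP/1.1" 404 512 "-" "Mozilla/5.0"',
]

counts = pdf_referrers(sample)
for referrer, hits in counts.most_common():
    print(hits, referrer)
```

If most of the PDF requests share a few external referrers, that points to other sites linking to the files; a referrer of "-" across the board would be more consistent with crawlers or bots.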
-
Hi,
PDF traffic from your own site or other sites?