Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Fake Links indexing in google
-
Hello everyone,
I have an interesting situation occurring here, and I'm hoping someone here has seen something of this nature or can offer some advice.
So, we recently installed WordPress on a subdomain for our business and have been blogging through it. We added the Google Webmaster Tools meta tag, and I've noticed an increase in 404 links. I brought this up to our server admin, and he verified that there were a lot of IPs pinging our server looking for these links that don't exist. We've combed through our server files and nothing seems to be compromised. Today, we noticed that when you search site:ourdomain.com in Google, the subdomain with WordPress shows hundreds of these fake links that, when you visit them, return a 404 page.
Just curious if anyone has seen anything like this: what might it be, how can we stop it, and could it negatively impact us in any way? Should we even worry about it? Here's the link to the Google results.
https://www.google.com/search?q=site%3Amshowells.com&oq=site%3A&aqs=chrome.0.69i59j69i57j69i58.1905j0j1&sourceid=chrome&es_sm=91&ie=UTF-8 (odd links show up on pages 2-3+)
-
Thank you everyone for your responses! The link to the cached pages that you sent, LynnP, was also helpful. As soon as my co-worker who administers the server gets in, I'm going to suggest that we check the subfolders for anything fishy. I know for a fact he looked for suspicious subfolders, but I'm not sure he thought to check the existing folders for sneaky things. Most passwords have been changed... but I will double check.
Again, thanks everyone for your help, very useful!
-
My 2 cents: This does look like a wp hack - been having a nightmare with a recent Pharma hack like JV mentions and honestly I still cannot figure out how exactly they got into the site but suspect through an outdated plugin.
A couple of things to keep in mind: check your .htaccess file for weird lines, and look for non-standard WP files in various folders (things like cache.php or ms-writer.php, if I recall right). These files were not showing recent change dates, however, so it was not as simple as just FTPing in and seeing which files had been recently changed (still no idea how they pulled that off). It may also be that all these pages are being spun out of a handful of PHP files (or the database!), so you would not necessarily see the subfolders (although in some cases you might). I've also seen dev versions of WP on the same server that had not been kept up to date used to get into the main production version (pretty sure they were indexed through links sent via Gmail emails, thanks Google!).
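To make that check less manual, here is a rough sketch in Python that walks a site folder and lists files worth a closer look. Everything in it is an assumption for illustration: the function name `scan`, the filename list (taken from the two names mentioned above), and the content patterns, which are common obfuscation idioms and not a real malware signature list. A hit only means "open this file and read it"; WordPress core itself ships a legitimate cache.php in wp-includes, for instance.

```python
import re
from pathlib import Path

# Hypothetical indicators for illustration only, not a signature database.
SUSPECT_NAMES = {"cache.php", "ms-writer.php"}
PHP_PATTERNS = [
    re.compile(rb"eval\s*\(\s*base64_decode", re.I),
    re.compile(rb"gzinflate\s*\(\s*base64_decode", re.I),
]
HTACCESS_PATTERNS = [
    # Rules that behave differently for visitors arriving from a search engine.
    re.compile(rb"RewriteCond\s+%\{HTTP_REFERER\}.*(google|bing|yahoo)", re.I),
    re.compile(rb"auto_prepend_file", re.I),
]


def scan(root):
    """Return sorted (path, reason) pairs for files that deserve a manual look."""
    findings = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        name = path.name.lower()
        if name in SUSPECT_NAMES:
            findings.append((str(path), "suspicious filename"))
        if name == ".htaccess":
            patterns = HTACCESS_PATTERNS
        elif name.endswith(".php"):
            patterns = PHP_PATTERNS
        else:
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file; skip it and keep going
        for pattern in patterns:
            if pattern.search(data):
                findings.append((str(path), "matches " + pattern.pattern.decode()))
    return sorted(findings)
```

Calling `scan("/path/to/your/wordpress")` (the path is a placeholder) returns a list you can print and review. It deliberately ignores modification dates, since those were not reliable in my case.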
You can check the Google cache for any of these pages to see what they looked like and when they were last cached, for example: http://webcache.googleusercontent.com/search?q=cache:Y0U-2Yyk3y4J:news.mshowells.com/CI/Ugg-Hazelwood-1437.shtml+
Most of them show late August cache dates, so that should help narrow the timeframe. Interesting to note that all pages have a bunch of links at the bottom, some to your site, some to other (probably infected) sites. All of the links are now 404s, so maybe the hack got taken down by the originator (no idea why, just a thought, since it's a bit odd that all of the links on the external sites also seem to be 404ing now). Needless to say, change all WP admin, FTP etc. passwords to be safe!
-
Hmm...never seen this exactly before - but a few years back we discovered for a client that their reality tv series show (Deadliest Catch) member site had been severely infected by Canadian Pharma phony sites....
Seems the hacker had 'broken' in via a MS update that was not done on their hosting platform site - and it took the tv company almost 4 months to disavow, rebuild and then index and begin to rank again as I remember....i.e. this was NOT a WP issue but a hosting server hack...
But with 20+ pages of Uggs and Nude Men rolling Christians (love that one, eh!) infections, you need to get that totally fixed asap so I'd start with querying the hosting vendor logs...
One idea comes to mind: if you cannot determine where the hack came from, you could kill the subdomain after saving all your articles, recreate it as, say, "info.mshowells.com" or "advice.mshowells.com" or "counsel.mshowells.com", and reload the same articles. I have had to do that too for another client.
-
Yeah, there are only 2 of us: me and the server admin. We're talking right now. The site is on a brand new VPS that has never been compromised, with no strange folder structure and a brand new install of WordPress. You can see lots of errors in the error log on the server, but the files NEVER existed, and neither of us removed any files. I, personally, do not even have access to the VPS. Only he does, and he is well aware of what he's doing; he most definitely would have noticed an odd set of folders and would have remembered deleting them. The 404 crawl errors showed up in Google, and on the server, almost as soon as we made the WordPress install live. We both have seen many instances of WordPress sites being compromised and know what to look for and how to clean it up. This is why this is baffling: we're not exactly sure how or in what way they would benefit from this. My server admin thinks these hackers are somehow tricking Google... neither of us has ever seen this, and we're not sure what to expect... very bizarre!
-
That's pretty strange. There isn't another web person there who might have cleaned things up without telling you? Or maybe your server company?
I don't see how these URLs could be indexed if they never existed, so at some point, someone created those pages and they were around long enough to get indexed. Are there any weird spikes in crawl rates or search queries since the launch of the subdomain?
I've seen this kind of hack before. The hacker just drops some folders full of HTML files into the web root. That's why all those links have a two-character subdirectory. That was the folder the HTML files were in before someone likely just saw those folders in the root and deleted them. Maybe they didn't realize what they were looking at and thought they were just doing some housecleaning?
Doing a "site:mshowells.com/ci/" or "site:mshowells.com/sp/" can show you what I'm talking about.
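If you would rather check this from the server side, the same grouping can be pulled out of the access log. Below is a minimal sketch in Python, assuming the log is in the usual Apache/Nginx common or combined format; the function name `count_404_prefixes` and the regular expression are mine, not anything standard. It counts 404 responses by the first folder in the requested path, so a pile of hits under short folders like "ci" or "sp" would stand out.

```python
import re
from collections import Counter

# Pulls the request path and status code out of a common/combined-format log line.
# Assumption: your server logs in that format; adjust the pattern if it does not.
LINE_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) [^"]*" (\d{3}) ')


def count_404_prefixes(lines):
    """Count 404 responses by the first folder of the requested path."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if not match or match.group(2) != "404":
            continue
        path = match.group(1).split("?", 1)[0]  # drop any query string
        segments = [s for s in path.split("/") if s]
        # Only count requests for something inside a folder, e.g. /CI/page.shtml.
        if len(segments) >= 2:
            counts[segments[0].lower()] += 1
    return counts
```

With the log open as a file, `count_404_prefixes(log_file).most_common(20)` gives the busiest folders first. Folder names are lowercased so that /CI/ and /ci/ are counted together.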
-
Well, the interesting thing is the links are only showing up on the subdomain news.mshowells.com - which has only existed on the server for maybe 2 - 3 months? Also, when we first noticed them, we checked the server and wordpress and there were no files and nothing was out of order or anything fishy. Everything was and is just fine. We haven't done any cleanup of any sort. And Wordpress & plugins have been kept up to date.
That's why it's weird because at no point were there hacked files or content or anything... so it's a little confusing...
-
Looks like a hack. A hacker somehow got in at some point, dropped a bunch of Ugg Boot affiliate marketing pages and left. Not sure why they are 404ing unless someone already discovered these when they happened and cleaned them up. That could've happened months and months ago.
The 404s shouldn't affect your SEO, but the hack has the potential to if it hasn't been cleaned up properly. Do you see a spike in search queries if you look back over the last year or two? That may indicate when the hack occurred and was cleaned up. It's important to know how the hack was cleaned up, so you can ensure that the vulnerabilities have been resolved. If they haven't been, your site is still open to additional attacks, and spam like that can hurt your SEO.
For Wordpress, it's important to keep not only Wordpress itself up to date, but also your plugins (and only use well established plugins, and do a little research on them to make sure people aren't screaming about hacking issues). Hackers search for vulnerabilities in all sorts of places.