Pages mirrored on unknown websites (not just content, all the HTML)... blackhat I've never seen before.
-
Someone more expert than me could help... I am not a pro, just doing research on a website... Google Search Console shows many backlinks in pages under unknown domains... this pages are mirroring the pages of the linked website... clicking on a link on the mirror page leads to a spam page with link spam... The homepage of these unknown domain appear just fine... looks like that the domain is partially hijacked... WTF?!
Have you ever seen something likes this?
Can it be an outcome of a previous blackhat activity?
-
The links actually are just in some forms
-
Thanks everyone, I'll do more research and update all
-
One more thing that could be a long shot but is worth keeping an eye on: If someone had zero ethics and wanted to steal your links, one thing they could do is reach out to webmasters that link to you and pretend to be you, and claim that the mirror site is the new version of the website and ask them to change their links. Once complete, they'd redirect the mirror site to their own commercial site.
So, keep an eye on your own link profile and theirs, just to make sure none of those links are changing.
-
We see scraped content all the time. Occasionally we see scraped HTML in full form but slightly edited.
In general I'd do the following:
- Disavow those domains in GSC. Also consider disavowing any domains pointing links at the mirror sites. It could be possible that they are building up junk links to a mirror site which canonicals your site. No idea why anyone would do that but worth covering your bases.
- Submit DMCA notices to Google, their webhost, etc.
- Take a look through server logs for anything screwy. If you see tons of weird traffic from countries that aren't relevant to your business, you might decide to block them with your .htaccess file.
- Certainly make sure nothing has been changed on your own site (canonicals, common Wordpress hacks, etc.), though nothing is suggesting to me that that has happened.
Other than that I would just keep an eye on it and keep an eye on Google Search Console.
-
Hi,
From what I read it looks similar to these cases - https://moz.com/community/q/getting-different-search-queries-in-google-webmaster & http://moz.com/community/q/chinese-site-ranking-for-our-brand-name-possible-hack
In these cases - sites where hacked using a vulnerability in a Wordpress slider - they copied sites they wanted to target on to some of the hacked domains and used other hacked domains to point links to them. At that point they were trying to redirect traffic from genuine e-commerce sites to bogus ones.
You could try to contact the site owners & tell them they have been hacked. At the same time file a complaint @Google
rgds,
Dirk
-
Some basic scraping maybe common, but they have gone out of there way to backlink to you...
Have you checked there metrics on open site explore...? Checked whois?
-
As far as I know website scraping is not uncommon, apart from the backlinks
-
I am honest I am still a little puzzled. But if what you are describing is what is happening. Something very strange is going on. I have not heard of it before.
A totally different domain, that you did not build is linking to your site with the exact copy of a page that it is linked to. If that is the case, your site is being specifically targeted. Someone has to have gone out of there way to copy your site and then re-build it on a different domain.
This is what I would do:-
I would contacting google asap.
Can you find who owns the site on whois?
Check the wayback machine to find out what the site was previously
User open site explorer to track the nature of the links on the other domain. .
And wow... that is unbelievable if it is correct...
-
Yes, sorry but I thought that was clear from the beginning.
If I click to the console links I am directed to the mirrored page in these domains. That's exactly how I saw them, otherwise I coudn't discover them
-
That console part provides clickable links. So that goes back to my original question. Can you click through to the domain and see a mimic copy of your site? Or is it simply showing you a link that clicks to nothing?
-
In the console there is a section called (translating into English) "Links pointing to your site"
Traffic has not changed, in fact these do not appear in GA referrals.
I can see all these domains, either the mirrorer paged, or the actual genuine pages pertaining to them originally. You cannot reach the mirrored pages via the menu in hp
-
Where in google search console are they reported?
Also to be clear, nothing has happened to your site. ie traffic has not changed. However you have found "strange" links in search console? Which do not click through to anything?
Or as I am still not clear - have you found another domain, that mimics your site that you can actually see?
-
Exactly, I am confused too.
Indeed, Goole Search Console reports do list them, but I can't see them in the html
Instance, I find this
<form <span="" class="html-tag" data-mce-mark="1">id="search_mini_form" action="http://mydomain.ext/it/catalogsearch/result/" method="get">
and also images hosted in the scraped websites's domain
Also, these pages apparently are under these domains that in turn appear to be real websites equally unaware of the scam
</form>
-
I am a little confused.
Are you saying that an exact copy of a unique page you have created on your website is on another domain name (which you know nothing about) and then that site is linked to your site, ie via a backlink?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Are online tools considered thin content?
My website has a number of simple converters. For example, this one converts spaces to commas
White Hat / Black Hat SEO | | ConvertTown
https://convert.town/replace-spaces-with-commas Now, obviously there are loads of different variations I could create of this:
Replace spaces with semicolons
Replace semicolons with tabs
Replace fullstops with commas Similarly with files:
JSON to XML
XML to PDF
JPG to PNG
JPG to TIF
JPG to PDF
(and thousands more) If somoene types one of those into Google, they will be happy because they can immediately use the tool they were hunting for. It is obvious what these pages do so I do not want to clutter the page up with unnecessary content. However, would these be considered doorway pages or thin content or would it be acceptable (from an SEO perspective) to generate 1000s of pages based on all the permutations?1 -
Duplicate Content Product Descriptions - Technical List Supplier Gave Us
Hello, Our supplier gives us a small paragraph and a list of technical features for our product descriptions. My concern is duplicate content. Here's what my current plan is: 1. To write as much unique content (rewriting the paragraph and adding to it) as there is words in the technical description list. Half unique content half duplicate content. 2. To reword the technical descriptions (though this is not always possible) 3. To have a custom H1, Title tag and meta description My question is, is the list of technical specifications going to create a duplicate content issue, i.e. how much unique content has to be on the page for the list that is the same across the internet does not hurt us? Or do we need to rewrite every technical list? Thanks.
White Hat / Black Hat SEO | | BobGW0 -
Competitor ranking well with duplicate content—what are my options?
A competitor is ranking #1 and #3 for a search term (see attached) by publishing two separate sites with the same content. They've modified the title of the page, and serve it in a different design, but are using their branded domain and a keyword-rich domain to gain multiple rankings. This has been going on for years, and I've always told myself that Google would eventually catch it with an algorithm update, but that doesn't seem to be happening. Does anyone know of other options? It doesn't seem like this falls under any of the categories that Google lists on their web spam report page—is there any other way to get bring this up with the powers that be, or is it something that I just have to live with and hope that Google figures out some day? Any advice would help. Thanks! how_to_become_a_home_inspector_-_Google_Search_2015-01-15_18-45-06.jpg
White Hat / Black Hat SEO | | inxilpro0 -
Is Inter-linking websites together good or bad for SEO?
I know of a website that inter-links a handful of websites together (ex- coloring.ws interlinks to a handful of other sites, including dltk-kids.com, and others). Is this negative for SEO? I was thinking about creating a few related sites and inter-linking all of them together, since they will all be relevant to each other. Any thoughts would be great!
White Hat / Black Hat SEO | | WebServiceConsulting.com0 -
If our site hasn't been hit with the Phantom Update, are we clear?
Our SEO provider created a bunch of "unique url" websites that have direct match domain names. The content is pretty much the same for over 130 websites (city name is different) that link directly to our main site. For me this was a huge red flag, but when I questioned them and they said it was fine. We haven't seen a drop in traffic, but concerned that Google just hasn't gotten to us. DA for each of these sites are 1 after several months. Should we be worried? I think yes, but I am an SEO newbie.
White Hat / Black Hat SEO | | Buddys0 -
Website Vulnerability Leading to Doorway Page Spam. Need Help.
Keywords he is ranking for , houston dwi lawyer, houston dwi attorney and etc.. Client was acquired in June and since then we have done nothing but build high quality links to the website. None of our clients were dropped/dinged or impacted by the panda/penguin updates in 2012 or updates previously published via Google. Which proves we do quality SEO work. We went ahead and started duplicating links which worked for other legal clients and 5 months later this client is either dropping or staying in local maps results and we are performing very badly in organic results. Some more history..... When he first engaged our company we switched his website from a CMS called plone to word press. During our move I ran some searches to figure out which pages we needed to 301 and we came across many profile pages or member pages created on the clients CMS (PLONE). These pages were very spammy and linked to other plone sites using car model,make,year type keywords (ex:jeep cherokee dealerships). I went through these sites to see if they were linking back and could not find any back links to my clients website. Obviously nobody authorized these pages, they all looked very hackish and it seemed as though there was a vulnerability on his plone CMS installation which nobody caught. Fast forward 5 months and the newest OSE update is showing me a good 50+ back links with unrelated anchor text back links. These anchor text links are the same color as the background and can only be found if you hover your mouse over certain areas of the site. All of these sites are built on Plone and allot of them are linked to other businesses or community websites. These websites obviously have no clue they have been hacked or are being used for black hat purposes. There are dozens of unrelated anchor text links being used on external websites which are pointing back to our clients website. Examples: <a class="clickable title link-pivot" title="See top linking pages that use this anchor text">autex Isuzu, </a><a class="clickable title link-pivot" title="See top linking pages that use this anchor text">Toyota service department ratings, </a><a class="clickable title link-pivot" style="color: #5e5e5e; font-family: Helvetica, Arial, sans-serif; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; line-height: normal;" title="See top linking pages that use this anchor text">die cast BMW and etc..</a> Obviously the first step is to use the disavow link tool, which will be completed this week. The second step is to take some feedback from the SEO community. It seems like these pages are automatically created using some type of bot. It will be very tedious if we have to continually remove these links. I hope there is a way to notify Google that these websites are all plone and have a vulnerability, which black hats are using to harm the innocent... If i cannot get Google to handle this, then the only other option is to start fresh with a new domain name. What would you do in this situation. Your help is greatly appreciated. Thank you
White Hat / Black Hat SEO | | waqid0 -
Domain Structure For A Network of Websites
To achieve this we need to set up a new architecture of domains and sub-websites to effectively build this network. We want to make sure we follow the right protocols for setting up the domain structures to achieve good SEO for the primary domain and local websites. Today we have our core website at www.doctorsvisioncenter.com which will ultimately will become dvceyecarenetwork.com. That website will serve as the core web presence that can be custom branded for hundreds. For example, today you can go to www.doctorsvisioncenter.com/pinehurst. Note when you start there, you can click around and it is still branded for Pinehurst or spectrum eye care. So the burning question(s). - if I am an independent doc at www.newyorkeye.com, I could do domain forwarding but Google does not index forwarded domains so that is out. I could do a 301 permanent redirect to my page www.doctorsvisioncenter.com/newyorkeye. I could then put a rule in the HT Access file that says if newyorkeye.com redirect to www.doctorsvisioncenter/newyorkeye and then have the domain show up as www.newyorkeye.com. Another way to do that is we point the newyorkeye DNS to doctorsvisioncenter.com rather than a 301 redirect with the same basic rule in the HT Access file. That means that, theoretically, every sub page would show up, for example, as www.newyorkeye.com/contact-lens-center which is actually www.doctorsvisioncenter.com/contact-lens-center. It also means, theoretically, that it will be seen as an individual domain but pointing to all the same content under that individual domain just like potentially hundreds of others. The goal is we build once, manage once and benefit many. If we do something like the above which will mean that each domain will essentially be a separate domain, but, will google see it that way or as duplicative content? While it is easy to answer "yes" it would be duplicative, it is not necessarily the case if the content is on separate domains. Is this a good way to proceed, or does anyone have another recommendation for us?
White Hat / Black Hat SEO | | JessTopps0 -
Who's still being outranked by spam?
Over the past few months, through Google Alerts, I've been watching one of our competitors kick out crap press releases, and links to their site have been popping up all over blog networks with exact match anchor text. They now outrank us for that anchor text. Why is this still happening? Three Penguin updates later and this still happens. I'm trying so hard to do #RCS and acquire links that will ensure our site's long-term health in the SERPs. Is anyone else still struggling with this crap?
White Hat / Black Hat SEO | | UnderRugSwept2