Site deindexed after HTTPS migration + possible penalty due to spammy links
-
Hi all, we've recently migrated a site from http to https and saw the majority of pages drop out of the index.
One of the most extreme deindexation problems I've ever seen, but there doesn't appear to be anything obvious on-page which is causing the issue. (Unless I've missed something - please tell me if I have!)
I had initially discounted any off-page issues due to the lack of a manual action in SC, however after looking into their link profile I spotted 100 spammy porn .xyz sites all linking (see example image).
Didn't appear to be any historic disavow files uploaded in the non https SC accounts.
Any on-page suggestions, or just play the waiting game with the new disavow file?
-
Thanks for answering all of my questions!
It's interesting that when I do a simple site:search in Google none of the main pages of your website are appearing. Most of the search results are either archives or comments. Typically, I've seen this kind of thing happen when something goes wrong in the redirects or a site is penalized.
It looks like the big dip in indexation didn't occur until about August. I would think that if you pulled the trigger in June, pages would start dropping out of the index much sooner.
In this case, your theory about a possible penalization might be right. I'd be interested to see what happens once Google considers the disavow file (unfortunately, that will take some time).
Does anyone else have any input or possible reasons why pages on this site have dropped out of the index so quickly?
-
Hi Serge,
Thanks for your input. I've answered your questions below.
- How long ago did you switch to https? - 21st June
- Have you submitted both non-www and www versions of the https site to Google Search Console (GSC)? - Yes
- Have you kept the http versions of your website in GSC? - Yes
- From the looks of it, your sitemap has been updated to reflect the https pages. Have you submitted the updated sitemap to GSC? - Yes - Submitted pages are not matching Indexed pages
- Are there any sitemap errors appearing in GSC? Any other errors? No Sitemap Errors. some 404ing pages.
- Could you attach a screenshot of the indexation rate on both https and http versions of the site from GSC?
- Could you confirm that all redirects were done 1-to-1 and properly redirected? (301s and not 302s) - Confirmed - all tools are reporting 200 status after hitting the 301.
We are still waiting to see some results from submitting the disavow file. So far, no positive movement.
Thanks for your help!
-
Hi there,
There could be a lot of reasons why certain pages of your website are dropping out of your index. Could you answer the following questions to help us narrow down the possible cause?
- How long ago did you switch to https?
- Have you submitted both non-www and www versions of the https site to Google Search Console (GSC)?
- Have you kept the http versions of your website in GSC?
- From the looks of it, your sitemap has been updated to reflect the https pages. Have you submitted the updated sitemap to GSC?
- Are there any sitemap errors appearing in GSC? Any other errors?
- Could you attach a screenshot of the indexation rate on both https and http versions of the site from GSC?
- Could you confirm that all redirects were done 1-to-1 and properly redirected? (301s and not 302s)
Some things that we could rule out:
- It looks like the site isn't using noindex tags in a way that would cause deindexing
- It looks like the robots.txt file isn't disallowing any important paths that would cause deindexation
- The http version of the www and non-www pages redirects to the www, https version of the site which is good
- Canonicals seem to be updated and pointing to the https version of the site
Sorry for all of the questions, I just want to make sure and rule out possible causes to focus in on what the issue could be.
Thanks, Serge
-
Hi!
what information do you seen in search console?
Assuming that you have already tested all of your old URL's and the redirection paths points correctly to the new URLs, does Google Search console indicates any problems with the number of URLs submitted to it?
canoncals? are they in use? pointing to the correct version of the site?
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Https vs Http Link Equity
Hi Guys, So basically have a site which has both HTTPs and HTTP versions of each page. We want to consolidate them due to potential duplicate content issues with the search engines. Most of the HTTP pages naturally have most of the links and more authority then the HTTPs pages since they have been around longer. E.g. the normal http hompage has 50 linking root domains while the https version has 5. So we are a bit concerned of adding a rel canonical tag & telling the search engines that the preferred page is the https page not the http page (where most of the link equity and social signals are). Could there potentially be a ranking loss if we do this, what would be best practice in this case? Thanks, Chris
Intermediate & Advanced SEO | | jayoliverwright0 -
HTTPS Login on HTTP Site | 301 or 302 Redirect?
I've searched the forum on this and online and can't seem to find a definitive answer. Some e-commerce sites that are http use a 302 redirect to the https login while other sites use a 301 redirect. I know 302 is generally not recommended but in this case it may make sense. Can anyone advise on the correct practice?
Intermediate & Advanced SEO | | CallMeNicholi0 -
Unpaid Followed Links & Canonical Links from Syndicated Content
I have a user of our syndicated content linking to our detailed source content. The content is being used across a set of related sites and driving good quality traffic. The issue is how they link and what it looks like. We have tens of thousands of new links showing up from more than a dozen domains, hundreds of sub-domains, but all coming from the same IP. The growth rate is exponential. The implementation was supposed to have canonical tags so Google could properly interpret the owner and not have duplicate syndicated content potentially outranking the source. The canonical are links are missing and the links to us are followed. While the links are not paid for, it looks bad to me. I have asked the vendor to no-follow the links and implement the agreed upon canonical tag. We have no warnings from Google, but I want to head that off and do the right thing. Is this the right approach? What would do and what would you you do while waiting on the site owner to make the fixes to reduce the possibility of penguin/google concerns? Blair
Intermediate & Advanced SEO | | BlairKuhnen0 -
Severe health issues are found on your site. - Check site health (GWT)
Hi, We run a Magento website - When i log in to Google Webmaster Tools, I am getting this message: Severe health issues are found on your site. - <a class="GNHMM2RBFH">Check site health
Intermediate & Advanced SEO | | bjs2010
</a>Is robots.txt blocking important pages? Some important page is blocked by robots.txt. Now, this is the weird part - the page being blocked is the admin page of magento - under
www.domain.com/index.php/admin/etc..... Now, this message just wont go away - its been there for days now - so why does Google think this is an "important page"? It doesnt normally complain if you block other parts of the site ?? Any ideas? THanks0 -
Unnatural Links Removal - are GWMT links enough?
Hi, When working on unnatural links penalty, is removing and disavowing links shown on the GWMT enough or should the list be broaden to include OSE and Majestic etc.? Thanks
Intermediate & Advanced SEO | | BeytzNet0 -
How do I find the links on my site that link to another one of my pages?
I ran IIS Seo toolkit and it found about 40 pages that I have no idea how they exist. What tool can I use to find out what internal link is linking to them so I can fix them or get rid of them?
Intermediate & Advanced SEO | | EcommerceSite0 -
Fading Text Links Look Like Spammy Hidden Links to a g-bot?
Ah, Hello Mozzers, it's been a while since I was here. Wanted to run something by you... I'm looking to incorporate some fading text using Javascript onto a site homepage using the method described here; http://blog.thomascsherman.com/2009/08/text-slideshow-or-any-content-with-fades/ so, my question is; does anyone think that Google might see this text as a possible dark hat SEO anchor text manipulation (similar to hidden links)? The text will contain various links (4 or 5) that will cycle through one another, fading in and out, but to a bot the text may appear initially invisible, like so; style="display: none;"><a href="">Link Here</a> All links will be internal. My gut instinct is that I'm just being stupid here, but I wanted to stay on the side of caution with this one! Thanks for your time 🙂 http://blog.thomascsherman.com/2009/08/text-slideshow-or-any-content-with-fades
Intermediate & Advanced SEO | | PeterAlexLeigh0 -
My site links have gone from a mega site links to several small links under my SERP results in Google. Any ideas why?
A site I have currently had the mega site links on the SERP results. Recently they have updated the mega links to the smaller 4 inline links under my SERP result. Any idea what happened or how do I correct this?
Intermediate & Advanced SEO | | POSSIBLE0