Spam pages being redirected to 404s but still indexed
-
Client had a website that was hacked about a year ago. Hackers went in and added a bunch of spam landing pages for various products. This was before the site had installed an SSL certificate.
After the hack, the site was purged of the hacked pages and an SSL certificate was implemented. Part of that process involved setting up a rewrite that redirects http pages to the https versions.
The trouble is that the spam pages are still being indexed by Google, even months later. If I do a site: search, I still see all of those spam pages come up before most of the key "real" landing pages. The thing is, the listings on the SERP point to the http versions, so they redirect to the https version before serving a 404.
Is there any way I can fix this without removing the rewrite rule?
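Before changing anything, it can help to confirm exactly what a crawler receives for one of the listed http URLs. Below is a minimal Python sketch, not a definitive tool: the function names `head` and `trace` and the example.com URL are illustrative assumptions, and it sends HEAD requests, which some servers answer differently from GET.

```python
import http.client
from urllib.parse import urlsplit, urljoin

REDIRECT_CODES = (301, 302, 303, 307, 308)

def head(url):
    """Send one HEAD request without following redirects.

    Returns (status code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()

def trace(url, fetch=head, max_hops=5):
    """Follow redirects hop by hop and return [(url, status), ...]."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

Calling `trace("http://example.com/some-spam-page")` (a hypothetical URL) on the setup described above should show one redirect hop to the https URL followed by the final 404.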
-
In addition to the above, you can request removal from Google's index in Search Console:
https://support.google.com/webmasters/answer/1663419?hl=en
As noted, the removal is temporary (90 days), but if you've removed the pages and any links to them, then they won't reappear.
What I would do is check that your sitemap is up to date and that there aren't any legacy sitemaps still referencing the pages. Also run a crawl of your site to ensure that no links to these pages remain.
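The sitemap check can be scripted. This is a rough Python sketch under some assumptions: the sitemap uses the standard sitemaps.org namespace, you already have a list of the spam paths, and the function names `sitemap_urls` and `leftover_spam` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Namespace used by standard sitemap files (sitemaps.org protocol).
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text):
    """Return every <loc> value found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]

def leftover_spam(xml_text, spam_paths):
    """Return sitemap URLs whose path matches one of the known spam paths."""
    spam = set(spam_paths)
    return [u for u in sitemap_urls(xml_text) if urlsplit(u).path in spam]
```

Run it against each sitemap file you find on the server, including old ones that are no longer submitted in Search Console. Any URL it returns is still being advertised to Google.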
-
You could also 301 redirect those pages directly to the 404 page. Or you could block those pages in robots.txt if you don't need them anymore.
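One thing to weigh with the robots.txt option: a Disallow rule stops crawling, so Googlebot may never fetch the URL again and see the 404. If you do go that route, here is a small Python sketch for checking which URLs a given robots.txt would block. It uses the standard library's `urllib.robotparser`, whose matching may not be identical to Google's own parser. The function name `blocked_urls` and the sample rules are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for `agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]
```

For example, with the rules `User-agent: *` and `Disallow: /spam/`, a URL such as `https://example.com/spam/a` would be reported as blocked while `https://example.com/about` would not.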
-
I'd recommend putting all of the URLs to deindex into a sitemap, setting the lastmod date to something recent, and submitting it so Google recrawls them.
If possible, set the status codes on those pages to 410 as well.
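The sitemap part of this can be generated with a few lines of Python. This is only a sketch: the function name `build_sitemap` and the example URLs are assumptions, and whether to list the http or https versions of the spam URLs is left to you.

```python
import xml.etree.ElementTree as ET
from datetime import date

def build_sitemap(urls, lastmod):
    """Build a sitemap string listing `urls`, each with the same lastmod date."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for address in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = address
        # lastmod uses the W3C date format, e.g. 2024-01-15.
        ET.SubElement(entry, "lastmod").text = lastmod.isoformat()
    return ET.tostring(urlset, encoding="unicode")
```

Something like `build_sitemap(spam_urls, date.today())` returns the XML as a string, which you can save to a file and submit in Search Console. Remove that sitemap once the pages have dropped out of the index.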
Related Questions
-
404s on subfolder - how to redirect?
Hi all, we have a lot of 404s to subfolders. E.g.
www.website.com/blog-post-title/imagename/
www.website.com/blog-post-title/author/
We don't have these subfolders or blog posts anymore. How do I redirect them? These links (404s) don't seem to have any value or backlinks. Thanks, Stef
Technical SEO | MFSMarketing
-
Home Page Being Indexed / Referral URLs /
I have a few questions related to home page URLs being indexed, canonicalization, and GA reporting...
1. I can view the home page by typing in domain.com, domain.com/ and domain.com/index.htm. There are no redirects and it's canonicalized to point to domain.com/index.htm. How important is it to have redirects? I don't want unnecessary redirects or canonical tags, but I noticed the trailing slash can sometimes be typed in manually on other pages, sometimes not.
2. When I do a site search (site:domain.com), sometimes the HP shows up as "domain.com/", never "domain.com/index.htm" or "domain.com", and sometimes the HP doesn't show up, period. This seems to change several times a day, sometimes within 15 minutes. I have no idea what is causing it and I don't know if it has anything to do with #1. In a perfect world, I would ask for the /index.htm to be dropped and redirected to .com/, and the canonical to point to .com/
3. I've noticed in GA I see /, /index.htm, and a weird Google referral URL (/index.htm?referrer=https://www.google.com/) all showing up as top pages. I think the / and /index.htm is because I haven't set up a default URL in GA, but I'm not sure what would cause the referrer. I tracked back when the referrer URL started to show up in the top pages, and it was right around the time they moved over to https://, so I'm not sure what the best option is to remove that.
I know this is a lot - I appreciate any insight anyone can provide.
Technical SEO | DigMS
-
All of my pages are indexed except for 1. How could that be?
Yesterday we were ranking #4 for our main keyword and today we're not even indexed. It's not a robots.txt issue; we've just added a rel canonical to the page and submitted our sitemap again. What else could we do?
Technical SEO | paulb.credible
-
Best way to handle pages with iframes that I don't want indexed? Noindex in the header?
I am doing a bit of SEO work for a friend, and the situation is the following: The site is a place to discuss articles on the web. When clicking on a link that has been posted, it sends the user to a URL on the main site that is URL.com/article/view. This page has a large iframe that contains the article itself, and a small bar at the top containing the article with various links to get back to the original site. I'd like to make sure that the comment pages (URL.com/article) are indexed instead of all of the URL.com/article/view pages, which won't really do much for SEO. However, all of these pages are indexed. What would be the best approach to make sure the iframe pages aren't indexed? My intuition is to just have a "noindex" in the header of those pages, and just make sure that the conversation pages themselves are properly linked throughout the site, so that they get indexed properly. Does this seem right? Thanks for the help...
Technical SEO | jim_shook
-
Spam posts indexed, what to do now?
Hi, so we had a staff problem last week and we let through some spam posts (cheap Nike jerseys etc.) that also got indexed by Google. (We just checked and there are about 105 already indexed.) Of course we have now removed all these spam posts, but what is the best practice at this point? Are we supposed to do something else to remove these from Google's index (maybe through Google Webmaster Tools)? We have already edited robots.txt to disallow those pages as a quick remedy. And finally, could this have done any harm? We were quite slow in noticing and removing these posts. They were there for about 12 days. Thanks
Technical SEO | Gamer07
-
Search Result Page, Index or Not?
I believe Google doesn't want to index and show other search result pages in their SERPs. So instead of adding a "noindex, follow" tag, I have changed the URL of my search result page like this:
Original: http://www.mysite.com/kb-search.aspx?=travelguide&type=wiki&s=3
To: http://www.mysite.com/travelguide/attraction-guide.html
The search result page contains the titles of the articles, a short description (300 chars.) and a link to each article. Does it help? Or should I add a noindex, follow tag? Help, please?
Technical SEO | DigitalJungle
-
301 lots of old pages to home page
Will it hurt me if I redirect a few hundred old pages to my home page? I currently have a mess on my hands with many 404s showing up after moving my site to a new ecommerce server. We have been on the new server for 2 years but still have 337 404s showing up in Google Webmaster Tools. I don't think it would affect users, as very few people would find those old links, but I don't want to mess with Google. Also, how much are those 404s hurting my rank?
Technical SEO | bhsiao
-
IIS Server Load for 500 Page Level 301 Redirects
We are migrating content from 10 subdomains to our www site. On an IIS server, what is the potential server load impact, if any, of setting up 500-plus page-level redirects?
Technical SEO | DigitalMkt
-