Website blog is hacked. Whats the best practice to remove bad urls
-
Hello
So our site was hacked which created a few thousand spam URLs on our domain. We fixed the issue and changed all the spam urls now return 404. Google index shows a couple of thousand bad URLs.
My question is-
What's the fastest way to remove the URLs from google index. I created a site map with sof the bad urls and submitted to Google. I am hoping google will index them as they are in the sitemap and remove from the index, as they return 404.
Any tools to get a full list of google index? ( search console downloads are limited to 1000 urls). A Moz site crawl gives larger list which includes URLs not in Google index too. Looking for a tool that can download results from a site: search.
Any way to remove the URLs from the index in bulk? Removing them one by one will take forever.
Any help or insight would be very appreciated.
-
Technically 404 means "temporarily unavailable but coming back later" so you might want to consider Status 410 instead of 404. You could also supplement it with Meta no-index, if you can't use the HTML implementation then fire the no-index directive through the HTTP header using X-robots:
https://developers.google.com/search/reference/robots_meta_tag (scroll down a little to find the relevant part)
E.g:
"HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
(…)
X-Robots-Tag: noindex
(…)"... something like that.
You can't use Search Console to remove URLs from Google at all. The remove URL tool, only removes URLs one at a time and it only does so 'temporarily', the URLs pop back again after a bit. The best thing you can do is give Google some harsher directives and hope they listen, in a month or two most of those should be gone
Don't use robots.txt on the URLs as, if Google can't crawl them it won't find the 410s or the no-index directives
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
URL Structure
I'm going through the process of redesigning our website, and the URL structure was brought up. We currently have our URLs structured as domain.com/keyword. It seems that some people think setting your URLs up to look like: domain.com/directory/keyword makes more sense from a user's perspective, and from a search engine's perspective. With our directories labeled as services, solutions, clients - I see no value in adding directories as it dilutes the keyword and brings the keyword further away from the domain. Are there situations where adding a directory before the page in the URL makes sense? If anyone has data showing the difference between the two that'd be great! Thanks, Brian
Technical SEO | | PrasoonGoel0 -
Is it worth changing our blog post URL's?
We're considering changing the URL's for our blog posts and dropping the date information. Ex. http://spreecommerce.com/blog/2012/07/27/spree-1-1-3-released/ changes to http://spreecommerce.com/blog/spree-1-1-3-released/ Based on what I've learned here the new URL is better for SEO but since these pages already exist do we risk a minor loss of Google juice with 301 redirects? We have a sitemap for the blog posts so I imagine this wouldn't be too hard for Google to learn the new ones.
Technical SEO | | schof0 -
Cloaking? Best Practices Crawling Content Behind Login Box
Hi- I'm helping out a client, who publishes sale information (fashion sales etc.) In order for the client to view the sale details (date, percentage off etc.) they need to register for the site. If I allow google bot to crawl the content, (identify the user agent) but serve up a registration light box to anyone who isn't google would this be considered cloaking? Does anyone know what the best practice for this is? Any help would be greatly appreciated. Thank you, Nopadon
Technical SEO | | nopadon0 -
What's the best way to eliminate duplicate page content caused by blog archives?
I (obviously) can't delete the archived pages regardless of how much traffic they do/don't receive. Would you recommend a meta robot or robot.txt file? I'm not sure I'll have access to the root directory so I could be stuck with utilizing a meta robot, correct? Any other suggestions to alleviate this pesky duplicate page content issue?
Technical SEO | | ICM0 -
Website hacked
Hi I've been asked to help a colleague with his website. It seems to be hacked. He recently received an e-mail from Google saying his adwords account was suspended 'due to high probability his site may be hosting or distributing malicious software' I just checked his source and there seems to loads of weird on code on his pages, this would not have been but on by any members of the website owners. Please image attached when we try to access his website via google search I just contacted the hosting provider - does anyone have experience with this and how to prevent such hacking in the future. The site is build using HTML with no CMS. IjW19.jpg
Technical SEO | | Socialdude0 -
Language/country redirect best practice?
Hi, What is SEO best practice when it comes to redirecting users from www.domain.com to their specific language/country, let's say www.domain.com/de for Germany? From what I heard in on of the whiteboard fridays, it seems to be Javascript based on IP and browser language, and then set a cookie - correct? Or should we let our users manually select their language/country at the first visit? Any suggestion appreciated, thanks!
Technical SEO | | rtora0 -
Website Page Structuring and URL re writing - need helpful resources
Hello, I am not technically very sound and I need some good articles that teach me how to think about and go about website pages structuring and url rewriting that is seo friendly. I will be most obliged if some of you great seomoz-ers can pitch in with help. Regards, Talha ZigZag Solutions
Technical SEO | | TopGearMedia0 -
New website with slightly new urls
Hi we recently designed our website in work and changed some of the urls. the old site used to be http://www.example.ie/contact-us.htm now it's is http://example.ie/get-in-touch The problem we are having is with sitelinks (the ones auto generate in the serp) ie: about, contact us, team etc etc. Once cliked on, these OLD links are all going to 404 pages because of the change of url. Help with this would be greatly appreciated - I was thinking of blocking these old sitelinks in google web master.
Technical SEO | | GlenBOB0