Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
URL Injection Hack - What to do with spammy URLs that keep appearing in Google's index?
-
A website was hacked (URL injection) but the malicious code has been cleaned up and removed from all pages. However, whenever we run a site:domain.com in Google, we keep finding more spammy URLs from the hack. They all lead to a 404 error page since the hack was cleaned up in the code. We have been using the Google WMT Remove URLs tool to have these spammy URLs removed from Google's index but new URLs keep appearing every day. We looked at the cache dates on these URLs and they are vary in dates but none are recent and most are from a month ago when the initial hack occurred.
My question is...should we continue to check the index every day and keep submitting these URLs to be removed manually? Or since they all lead to a 404 page will Google eventually remove these spammy URLs from the index automatically?
Thanks in advance Moz community for your feedback.
-
If the urls follow any particular pattern then you can use a htaccess redirect and return the header code 410 / 403 / 404 to Google. (I suggest 410) They will soon drop out of the index.
I don't know the exact .htaccess syntax off the top of my head but it will be something like this:
If they all come from the same folder then it would look something like this:
RedirectMatch 410 ^/folder/.*$If they have a common character string after the forward slash (such as xyz) then it would look something like this:
RedirectMatch 410 ^/xyz.*$If they have any common character string footprints at all (such as xyz) then it would look something like this (now I'm guessing):
RedirectMatch 410 ^/()xyz.$This would be a pretty easy fix if all of those spammy urls have any common characters after the forward slash or they all originate from a certain folder.
-
You might get a little quicker removal if you send them with a 410 status code. That will let Google know that the page is gone for good. http://searchenginewatch.com/sew/how-to/2340728/matt-cutts-on-how-google-handles-404-410-status-codes
-
No problem at all! These new URLs do not actually exist on the website. Since we cleaned up the malicious code all of these URLs redirect to our 404 page.
-
Sorry to misunderstand the problem. Do those new urls actually exist on your site or just in search?
-
Hi 94501,
Thanks for taking the time to respond. Just to be clear, we are not submitting multiple removals for the same URL and I don't think Google WMT even allows you to do that. Completely new URLs are appearing each day after removing the older ones.
My main concern is having spammy URLs indexed and associated with my website and the negative effects it can have from an SEO perspective.
-
Hi Pete,
It sounds like you've done what you can. I wouldn't submit multiple removals for the same url.
I assume it's out of your site map and you're not still being hacked and have figured out how it happened and taken steps to fix it.
Google will eventually figure it out. I'd try to move on to new stuff.
Best... Mike
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Staging website got indexed by google
Our staging website got indexed by google and now MOZ is showing all inbound links from staging site, how should i remove those links and make it no index. Note- we already added Meta NOINDEX in head tag
Intermediate & Advanced SEO | | Asmi-Ta0 -
How does educational organization schema interact with Google's knowledge graph?
Hi there! I was just wondering if the granular options of the Organization schema, like Educational Organization (http://schema.org/EducationalOrganization) and CollegeOrUniversity (http://schema.org/CollegeOrUniversity) schema work the same when it comes to pulling data into the knowledge graph. I've typically always used the Organization schema for customers but was wondering if there are any drawbacks for going deep into the hierarchy of schema. Cheers 😄
Intermediate & Advanced SEO | | Corbec8880 -
Google does not want to index my page
I have a site that is hundreds of page indexed on Google. But there is a page that I put in the footer section that Google seems does not like and are not indexing that page. I've tried submitting it to their index through google webmaster and it will appear on Google index but then after a few days it's gone again. Before that page had canonical meta to another page, but it is removed now.
Intermediate & Advanced SEO | | odihost0 -
Does Google Index URLs that are always 302 redirected
Hello community Due to the architecture of our site, we have a bunch of URLs that are 302 redirected to the same URL plus a query string appended to it. For example: www.example.com/hello.html is 302 redirected to www.example.com/hello.html?___store=abc The www.example.com/hello.html?___store=abc page also has a link canonical tag to www.example.com/hello.html In the above example, can www.example.com/hello.html every be Indexed, by google as I assume the googlebot will always be redirected to www.example.com/hello.html?___store=abc and will never see www.example.com/hello.html ? Thanks in advance for the help!
Intermediate & Advanced SEO | | EcommRulz0 -
My site shows 503 error to Google bot, but can see the site fine. Not indexing in Google. Help
Hi, This site is not indexed on Google at all. http://www.thethreehorseshoespub.co.uk Looking into it, it seems to be giving a 503 error to the google bot. I can see the site I have checked source code Checked robots Did have a sitemap param. but removed it for testing GWMT is showing 'unreachable' if I submit a site map or fetch Any ideas on how to remove this error? Many thanks in advance
Intermediate & Advanced SEO | | SolveWebMedia0 -
How to de-index old URLs after redesigning the website?
Thank you for reading. After redesigning my website (5 months ago) in my crawl reports (Moz, Search Console) I still get tons of 404 pages which all seems to be the URLs from my previous website (same root domain). It would be nonsense to 301 redirect them as there are to many URLs. (or would it be nonsense?) What is the best way to deal with this issue?
Intermediate & Advanced SEO | | Chemometec0 -
How to find all indexed pages in Google?
Hi, We have an ecommerce site with around 4000 real pages. But our index count is at 47,000 pages in Google Webmaster Tools. How can I get a list of all pages indexed of our domain? trying to locate the duplicate content. Doing a "site:www.mydomain.com" only returns up to 676 results... Any ideas? Thanks, Ben
Intermediate & Advanced SEO | | bjs20100 -
Removing Dynamic "noindex" URL's from Index
6 months ago my clients site was overhauled and the user generated searches had an index tag on them. I switched that to noindex but didn't get it fast enough to avoid being 100's of pages indexed in Google. It's been months since switching to the noindex tag and the pages are still indexed. What would you recommend? Google crawls my site daily - but never the pages that I want removed from the index. I am trying to avoid submitting hundreds of these dynamic URL's to the removal tool in webmaster tools. Suggestions?
Intermediate & Advanced SEO | | BeTheBoss0