Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Spam URL'S in search results
-
We built a new website for a client.
When I do 'site:clientswebsite.com' in Google it shows some of the real, recently submitted pages. But it also shows many pages of spam url results, like this 'clientswebsite.com/gockumamaso/22753.htm' - all of which then go to the sites 404 page. They have page titles and meta descriptions in Chinese or Japanese too.
Some of the urls are of real pages, and link to the correct page, despite having the same Chinese page titles and descriptions in the SERPS.
When I went to remove all the spammy urls in Search Console (it only allowed me to temporarily hide them), a whole load of new ones popped up in the SERPS after a day or two. The site files itself are all fine, with no errors in the server logs.
All the usual stuff...robots.txt, sitemap etc seems ok and the proper pages have all been requested for indexing and are slowly appearing. The spammy ones continue though.
What is going on and how can I fix it?
-
Whoa, this is a weird one.
I saw that you posted this on Google's forums as well, and they suggested that this might be the Japanese keyword hack. Did you look into that? If that's not it, did you try loading the URLs that are showing up on the Wayback Machine? It's possible that someone who owned this site before your client created these pages.
Either way, the answer is to double check that your 404 pages really are 404ing. If that doesn't remove them from the index fast enough, you can actually create all of those pages, with a noindex tag, add them all to a sitemap, and submit them to Google. But the 404ing is really your long term solution.
Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
What's the best way to handle product filter URLs?
I've been researching and can't find a clear cut answer. Imagine you have a product category page e.g. domain/jeans You've a lot of options as to how to filter the results domain/jeans?=ladies,skinny,pink,10 or domain/jeans/ladies-skinny-pink-10 or domain/jeans/ladies/skinny?=pink,10 And in this how do you handle titles, breadcrumbs etc. Is the a way you prefer to handle filters and why do you do it that way? I'm trying to make my mind up as some very big names handle this differently e.g. http://www.next.co.uk/shop/gender-women-category-jeans/colour-pink-fit-skinny-size-10r VS https://www.matalan.co.uk/womens/shop-by-category/jeans?utf8=✓&[facet_filter][meta.tertiary_category][Skinny]=on&[facet_filter][variants.meta.size][Size+10]=on&[facet_filter][meta.master_colour][Midwash]=on&[facet_filter][min_current_price][gte]=6.0&[facet_filter][min_current_price][lte]=18.0&per=36&sort=
Technical SEO | | RodneyRiley0 -
Should I noindex my blog's tag, category, and author pages
Hi there, Is it a good idea to no index tag, category, and author pages on blogs? The tag pages sometimes have duplicate content. And the category and author pages aren't really optimized for any search term. Just curious what others think. Thanks!
Technical SEO | | Rignite0 -
Is it good practice to still pay for Best of the Web Directory (BOTW) and other similar one's you have to pay for?
I know that paid for links are hit by Google, but in the past these directories were okay. What about now? Thank you.
Technical SEO | | RoxBrock0 -
How to remove my cdn sub domins on Google search result?
A few months ago I moved all my Wordpress images into a sub domain. After I purchased CDN service, I again moved that images to my root domain. I added User-agent: * Disallow: / to my CDN domain. But now, when I perform site search on the Google, I found that my CDN sub domains are indexed by the Google. I think this will make duplicate content issue. I already hit by the Panguin. How do I remove these search results on Google? Should I add my cdn domain to webmaster tools to request URL removal request? Problem is, If I use cdn.mydomain.com it shows my www.mydomain.com. My blog:- http://goo.gl/58Utt site search result:- http://goo.gl/ElNwc
Technical SEO | | Godad1 -
Google's "cache:" operator is returning a 404 error.
I'm doing the "cache:" operator on one of my sites and Google is returning a 404 error. I've swapped out the domain with another and it works fine. Has anyone seen this before? I'm wondering if G is crawling the site now? Thx!
Technical SEO | | AZWebWorks0 -
Can I format my H1 to be smaller than H2's and H3's on the same page?
I would like to create a web design with 12px H1 and for sub headings on the page to be more like 24px. Will search engines see this and dislike it? The reason for doing it is that I want to put a generic page title in the banner, and more poetic headings above the main body. Example: Small H1: Wholesale coffee, online coffee shop and London roastery Large h2: Respect the bean... Thanks
Technical SEO | | Crumpled_Dog
Scott0 -
Blocking URL's with specific parameters from Googlebot
Hi, I've discovered that Googlebot's are voting on products listed on our website and as a result are creating negative ratings by placing votes from 1 to 5 for every product. The voting function is handled using Javascript, as shown below, and the script prevents multiple votes so most products end up with a vote of 1, which translates to "poor". How do I go about using robots.txt to block a URL with specific parameters only? I'm worried that I might end up blocking the whole product listing, which would result in de-listing from Google and the loss of many highly ranked pages. DON'T want to block: http://www.mysite.com/product.php?productid=1234 WANT to block: http://www.mysite.com/product.php?mode=vote&productid=1234&vote=2 Javacript button code: onclick="javascript: document.voteform.submit();" Thanks in advance for any advice given. Regards,
Technical SEO | | aethereal
Asim0 -
A question about RSS feeds and nofollow's
With the nofollow tag used very widely on the internet these days I was just wondering about how an RSS feed might help me find a way around it. Basically my question is this : I post a comment on a blog, it's approved and my comment together with my link(nofollow tag applied) is there. Now when the blogs RSS feed updates, does this nofollow tag get applied to the feed? As far as I can tell it does not - but I'm not too clue'd up on how the feed is generated. Anyone want to help me understand how it works and if what I'm suggesting would be 'a way around the nofollow tag' ? Thanks 🙂
Technical SEO | | DanHill0