How to de-index a page with a search string with the structure domain.com/?"spam"
-
The site in question was hacked years ago. All the security scans come up clean, but SEO crawlers like SEMrush and Ahrefs still show it as an indexed page. I can even click through on it, and it takes me to the homepage with no 301. Where is the page, and how do I de-index it?
domain.com/?spam
There are multiple instances of this.
http://www.clipular.com/c/5579083284217856.png?k=Q173VG9pkRrxBl0b5prNqIozPZI
-
You are most welcome. I'm glad to hear your road to site recovery is coming along. I'm also glad to confirm that, to the best of my knowledge, your understanding of the "*" operator and the Disallow: /?spam string is correct. One more thing:
Fetch as Google and Request Indexing
Apologies, I neglected to mention this step in my answer. It should be included. This is the best tool I'm aware of to ask Google, "hey, crawl me please." Do this after you upload your shiny new robots.txt.
In GSC, under Crawl, select Fetch as Google. Then, select Fetch and Render. When the status is partial or complete, click Request Indexing. There is no guarantee here, and in my experience Google does what it wants. Even so, I've seen results in less than 2 hours (full disclosure: the longest I've waited has been 3 days).
Penalty Free
I agree. They cannot possibly be penalizing your site. At least, not purposefully. You have taken all recommended actions and then some to resolve site issues. Even if you do have a few bad backlinks floating around out there from some blackhat t3 site PBN, Penguin 4.0 should discredit that bad link juice. Your site doesn't even have the offending pages. It's just a matter of time before Google's index lines back up with your live site.
Good Work Sir,
Wipe the Index Clean,
CopyChrisSEO and the Vizergy Team -
Thanks very much for your explanation.
I have gone ahead and temporarily blocked the pages in GSC.
I am working on the robots.txt and see there are no instructions telling the crawlers to skip the URLs in question.
I understand that I should use the "*" operator to alert all crawlers to disallow the pages in this format:
user-agent: *
Disallow: /?spam string
Finally, I will send the suggested edit to Google and see where that gets me. Honestly, at this point, they cannot possibly penalize the site any worse, so anything that works towards cleaning up the index for the site will be a step in the right direction.
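Before uploading a rule like the one above, it can help to check which URLs it would actually cover. Here is a minimal Python sketch, assuming the matching behaviour Google documents for robots.txt paths (prefix match, "*" as a wildcard, "$" as an end anchor). The function names are made up for illustration, "spam" and domain.com are the placeholders from this thread, and Allow rules and rule precedence are ignored on purpose.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Turn a robots.txt path pattern into a regex.

    Assumed semantics: prefix match, '*' matches any run of
    characters, a trailing '$' anchors the end of the URL.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape everything literally, then re-join the pieces around '*'.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(url: str, disallow_rules: list[str]) -> bool:
    """Return True if any Disallow pattern matches the URL's path plus query."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # re.match anchors at the start, which gives the prefix behaviour.
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules if rule)


rules = ["/?spam"]  # placeholder pattern from this thread
print(is_disallowed("http://domain.com/?spam-pills", rules))
print(is_disallowed("http://domain.com/", rules))
```

The point of a check like this is to confirm the homepage itself and ordinary pages stay crawlable while every hacked URL variant is covered.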
-
Hello Miamirealestatetrendsguy and fellow Mozers,
It sounds like you have had a crazy time handling this hack. Good news is, as far as I can tell from the given information, you are close to resolution. Googlebot should correct the indexed pages over time. I'm certain you would like to expedite that process. Here are three recommendations that come to mind: Remove URLs via GSC, block the offending URLs via robots.txt, and suggest edits in Google's SERPs.
Remove URLs via GSC
In GSC, under Google Index, select Remove URLs. This suppression is temporary, however. Click on more information for more about that. My experience with it has been suppression for a few months. Don't worry about the time though. Our next step should take effect before your time is up.
Block the Offending URLs via Robots.txt
Before you do this, be very certain what you are doing. After you are confident, list your offending URLs, add Disallow rules for them to your robots.txt, and upload it. Hopefully, you can find commonalities to shorten this list and save your time.
Note: I have purposefully avoided the details on how to do this here because it is vital SEOs learn how to do it with full knowledge of potential risks as well as how to avoid those risks. Here are some resources:
• Google Support
• Moz's Robots.txt Rundown
• Search Engine Land's Deeper Look
Suggest Edits in Google's SERPs
This one is iffy, and I really don't trust Google to use this feedback. However, I have done it and it worked more than once. Find your offending results and send specific feedback.
Wipe that Index Clean,
CopyChrisSEO and the Vizergy Team
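On the robots.txt step above, "find commonalities to shorten this list" can be roughed out in a few lines of Python. This is only a sketch under assumptions: the function names and example URLs are hypothetical, and it simply takes the longest shared leading string of the offending paths and query strings. Review the result by hand before uploading anything.

```python
import os
from urllib.parse import urlsplit


def path_and_query(url: str) -> str:
    """Return the part of the URL that robots.txt rules are matched against."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target


def build_robots_txt(offending_urls: list[str]) -> str:
    """Build a single Disallow rule from the prefix shared by all offending URLs."""
    # os.path.commonprefix compares character by character, which suits URL strings.
    prefix = os.path.commonprefix([path_and_query(u) for u in offending_urls])
    if prefix in ("", "/"):
        # A bare "/" would block the whole site, so refuse to emit it.
        raise ValueError("No safe common prefix; list the URLs individually instead")
    return f"User-agent: *\nDisallow: {prefix}\n"


# Hypothetical offending URLs in the format discussed in this thread.
urls = [
    "http://domain.com/?spam-one",
    "http://domain.com/?spam-two",
]
print(build_robots_txt(urls))
```

A very short prefix such as "/?" would also block every legitimate query string on the homepage, so a rule that short deserves a second look.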