Please help :) Troubles getting 3 types of content de-indexed
-
Hi there,
I know that it takes time, and I already submitted a URL removal request 3-4 months ago.
But I would really appreciate some kind advice on this topic. Thank you in advance to everyone who contributes!
1) De-indexing archives
Google had indexed all my:
/tag/
/authorname/
archives. I set them to noindex a few months ago, but they still appear in search results.
Is there anything I can do to speed up this de-indexing?
2) De-index /plugins/ folder in WordPress site
They have also indexed my whole /plugins/ folder. So I added a disallow for /plugins/ in my robots.txt 3-4 months ago, but /plugins/ still appears in search results.
What can I do to get the /plugins/ folder de-indexed?
Is my disallow of /plugins/ in robots.txt making it worse, because Google has already indexed it and now it can't access the folder? How do you solve this?
3) De-index a subdomain
I had created a subdomain containing adult content and completely deleted it from my cPanel 3 months ago, but it still appears in search engines.
Anything else I can do to get it de-indexed?
Thank you in advance for your help!
-
Hi Fabio
If the content is gone, do you get a 404 code when you visit your old URLs? You can plug the old URLs into urivalet.com to see what code is returned. If you do get a 404, then you're all set. If you don't, see if you can just upload a robots.txt file to that subdomain and block all search engines. Here's info on how to do that: http://www.robotstxt.org/robotstxt.html
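For anyone who prefers a script to a web tool, here is a rough sketch of the same status-code check using only Python's standard library. The helper names `fetch_status` and `is_gone` and the URL in the comment are made up for illustration, and the sketch assumes the old URL still resolves to a web server.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url (urllib follows redirects)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx answers are raised as HTTPError; the code is what we want.
        # A subdomain that no longer resolves raises URLError instead,
        # which this sketch deliberately does not handle.
        return error.code


def is_gone(status: int) -> bool:
    """True when the status tells crawlers the page no longer exists."""
    return status in (404, 410)


# Hypothetical usage, replace the URL with one of your old URLs:
#   print(is_gone(fetch_status("http://adult.mywebsite.com/")))
```

Because redirects are followed, a URL that 301s to a live page reports the status of the final page, so it will not count as gone.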
-Dan
-
Hey Dan, there is no content.
The whole website has been deleted, but it still appears in search results. What should I do?
Should I put back some content and then de-index it? Thanks!
Fabio
-
Hi There
You should ensure the content either:
- has meta noindex tags
- or is blocked with robots.txt
- or returns a 404 or 410 (is missing)
And then use the URL removal tool again and see if that works.
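The checklist above can be written down as a small predicate. This is only a sketch of the rule as stated in this thread, not of how the removal tool itself works; `UrlState` and `removal_ready` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class UrlState:
    """What we know about one URL before requesting its removal."""
    status: int                   # HTTP status code the URL returns
    has_noindex: bool = False     # page carries a meta robots noindex
    robots_blocked: bool = False  # URL is disallowed in robots.txt


def removal_ready(state: UrlState) -> bool:
    """True when at least one of the three conditions from the list holds."""
    return (
        state.status in (404, 410)
        or state.has_noindex
        or state.robots_blocked
    )
```

For example, `removal_ready(UrlState(status=301))` is false: a redirecting URL meets none of the three conditions.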
-
Hey Dan, thanks a lot for all your help!
There still is a problem though. A while ago I had created an adult subdomain: adult.mywebsite.com
Then I completely deleted everything inside it (even though I noticed the subfolder is still in my account).
A few days ago, when I started this thread, I also created a GWMT account for adult.mywebsite.com and submitted a removal request for all those URLs (about 15).
Now today when I check:
site:mywebsite.com
or
site:adult.mywebsite.com
the URLs still appear in search results.
When I check
cache:adult.mywebsite.com
it sends me to a Google 404 page:
http://webcache.googleusercontent.com/search?/complete/search?client=hp&hl=en&gs_rn=31&gs_ri=hp&cp=26&gs_id=s xxxxxxxxxxxxxxxxxxxxxxxx
So I don't know what this means...
Does it mean Google hasn't deindexed them?
How do I get them deindexed?
Is it possible Google is having trouble de-indexing them because they have no content in them or something like that? What should I do to get rid of them?
Thanks a lot!
Fabio
-
Hey Fabio
Regarding #2, I'd give it a little bit more time. 301s take a little longer to drop out, so maybe check back in a week or two. Technically, the URL removal will mainly work if the content now 404s, is noindexed, or is blocked in robots.txt, but with a redirect you can do none of those, so you just have to wait for them to pick up on the redirects.
-Dan
-
Hi Dan,
1. Ok! I will.
2. When I click on the /go/ link in search results it redirects me to the affiliate website. I asked for the removal of /go/ a few days ago, but they (about 30 results) still appear in google when I search with the site:mywebsite.com trick.
What should I do about it? How can I get rid of them? They were created with the SimpleUrl plugin which I deleted about 3 months ago though.
3. Got it!
Thanks!
Fabio
-
Hi There
1. For the flash file NoReflectLight.swf - I would do a removal request in WMT and maintain the blocking in robots.txt of /plugins/
2. When you do a URL removal in WMT, the files need to either be blocked in robots.txt, have a noindex on them, or 404. Doesn't that sort of link redirect to your affiliate product? In other words, if I were to try to visit /go/affiliate-product/, would it redirect to www.affiliateproductwebsite.com? Or does /go/affiliate-product/ load its own page on your site?
3. I would maintain the robots.txt blocking on /plugins/ - if no other files from there are indexed, they will not be in the future.
-Dan
-
Hey Dan,
thanks for the quick reply. I have gone through site:mywebsite.com and I found that tags and categories disappeared, but there still is some content that shouldn't be indexed, like this:
mywebsite.com/wp-content/plugins/wp-flash-countdown/counter_cs3_v2_NoReflectLight.swf
and this:
mywebsite.com/go/affiliate-product/
and I found this:
Disallow: /wp-content/plugins/
in my robots.txt
Thing is that:
- I deleted that wp-flash-countdown plugin at least 9 months ago
- I have manually removed all the URLs with /go/ from GWMT, and when I search for a cached version of them they are not there
- If I remove Disallow: /wp-content/plugins/ from my robots.txt, won't that get all my plugins' pages indexed? So how do I make sure they are not indexed?
Thank you so much for your help! So far you have been the most helpful answerer in this forum.
-
Hey There
You want to look for "noindex" in the robots meta tag in the page source.
You can just do a Ctrl-F (to search text in the source) and type in "noindex"; it should be present on the tag archives.
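If you would rather not eyeball the source, the same check can be scripted. The sketch below uses Python's built-in `html.parser` to list every robots meta tag on a page and to flag the case where two tags disagree about noindex. The names `RobotsMetaFinder`, `robots_directives` and `has_conflict` are hypothetical, and it only looks at `<meta name="robots">`, not at HTTP headers or bot-specific tags.

```python
from html.parser import HTMLParser
from typing import List


class RobotsMetaFinder(HTMLParser):
    """Collect the content of every <meta name="robots"> tag on a page."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            self.directives.append((attributes.get("content") or "").lower())


def robots_directives(html: str) -> List[str]:
    """Return the content value of each robots meta tag, in page order."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


def has_conflict(directives: List[str]) -> bool:
    """True when some robots tags say noindex and others do not."""
    flags = ["noindex" in directive for directive in directives]
    return any(flags) and not all(flags)
```

Feed it the HTML you get from "view source" on a tag archive: an empty list means no robots meta tag was found, and more than one entry usually means a theme and a plugin are both writing the tag.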
-Dan
-
Hey Dan, thanks a lot for your help.
I have tried the cache trick on my home page and the cached version was about 4-5 days old.
I then tried cache:mywebsite/tag/ and it gives me a Google 404 Not Found, which I suppose is a good sign.
But if they have been de-indexed why do they appear in search results then?
I am not sure how to check the double SEO no-index in the source code though. How do I do that exactly? What should I look for after right-clicking -> source code?
Thanks for your help!
My MOZ account ends in two days so I may not be able to reply back next time.
-
Hi There
I should have explained better.
If you type cache: in front of any web URL, for example cache:apple.com, you get Google's cached copy of that page, with the date it was cached shown at the top.
See the "cache" date? This is not the same as the crawl date, but it can give you a rough indication of how often Google might be looking at your pages.
So try that on some of your tag archives and if the cache date is say 4+ weeks ago maybe Google isn't looking at the site very often.
But it's odd they haven't been removed yet, especially with the URL removal tool - that tool usually only takes a day. Noindex tags usually only take a week or two.
Have you examined the source code to make sure it does in fact say "noindex" in the robots tag, and that there is not a conflicting duplicate robots tag? Sometimes WordPress themes and plugins both try adding SEO tags and you can end up with duplicates.
-Dan
-
Hey Dan thanks,
well, so Google had indexed all my tags, categories and stuff. The only things I had blocked in my robots.txt were
/go/ for affiliate links
and
/plugins/ for plugins
so I did let Google see that category and archive pages were noindexed.
I also submitted the removal request many months ago, but I haven't quite understood what you say about the cache dates. What should I check?
Thanks for your help!
-
Hi There
For all these cases above, this may be a situation where you've BOTH blocked these in robots.txt and added noindex tags. You cannot block the directories in robots.txt and also get them deindexed, because Google then cannot crawl the URLs to see the noindex tag.
If this is the case, I would remove any disallows for /tag/ etc. in robots.txt, allow Google to crawl the URLs to see the noindex tags, then wait a few weeks and see what happens.
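One way to spot this situation is to test the noindexed URLs against your robots.txt rules. Below is a minimal sketch with Python's `urllib.robotparser`. The rules in `ROBOTS_TXT` and the example URLs are invented for illustration, and the standard-library parser does simple prefix matching, so it may not agree with Google on wildcard rules.

```python
from urllib.robotparser import RobotFileParser


def googlebot_can_crawl(robots_txt: str, url: str) -> bool:
    """True when the given robots.txt text lets Googlebot fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


# Hypothetical robots.txt that blocks the tag archives and the plugins folder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /wp-content/plugins/
"""

# A noindexed tag archive that is also disallowed can never be re-crawled,
# so the noindex tag on it is never seen.
blocked = not googlebot_can_crawl(ROBOTS_TXT, "http://mywebsite.com/tag/seo/")
```

Any URL that is both noindexed and reported as blocked here is a candidate for having its Disallow line removed.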
As far as the URL removal not working, make sure you have the correct subdomain registered - www or non-www etc for the URLs you want removed.
If neither one of those is the issue, please write back so I can try to help you more with that. Google should deindex the pages in a week or two under normal circumstances. The other thing is, check the cache date of the pages. If the cache dates are prior to the date you added the noindex, Google might not have seen the noindex directives yet.
-Dan