How can I get unimportant pages out of Google?
-
Hi Guys,
I have a (newbie) question. Until recently I didn't have my robots.txt written properly, so Google indexed around 1900 pages of my site, but only 380 of those are real pages; the rest are all /tag/ or /comment/ pages from my blog. I have now set up the sitemap and the robots.txt properly, but how can I get the other pages out of Google? Is there a trick, or will it just take a little time for Google to drop the pages?
Thanks!
Ramon
-
If you want to remove an entire directory, you can exclude that directory in robots.txt, then go to Google Webmaster Tools and request a URL removal. You'll have an option to remove an entire directory there.
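As a rough illustration of what "excluding a directory in robots.txt" means for a crawler, here is a minimal Python sketch using the standard library's `urllib.robotparser`. The rules in `ROBOTS_TXT`, the `example.com` URLs, and the helper name `is_crawlable` are assumptions for the example, modelled on the /tag/ and /comment/ paths from the question; they are not taken from the asker's real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch):
# block the /tag/ and /comment/ directories for every crawler.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /comment/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("https://www.example.com/tag/seo/", "https://www.example.com/about/"):
        print(url, "->", "crawlable" if is_crawlable(url) else "blocked")
```

This only shows which URLs the rules block from crawling; it says nothing about whether an already indexed URL will be dropped, which is the point the later replies discuss.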
-
No, sorry. What I said is that marking the folder as Disallow in robots.txt will not remove pages that are already indexed.
The meta tag will: when the spiders crawl the page again and see the noindex tag, they will remove it.
So you cannot add the directory to robots.txt yet, not before the pages have been removed from the search engines.
First, put the noindex tag on all the pages you want to remove. Removal takes from a week to a month. Once the pages are gone, add the folders you do not want indexed to your robots.txt.
After that, you don't need to worry about the tags.
I say this because if you add the block to robots.txt first, the search engines will not read the pages anymore, so they will never see the meta noindex tag. Therefore you must first remove the pages with the noindex tag and only then add them to robots.txt.
Hope this has helped.
João Vargas
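The ordering described in this reply can be sketched as a small decision function. This is only an illustration of the reasoning in the post, not anything Google documents in this form; the function names and the wording of the returned steps are made up for the example.

```python
def crawler_can_see_noindex(blocked_in_robots: bool, has_noindex: bool) -> bool:
    """A crawler only reads the noindex tag on pages it is allowed to fetch."""
    return has_noindex and not blocked_in_robots


def next_step(still_indexed: bool, blocked_in_robots: bool, has_noindex: bool) -> str:
    """Suggest the next action for one URL, following the order in the reply."""
    if still_indexed:
        if blocked_in_robots:
            # The block hides the tag from the crawler, so lift it first.
            return "remove the robots.txt block so the page can be recrawled"
        if not has_noindex:
            return "add the noindex tag"
        return "wait for the recrawl (about a week to a month)"
    if not blocked_in_robots:
        return "add the folder to robots.txt"
    return "done"


if __name__ == "__main__":
    print(next_step(still_indexed=True, blocked_in_robots=True, has_noindex=True))
    print(next_step(still_indexed=False, blocked_in_robots=False, has_noindex=True))
```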
-
Thanks Vargas. If I go with noindex, I should remove the block from robots.txt, right?
I understood that if a page has a noindex tag and is also disallowed in robots.txt, the search engine will still index it. Is that true?
-
To remove the pages you want, you need to add this tag to them:
<meta name="robots" content="noindex">
If you want internal and external link relevance to pass through these pages, use:
<meta name="robots" content="noindex, follow">
If you do block them in robots.txt, you only need to add the tag to the URLs that are currently indexed; search engines will not index new ones.
In my opinion, I do not like using the Google URL remover, because if you someday want those folders indexed again, they may not be; at least that has happened to me.
The noindex tag works very well for removing unwanted content; the pages should be gone within a month or so.
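To check that a page really serves the robots meta tag described above, a small script can parse the HTML and list the directives it finds. This is a minimal sketch using Python's standard `html.parser`; the class name `RobotsMetaParser`, the helper `robots_directives`, and the sample markup are assumptions for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from every <meta name="robots"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # "noindex, follow" becomes {"noindex", "follow"}
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set[str]:
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print("noindex" in robots_directives(sample))
```

Running it over the /tag/ and /comment/ URLs before waiting for a recrawl is a cheap way to catch a template that forgot the tag.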
-
Yes. It's only a secondary level aid, and not guaranteed, yet it could help speed up the process of devaluing those pages in Google's internal system. If the system sees those, and cross-references to the robots.txt file it could help.
-
Thanks guys for your answers....
Alan, do you mean that I should place the tag below on all the pages that I want out of Google?
-
I agree with Alan's reply. Try the canonical tag first. If you don't see any change, remove the URLs in Google Webmaster Tools (GWT).
-
There's no bulk page request form, so you'd need to submit every URL one at a time, and even then it's not a guaranteed way. You could consider putting a canonical tag on those specific pages that points to a different URL on your blog, such as an appropriate category page or the blog home page. That could help speed things up, but canonical tags themselves are only "hints" to Google.
Ultimately it's a time and patience thing.
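For reference, the canonical tag mentioned above is a `<link rel="canonical">` element in the page's head. A minimal Python sketch of generating one is shown here; the helper name `canonical_link` and the `example.com` target are hypothetical, and where the tag should point (category page or blog home) is the judgment call described in the reply.

```python
from html import escape


def canonical_link(target_url: str) -> str:
    """Build a canonical link element pointing at the preferred URL."""
    # Escape the URL so characters like & or " cannot break the attribute.
    return f'<link rel="canonical" href="{escape(target_url, quote=True)}">'


if __name__ == "__main__":
    # Hypothetical: a /tag/ page pointing at the blog home page.
    print(canonical_link("https://www.example.com/blog/"))
```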
-
It will take time, but you can help it along by using the url removal tool in Google Webmaster Tools. https://www.google.com/webmasters/tools/removals