Panda Updates - robots.txt or noindex?
-
Hi,
I have a site that I believe has been impacted by the recent Panda updates. Assuming that Google has crawled and indexed several thousand pages that are essentially the same and the site has now passed the threshold to be picked out by the Panda update, what is the best way to proceed?
Is it enough to block the pages from being crawled in the future using robots.txt, or would I need to remove the pages from the index using the meta noindex tag? Of course if I block the URLs with robots.txt then Googlebot won't be able to access the page in order to see the noindex tag.
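To make that last point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `/duplicates/` folder, and the example.com URLs are all hypothetical, made up for illustration. It only shows that a compliant crawler is not permitted to fetch a disallowed URL, which is why a noindex tag on that page would never be read.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking a folder of near-duplicate pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /duplicates/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs, for illustration only.
blocked_url = "https://www.example.com/duplicates/page-1.html"
open_url = "https://www.example.com/products/widget.html"

for url in (blocked_url, open_url):
    if can_crawl(ROBOTS_TXT, "Googlebot", url):
        print(f"{url}: crawlable, so a meta noindex on it can be seen")
    else:
        print(f"{url}: blocked, so a meta noindex on it cannot be seen")
```

This only models the robots.txt check itself; it says nothing about how or when Google drops pages from its index.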
Does anyone have any previous experience of doing something similar?
Thanks very much.
-
This is a good read: http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world
I think you should be careful with robots.txt, because blocking the bot's access will not cause Google to remove the content from its index. They will simply show a message to the effect that they are not sure what's on the page. I would use noindex to clear the pages out of the index first, before attempting robots.txt exclusion.
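Before moving on to the robots.txt step, it may help to confirm that the pages really do serve a noindex tag. The sketch below uses only Python's standard `html.parser`; the class and function names and the sample HTML are hypothetical, and it checks only the meta tag (not an `X-Robots-Tag` HTTP header).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a robots/googlebot meta tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None when an attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html: str) -> bool:
    """Return True if the HTML carries a meta robots noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


# Hypothetical page sources, for illustration only.
thin_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
normal_page = '<html><head><meta name="robots" content="index, follow"></head></html>'

print(has_noindex(thin_page))
print(has_noindex(normal_page))
```

In practice you would feed this the HTML fetched from each URL you want removed, and only add the robots.txt block once the pages have actually dropped out of the index.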
-
Yes, both, because if a page is linked to from another site, Google will spider that other site and discover your link, and the page could still get indexed if there is no noindex on it.
-
Indeed try both.
Irving +1
-
Both. Block the lowest-quality, lowest-traffic pages with noindex, and block the folder in robots.txt.