Medium sizes forum with 1000's of thin content gallery pages. Disallow or noindex?
-
I have a forum at http://www.onedirection.net/forums/ which contains a gallery with 1000's of very thin-content pages. We've currently got these photo pages disallowed from the main googlebot via robots.txt, but we do all the Google images crawler access.
Now I've been reading that we shouldn't really use disallow, and instead should add a noindex tag on the page itself.
It's a little awkward to edit the source of the gallery pages (and keeping any amends the next time the forum software gets updated).
Whats the best way of handling this?
Chris.
-
Hey Chris,
I agree that your current implementation, while not ideal, is perfectly adequate for the purposes of ensuring you don't have duplicate content or cannibalisation problems - but still allows Google to index the UCG images.
You're also preventing Googlebot from seeing the user profile pages, which is a good idea, since many of them are very thin and mostly duplicate.
So, from a pure SEO perspective, I think you've done a good job.
However... I think you should also consider the ethical implications of potentially blocking the image googlebot as well. By preventing Google from indexing all those images of young girls fawning over the vacuous runners up of a televised talent show, you would undoubtedly be doing the world a great service.
-
Hi Chris, I second Jarno's opinion in this regard. If it is going to be a huge overhead to add the page level blocking, you can rely on your current robots.txt setup. There is a small catch here though. Even if you block using robots.txt file, if Google finds a reference to the blocked content elsewhere on the Internet, then it would index the blocked content. In situations like this, page level content blocking is the way forward. So to fully restrict Google bot indexing your content, you should ideally be using the page level robots meta tag or x-robots-tag.
Here you go for more: https://support.google.com/webmasters/answer/156449?hl=en
Hope it helps.
Best,
Devanur Rafi.
-
Chris,
is the disallow meta update is too complicated for you to add due to software issues etc. then I feel that your current method is the right way to go. Normally you would be absolutely right for the simple reason that page level overrules the robots.txt. But if a software update overrules the rules places in your code then you have to manually add it after each and every update and i'm not sure you want to do that.
regards
Jarno
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Can a page that's 301 redirected get indexed / show in search results?
Hey folks, have searched around and haven't been able to find an answer to this question. I've got a client who has very different search results when including his middle initial. His bio page on his company's website has the slug /people/john-smith; I'm wondering if we set up a duplicate bio page with his middle initial (e.g. /people/john-b-smith) and then 301 redirect it to the existent bio page, whether the latter page would get indexed by google and show in search results for queries that use the middle initial (e.g. "john b smith"). I've already got the metadata based on the middle initial version but I know the slug is a ranking signal and since it's a direct match to one of his higher volume branded queries I thought it might help to get his bio page ranking more highly. Would that work or does the 301'd page effectively cease to exist in Google's eyes?
Technical SEO | | Greentarget0 -
If I'm using a compressed sitemap (sitemap.xml.gz) that's the URL that gets submitted to webmaster tools, correct?
I just want to verify that if a compressed sitemap file is being used, then the URL that gets submitted to Google, Bing, etc and the URL that's used in the robots.txt indicates that it's a compressed file. For example, "sitemap.xml.gz" -- thanks!
Technical SEO | | jgresalfi0 -
What's the best way to integrate off site inventory?
I can't seem to make any progress with my car dealership client in rankings or traffic. I feel like I've narrowed out most of the common problems, the only other thing I can see is that all their inventory is on a subdomain using a dedicated auto dealership software. Any suggestion of a better way to handle this situation? Am I missing something obvious? The url is rcautomotive.com Thanks for your help!
Technical SEO | | GravitateOnline0 -
My website's pages are not being indexed correctly
Hi, One of our websites, which is actually a price comparison engine, facing indexing problem at Google. When we check “site:mywebsite.com “, there are lots of pages indexed which are not from mywebsite.com but from merchants websites. The index result page also shows merchant’s page title. In some cases the title is from merchant’s site but when the given link is accessed it points to mywebsite.com/index. Also the cache displays the merchant’s product page as the last indexed version rather than showing ours. The mywebsite.com has quite few Merchants that send us their product feed. Those products are listed on comparison page with prices. The merchant’s links on comparison page are all no-follow links but some of the (not all) merchant’s product pages are indexed against mywebsite.com as mentioned above instead of product comparison page of mywebsite.com How can we fix the issue? Thanks!
Technical SEO | | digitalMSB0 -
Should I worry about these 404's?
Just wondering what the thought was on this. We have a site that lets people generate user profiles and once they delete the profile the page then 404's. I was told there is nothing we can do about those from our developers, but I was wondering if I should worry about these...I don't think they will affect any of our rankings, but you never know so I thought I would ask. Thanks
Technical SEO | | KateGMaker1 -
Https-pages still in the SERP's
Hi all, my problem is the following: our CMS (self-developed) produces https-versions of our "normal" web pages, which means duplicate content. Our it-department put the <noindex,nofollow>on the https pages, that was like 6 weeks ago.</noindex,nofollow> I check the number of indexed pages once a week and still see a lot of these https pages in the Google index. I know that I may hit different data center and that these numbers aren't 100% valid, but still... sometimes the number of indexed https even moves up. Any ideas/suggestions? Wait for a longer time? Or take the time and go to Webmaster Tools to kick them out of the index? Another question: for a nice query, one https page ranks No. 1. If I kick the page out of the index, do you think that the http page replaces the No. 1 position? Or will the ranking be lost? (sends some nice traffic :-))... thanx in advance 😉
Technical SEO | | accessKellyOCG0 -
Best way to condense content on a page?
We want to add a video transcript to the same page as the video, but it doesn't really fit the design of the page. Is it fine to use CSS/DIVs to either have a "click to read full transcript" or a scroll box?
Technical SEO | | nicole.healthline0 -
What's the difference between a category page and a content page
Hello, Little confused on this matter. From a website architectural and content stand point, what is the difference between a category page and a content page? So lets say I was going to build a website around tea. My home page would be about tea. My category pages would be: White Tea, Black Tea, Oolong Team and British Tea correct? ( I Would write content for each of these topics on their respective category pages correct?) Then suppose I wrote articles on organic white tea, white tea recipes, how to brew white team etc...( Are these content pages?) Do I think link FROM my category page ( White Tea) to my ( Content pages ie; Organic White Tea, white tea receipes etc) or do I link from my content page to my category page? I hope this makes sense. Thanks, Bill
Technical SEO | | wparlaman0