Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search

HD_Leona

Hi!

I have pages within my forum where visitors can upload photos. When they upload a photo they provide a short statement about it but no real information about the image, definitely not enough for the page to be deemed worthy of being indexed. However, our industry leans heavily on images, and having our images in Google Image search is important to us.

The URL structure looks like this: domain.com/community/photos/~username~/picture111111.aspx

I wish to block the whole folder from Googlebot to prevent these low-quality pages from being added to Google's main search results. This would be something like this:

User-agent: googlebot
Disallow: /community/photos/

Can I disallow Googlebot specifically, rather than using User-agent: *, so that Googlebot-Image can still pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to help with visibility, but as for blocking the pages while still getting the images picked up: is this possible?

Thanks!

Leona

Dr-Pete

Are you seeing the images getting indexed, though? Even if GWT recognizes the robots.txt directives, blocking the pages may essentially keep the images from having any ranking value. Like Matt, I'm not sure this will work in practice.

Another option would be to create an alternate path to just the images, like an HTML sitemap with just links to those images and decent anchor text. The ranking power still wouldn't be great (you'd have a lot of links on this page, most likely), but it would at least kick the crawlers a bit.
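As a rough illustration of that alternate path, here is a minimal Python sketch that builds a plain HTML list of image links with descriptive anchor text. The function name `build_image_sitemap`, the example image URL, and the description are all hypothetical; nothing here reflects how the forum software actually stores photos.

```python
from html import escape


def build_image_sitemap(images, title="Community photos"):
    """Build a simple HTML fragment linking to each image.

    images: iterable of (url, anchor_text) pairs (hypothetical data).
    """
    items = "\n".join(
        # Escape both the URL and the anchor text so user-supplied
        # descriptions cannot break the markup.
        f'  <li><a href="{escape(url, quote=True)}">{escape(text)}</a></li>'
        for url, text in images
    )
    return f"<h1>{escape(title)}</h1>\n<ul>\n{items}\n</ul>"


if __name__ == "__main__":
    sample = [("/community/photos/example/picture1.jpg", "Red & white touring bike")]
    print(build_image_sitemap(sample))
```

A page like this would still need to be crawlable itself, and splitting it into several smaller pages may help if the number of links gets very large.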

HD_Leona

Thanks Matt for your time and assistance! Leona

Matt-Williamson

Hi Leona - what you have done is along the lines of what I thought would work for you. Sorry if I wasn't clear in my original response. I thought you were asking whether, if you created a robots.txt that disallowed Googlebot only, Googlebot-Image would still pick up the photos. As I said, that wouldn't be the case, because Googlebot-Image follows the rules set out for Googlebot unless you specify otherwise using the Allow directive, as I mentioned. Glad it has worked for you - keep us posted on your results.

HD_Leona

Hi Matt,

Thanks for your feedback!

I don't believe that Googlebot overrides Googlebot-Image; otherwise, specifying rules for one particular Google bot wouldn't work, correct?

I set up the following:

User-agent: googlebot
Disallow: /community/photos/

User-agent: googlebot-Image
Allow: /community/photos/

I tested the results in Google Webmaster Tools which indicated:

Googlebot: Blocked by line 26: Disallow: /community/photos/ (Detected as a directory; specific files may have different restrictions)

Googlebot-Image: Allowed by line 29: Allow: /community/photos/ (Detected as a directory; specific files may have different restrictions)
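That result is consistent with a crawler obeying the most specific User-agent group that names it and only falling back to a more general group when no specific one exists. The Python sketch below models that selection under a simplifying assumption: a group applies when its token is a prefix of the crawler name, and the longest such token wins. The function `select_group` and the `groups` dictionary are hypothetical helpers for illustration, not part of any Google tool.

```python
def select_group(crawler, groups):
    """Return the User-agent token of the group the crawler would obey.

    groups: dict mapping a User-agent token to its list of rules.
    Simplifying assumption: a token applies if it is a prefix of the
    crawler name (case-insensitive); the longest applicable token wins.
    """
    name = crawler.lower()
    matches = [ua for ua in groups if ua != "*" and name.startswith(ua.lower())]
    if matches:
        return max(matches, key=len)
    # Fall back to the wildcard group if one is defined.
    return "*" if "*" in groups else None


# The two groups from the robots.txt above.
groups = {
    "googlebot": [("Disallow", "/community/photos/")],
    "googlebot-Image": [("Allow", "/community/photos/")],
}

if __name__ == "__main__":
    for bot in ("Googlebot", "Googlebot-Image"):
        print(bot, "->", select_group(bot, groups))
```

If the googlebot-Image group were removed, the same function would hand Googlebot-Image the googlebot group, which is the fallback behavior discussed elsewhere in this thread.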

Thanks for your help!

Leona

Matt-Williamson

Hi Leona

Googlebot-Image and Google's other bots follow the rules set out for Googlebot, so blocking Googlebot would also block your images, because the Googlebot rule overrides Googlebot-Image. I don't think there is a way around this using the Disallow directive alone: you are blocking the directory that contains your images, so they won't be indexed.

Something you may want to consider is the Allow directive -

Disallow: /community/photos/
Allow: /community/photos/~username~/

That is, if Google is already indexing images under the username path.

The Allow directive only succeeds if its path contains as many characters as the Disallow path or more, so bear in mind that if you had the following:

Disallow: /community/photos/
Allow: /community/photos/

the Allow will win out and nothing will be blocked. Please note that I haven't used the Allow directive myself, but I looked into it in depth when I studied robots.txt for my own sites. It would be good to hear from someone who has hands-on experience with this directive. Hope this helps.
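The length rule described above can be sketched in a few lines of Python. This is only a simplified model, assuming plain prefix matching with no wildcards: the longest matching path wins, and Allow wins a tie. The function `is_allowed` and the test paths are hypothetical.

```python
def is_allowed(path, rules):
    """Decide whether a path may be crawled under one User-agent group.

    rules: list of (directive, path_prefix) pairs, e.g. ("Disallow", "/a/").
    Simplified model: prefix matching only, longest prefix wins,
    and Allow wins when an Allow and a Disallow are the same length.
    """
    best_len = -1
    allowed = True  # no matching rule means the path may be crawled
    for directive, prefix in rules:
        # Skip empty rules and rules that do not match this path.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


if __name__ == "__main__":
    narrow = [("Disallow", "/community/photos/"),
              ("Allow", "/community/photos/~username~/")]
    tie = [("Disallow", "/community/photos/"),
           ("Allow", "/community/photos/")]
    print(is_allowed("/community/photos/~username~/picture111111.aspx", narrow))
    print(is_allowed("/community/photos/someone-else/picture1.aspx", narrow))
    print(is_allowed("/community/photos/someone-else/picture1.aspx", tie))
```

Under this model the first rule set blocks everything in the folder except the longer Allow path, while the equal-length pair blocks nothing.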
