Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search
-
Hi!
I have pages within my forum where visitors can upload photos. When they upload photos they provide a simple statement about the photo but no real information about the image,definitely not enough for the page to be deemed worthy of being indexed. The industry however is one that really leans on images and having the images in Google Image search is important to us.
The url structure is like such: domain.com/community/photos/~username~/picture111111.aspx
I wish to block the whole folder from Googlebot to prevent these low quality pages from being added to Google's main SERP results. This would be something like this:
User-agent: googlebot
Disallow: /community/photos/
Can I disallow Googlebot specifically rather than just using User-agent: * which would then allow googlebot-image to pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to assist in visibility, but the actual act of blocking the pages and getting the images picked up... Is this possible?
Thanks!
Leona
-
Are you seeing the images getting indexed, though? Even if GWT recognize the Robots.txt directives, blocking the pages may essentially keep the images from having any ranking value. Like Matt, I'm not sure this will work in practice.
Another option would be to create an alternate path to just the images, like an HTML sitemap with just links to those images and decent anchor text. The ranking power still wouldn't be great (you'd have a lot of links on this page, most likely), but it would at least kick the crawlers a bit.
-
Thanks Matt for your time and assistance! Leona
-
Hi Leona - what you have done is something along the lines of what I thought would work for you - sorry if I wasn't clear in my original response - I thought you meant if you created a robots.txt and specified Googlebot to be disallowed then Googlebot-image would pick up the photos still and as I said this wouldn't be the case as it Googlebot-image will follow what it set out for Googlebot unless you specify otherwise using the allow directive as I mentioned. Glad it has worked for you - keep us posted on your results.
-
Hi Matt,
Thanks for your feedback!
It is not my belief that Googlebot overwrides googlebot-images otherwise specifying something for a specific bot of Google's wouldn't work, correct?
I setup the following:
User-agent: googlebot
Disallow: /community/photos/
User-agent: googlebot-Image
Allow: /community/photos/
I tested the results in Google Webmaster Tools which indicated:
Googlebot: Blocked by line 26: Disallow: /community/photos/Detected as a directory; specific files may have different restrictions
Googlebot-Image: Allowed by line 29: Allow: /community/photos/Detected as a directory; specific files may have different restrictions
Thanks for your help!
Leona
-
Hi Leona
Googlebot-image and any of the other bots that Google uses follow the rules set out for Googlebot so blocking Googlebot would block your images as it overrides Googlebot-image. I don't think that there is a way around this using the disallow directive as you are blocking the directory which contains your images so they won't be indexed using specific images.
Something you may want to consider is the Allow directive -
Disallow: /community/photos/
Allow: /community/photos/~username~/
that is if Google is already indexing images under the username path?
The allow directive will only be successful if it contains more or equal number of characters as the disallow path, so bare in mind that if you had the following;
Disallow: /community/photos/
Allow: /community/photos/
the allow will win out and nothing will be blocked. please note that i haven't actioned the allow directive myself but looked into it in depth when i studied the robots.txt for my own sites it would be good if someone else had an experience of this directive. Hope this helps.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Is it ok to repeat a (focus) keyword used on a previous page, on a new page?
I am cataloguing the pages on our website in terms of which focus keyword has been used with the page. I've noticed that some pages repeated the same keyword / term. I've heard that it's not really good practice, as it's like telling google conflicting information, as the pages with the same keywords will be competing against each other. Is this correct information? If so, is the alternative to use various long-winded keywords instead? If not, meaning it's ok to repeat the keyword on different pages, is there a maximum recommended number of times that we want to repeat the word? Still new-ish to SEO, so any help is much appreciated! V.
Intermediate & Advanced SEO | | Vitzz1 -
Should I be using meta robots tags on thank you pages with little content?
I'm working on a website with hundreds of thank you pages, does it make sense to no follow, no index these pages since there's little content on them? I'm thinking this should save me some crawl budget overall but is there any risk in cutting out the internal links found on the thank you pages? (These are only standard site-wide footer and navigation links.) Thanks!
Intermediate & Advanced SEO | | GSO0 -
Search box within search results question
I work for a Theater news website. We have two sister sites, theatermania.com in the US and whatsonstage.com in London. Both sites have largely the same codebase and page layouts. We've implemented markup that allows google to show a search box for our site in its results page. For some reason, the search box is showing for one site but not the other: http://screencast.com/t/CSA62NT8 We're scratching our heads. Does anyone have any ideas?
Intermediate & Advanced SEO | | TheaterMania0 -
To index or not to index search pages - (Panda related)
Hi Mozzers I have a WordPress site with Relevanssi the search engine plugin, free version. Questions: Should I let Google index my site's SERPS? I am scared the page quality is to thin, and then Panda bear will get angry. This plugin (or my previous search engine plugin) created many of these "no-results" uris: /?s=no-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Ano-results%3Akids+wall&cat=no-results&pg=6 I have added a robots.txt rule to disallow these pages and did a GWT URL removal request. But links to these pages are still being displayed in Google's SERPS under "repeat the search with the omitted results included" results. So will this affect me negatively or are these results harmless? What exactly is an omitted result? As I understand it is that Google found a link to a page they but can't display it because I block GoogleBot. Thanx in advance guys.
Intermediate & Advanced SEO | | ClassifiedsKing0 -
Top Google News Result = Search Result #11 (first on second page)
Hey all, I've noticed that, in most cases, when we have an article that gets the top spot in Google News results for a given keyword, the search result for that same article will appear in position #11 (the first result on the second page for standard SERP viewing). This is nearly always the case, which suggests its built into Google's algorithm to prevent overlap. Has anyone else experienced this? I haven't seen it discussed previously on Moz or other SEO forums, but it makes sense. Or if you haven't experienced this, I'd love to hear about what you're seeing.
Intermediate & Advanced SEO | | dangaul0 -
How can we optimize content specific to particular tabs, but is loaded on one page?
Hi, Our website generates stock reports. Within those reports, we organize information into particular tabs. The entire report is loaded on one page and javascript is used to hide and show the different tabs. This makes it difficult for us to optimize the information on each particular tab. We're thinking about creating separate pages for each tab, but we're worried about affecting the user experience. We'd like to create separate pages for each tab, put links to them at the bottom of the reports, and still have the reports operate as they do today. Can we do this without getting in trouble with Google for having duplicate content? If not, is there another solution to this problem that we're not seeing? Here's a sample report: http://www.vuru.co/analysis/aapl In advance, thanks for your help!
Intermediate & Advanced SEO | | yosephwest0 -
Is there an search marketing / keyword tool in existence that can solve my need?
I'm looking for a tool that can do the following:
Intermediate & Advanced SEO | | PTC4SEO
Organize a keyword universe and its data/metrics:
-track keyword data over time (search volumes/trends, relative competition metrics, rankings,
etc)
-Sort keywords into buckets/silos/ad groups
-allow you to assign individual keywords to multiple silos/groups and show the relationships between groups based on keyword relationships.
-incorporate a site map
-tie keyword targets to static pages, informational content (SEO) & landing pages (PPC)
-help with KW and/or competitive research (optional)
-tie into web analytics / marketing on-demand software (optional) I know that this is a lot of functionality, but for enterprise search marketing, this could be a game changer for my strategy (if it exists currently) or for the industry (if it doesn't exist). Please share you solution suggestions here...0 -
Do in page links pointing to the parent page make the page more relevant for that term?
Here's a technical question. Suppose I have a page relevant to the term "Mobile Phones". I have a piece of text, on that page talking about "mobile phones", and within that text is the term "cell phones". Now if I link the text "cell phones", to the page it is already placed on (ie the parent page) - will the page gain more relevancy for the term "cell phones"?? Thanks
Intermediate & Advanced SEO | | James770