Dilemma about "images" folder in robots.txt
-
Hi, Hope you're doing well.
I am sure, you guys must be aware that Google has updated their webmaster technical guidelines saying that users should allow access to their css files and java-scripts file if it's possible. Used to be that Google would render the web pages only text based. Now it claims that it can read the css and java-scripts. According to their own terms, not allowing access to the css files can result in sub-optimal rankings. "Disallowing crawling of Javascript or CSS files in your site’s robots.txt directly harms how well our algorithms render and index your content and can result in suboptimal rankings."http://googlewebmastercentral.blogspot.com/2014/10/updating-our-technical-webmaster.htmlWe have allowed access to our CSS files. and Google bot, is seeing our webapges more like a normal user would do. (tested it in GWT)Anyhow, this is my dilemma. I am sure lot of other users might be facing the same situation. Like any other e commerce companies/websites.. we have lot of images. Used to be that our css files were inside our images folder, so I have allowed access to that. Here's the robots.txt --> http://www.modbargains.com/robots.txtRight now we are blocking images folder, as it is very huge, very heavy, and some of the images are very high res. The reason we are blocking that is because we feel that Google bot might spend almost all of its time trying to crawl that "images" folder only, that it might not have enough time to crawl other important pages. Not to mention, a very heavy server load on Google's and ours. we do have good high quality original pictures. We feel that we are losing potential rankings since we are blocking images. I was thinking to allow ONLY google-image bot, access to it. But I still feel that google might spend lot of time doing that. **I was wondering if Google makes a decision saying, hey let me spend 10 minutes for google image bot, and let me spend 20 minutes for google-mobile bot etc.. or something like that.. , or does it have separate "time spending" allocations for all of it's bot types. I want to unblock the images folder, for now only the google image bot, but at the same time, I fear that it might drastically hamper indexing of our important pages, as I mentioned before, because of having tons & tons of images, and Google spending enough time already just to crawl that folder.**Any advice? recommendations? suggestions? technical guidance? Plan of action? Pretty sure I answered my own question, but I need a confirmation from an Expert, if I am right, saying that allow only Google image access to my images folder. Sincerely,Shaleen Shah
-
Yup my images send me traffic from Google images on most of my sites and attractive images attract hotlinks as well. At the moment people are hosting their images on a different domain (cdn) and are still being credited with the images but I haven't tried to do that myself ie I don't know if they've set some "ownership" somewhere and somehow.
-
I recommend allowing Google to crawl those images. Google optimizes its crawl rate and once it has done a complete crawl it will understand how often to crawl certain areas of your site. My main concern would be that you are losing potential rankings and indexing from those images - if they are unique and high quality you definitely want them to index the images, understand the file names, and appropriately index them.
I wouldn't be concerned about Google bot eating up your server resources. If it does become a problem, then you can go back and adjust the bot access through the robots.txt, as you've done already. However, I would let them in first and only react if it becomes a problem.
I have tens of thousands of product images accessed by the google bot and it is no concern to my ecommerce company and the server resources. I'm not saying that it can't be a potential problem, but the benefit outweighs the risk of it being one - I choose a reactive stance in this situation.
Closely monitor your Google Webmaster Tools account, watch the crawl rate and statistics, and if it becomes an issue then decide on which image folders should or shouldn't be indexed.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Block session id URLs with robots.txt
Hi, I would like to block all URLs with the parameter '?filter=' from being crawled by including them in the robots.txt. Which directive should I use: User-agent: *
Intermediate & Advanced SEO | | Mat_C
Disallow: ?filter= or User-agent: *
Disallow: /?filter= In other words, is the forward slash in the beginning of the disallow directive necessary? Thanks!1 -
Image Audit: Getting a list of *ALL* Images on a Site?
Hello! We are doing an image optimization audit, and are therefore trying to find a way to get a list of all images on a site. Screaming Frog seems like a great place to start (as per this helpful article: https://moz.com/ugc/how-to-perform-an-image-optimization-audit), but unfortunately, it doesn't include images in CSS. 😞 Does the community have any ideas for how we try to otherwise get list of images? Thanks in advance for any tips/advice.
Intermediate & Advanced SEO | | mirabile0 -
"sex" in non-adult domain name
I have a client with a domain that has "sex" in the domain name. For example, electronicsexpo.com. The domain ranks for a few keywords related to the services offered. It is an old domain that has been online for over 10 years. It ranks well for local keywords. No real SEO effort has been made on this domain, so it is rather a clean slate. I am going to be doing SEO on this site. Will the fact that the word "sex" exists in the name have any sort of negative consequence. There is ABSOLUTELY NOTHING adult related or pornographic on this site. I would think that search engines are sophisticated enough to differentiate, but would potential customers with things like parental filters be blocked from viewing content? Is this hurtful in anyway? If so, would I be better off changing domain names? TIA
Intermediate & Advanced SEO | | inhouseseo0 -
Where the "fudge-nuggets" are my internal links?
Ok, so... Google Webmaster Tools Internal Links are not showing any links to my site's homepage. I only link to the homepage by wrapping the logo with the link throughout the site. Does Google need these to be text links to show them? [/](<a class=)" title="Kona Coffee">![](<a class=)http://1s93mbet6ccj5zkm31703gqj8.wpengine.netdna-cdn.com/wp-content/uploads/kona-coffee-1.png" alt="Kona Coffee"/> Site is here:
Intermediate & Advanced SEO | | AhlerManagement
http://goo.gl/4C8GKc Could CDN image source be affecting it? Lost... please help!0 -
How much does "overall site semantic theme" influence rankings?
OK. I've optimized sites before that are dedicated to 1, 2 or 3 products and or services. These sites inherently talk about one main thing - so the semantics of the content across the whole site reflect this. I get these ranked well on a local level. Now, take an e-commerce site - which I am working on - 2000 products, all of which are quite varied - cookware, diningware, art, decor, outdoor, appliances... there is a lot of different semantics throughout the site's different pages. Does this influence the ranking possibilities? Your opinion and time is appreciated. Thanks in advance.
Intermediate & Advanced SEO | | bjs20100 -
How use Rel="canonical" for our Website
How is the best way to use Rel="canonical" for our website www.ofertasdeemail.com.br, for we can say goodbye for duplicated pages? I appreciate for every help. I also hope to contribute to the SEOmoz community. Sincerely,
Intermediate & Advanced SEO | | ZZNINTERNETMEDIAGROUP
Amador Goncalves0 -
Fluctuating Rankings on "difficult" keywords
Hi guys, I have a client who wants to rank well for two very "difficult" keywords and eight easier ones. The easy ones are "treadmills + city" and the difficult ones are "treadmills" and "treadmill". We have got great traction on the "+city" keywords and he now ranks on page one for all those. However, we have noticed that although he ranks on page 2-3 for "treadmill" treadmills", those rankings fluctuate widely day to day. Rankings for the "+city" versions are stable, a rising slowly as I would expect. Rankings for the difficult keywords can be 235 one day, 32 the next week, 218 the day after that, then stable at 30ish for a week, then fluctuation again. I know Google update every day, but what are the likely causes of the easier keywords being stable, while the harder ones fluctuate? Thanks.
Intermediate & Advanced SEO | | stevedeane0 -
Rel="prev" and view all question
Okay, I've read the posts by Google about the new prev, next tags and the suggestion for using a view all option. I've also read the posts here on SEOMoz on the topic but none of them quite address what we have. First, Some of our main categories are very large (over 6000 pieces of jewelry) so a view all option would take forever to load be completely useless to a visitor. Second, our category home pages provide (here's an example😞 A description of the category with links to important sections and articles A row of new items A dozen of the popular items from the category. Links to related articles if applicable. So we have a real category home page with content instead of just categories that start immediately with pages of product. Should we set the canonical url for all of the browse pages to the main category page, create a view all page or just use the next and previous rel tags with the category home pages as the first in the series?
Intermediate & Advanced SEO | | IanTheScot0