Robots.txt question
-
What is this robots.txt telling the search engines?
User-agent: *
Disallow: /stats/
-
Oh, and it won't affect the domain negatively when you clean up your site directories via robots.txt. It's actually better, as I explained below.
-
Hey Mark,
It's good practice to disallow access to any folder or content you don't want indexed, as well as anything with security implications (logins, databases, etc.).
It also keeps the most important pages on the domain in front of the search spiders' eyes while keeping poor content out of the index. This helps the domain, at a site-authority level, provide valuable content and information to users.
Lower-ranking pages can cause the domain to be pulled down by search engines (Google and Bing have attested to this already), as they want businesses to focus on high-value content, which leads to a better user experience.
Cheers!
-
Thanks- wanted to make sure all was copacetic there. I'm assuming that it's good practice to disallow access to stats and won't impact the site negatively?
-
Assuming that this is the entire contents of this file: It says that no robot (search engine spider, other crawler, etc.) should visit or index anything in the /stats/ directory or any directories inside of it.
More info available here: http://www.robotstxt.org/robotstxt.html
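The reading above can be checked with a small sketch using Python's standard-library robots.txt parser. It assumes the two lines from the question are the entire file; the host www.example.com and the paths are made up for illustration, and real crawlers may apply their own matching rules.

```python
from urllib.robotparser import RobotFileParser

# Assumed to be the complete robots.txt from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /stats/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs; the host is a placeholder, not the asker's site.
urls = [
    "http://www.example.com/stats/",
    "http://www.example.com/stats/2011/report.html",
    "http://www.example.com/index.html",
]

for url in urls:
    # "User-agent: *" applies to every crawler, so the agent name is arbitrary.
    verdict = "allowed" if parser.can_fetch("Googlebot", url) else "blocked"
    print(f"{verdict}: {url}")
```

Under these assumptions, the two URLs under /stats/ come back blocked and the page outside it comes back allowed, whichever user-agent name is passed in.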
Related Questions
-
My Last Question Regarding URLs - I Promise...
Hello, I recently asked the community which URLs would be best for a company with a variety of wood flooring products. This question relates to "keywords" within the URL, which relate to each and every product. Which would you choose: 1. a or b? 2. a or b?
1. Product: CIRO
a. www.thewoodgalleries.co.uk/engineered-flooring/rustic-oak-ciro - Keyword Match, YES. "Rustic Oak Flooring"
b. www.thewoodgalleries.co.uk/engineered-flooring/ciro - Keyword Match, NO. "Rustic Oak Flooring"
2. Product: VOGUE
a. www.thewoodgalleries.co.uk/engineered-flooring/prefinished-oak-vogue - Keyword Match, YES. "Prefinished Oak Flooring"
b. www.thewoodgalleries.co.uk/engineered-flooring/vogue - Keyword Match, NO. "Prefinished Oak Flooring"
Although this is seemingly a basic part of SEO, I find myself revisiting this question time and time again: what is really better for SEO? Shorter URLs, or "slightly" longer ones to achieve a keyword match? After researching many of the keywords we have chosen to use as part of this project, it seems that to have any chance of ranking on the first page, the keyword (or part of the keyword) must appear within the URL. I would like to get some "extra" clarification. Thanks for your help!
Technical SEO | | GaryVictory0 -
Site Penalized - 301 Redirect Question
Hello, We have a website that was penalized by Google roughly two years ago for "Unnatural Links". We are experiencing a lot of problems with this site, completely unrelated to the penalty or SERPs, and we're debating doing a 301 redirect to another site we own that is totally clean and has no "Unnatural Links". If we do a 301 from the penalized site to our alternative website, will there be any cross-contamination? Will the penalty carry over to our other site? Please let me know what you guys think. Thanks
Technical SEO | | Prime850 -
Easy Question: regarding no index meta tag vs robot.txt
This seems like a dumb question, but I'm not sure what the answer is. I have an ecommerce client who has a couple of subdirectories, "gallery" and "blog". Neither directory gets a lot of traffic or really turns into many conversions, so I want to remove the pages so they don't drain my page rank from more important pages. Does this sound like a good idea? I was thinking of either disallowing the folders via the robots.txt file, adding a "noindex" tag, 301 redirecting them, or deleting them. Can you help me determine which is best?
DEINDEX: As I understand it, the noindex meta tag still allows the robots to crawl the pages, but they won't be indexed. The supposed good news is that it still allows link juice to be passed through. This seems like a bad thing to me because I don't want to waste my link juice passing to these pages. The idea is to keep my page rank from being diluted on these pages. A similar question: if page rank is finite, does Google still treat these pages as part of the site even if it's not indexing them? If I do deindex these pages, I think there are quite a few internal links to them. Even though these pages are deindexed, they still exist, so it's not as if the site would return a 404, right?
ROBOTS.TXT: As I understand it, this will keep the robots from crawling the pages, so they won't be indexed and the link juice won't pass. I don't want to waste the page rank that links to these pages, so is this a bad option?
301 REDIRECT: What if I just 301 redirect all these pages back to the homepage? Is this an easy answer? Part of the problem with this solution is that I'm not sure if it's permanent, but even more important is that currently 80% of the site is made up of blog and gallery pages, and I think it would be strange to have the vast majority of the site 301 redirecting to the home page. What do you think?
DELETE PAGES: Maybe I could just delete all the pages. This will keep the pages from taking link juice and will deindex them, but I think there are quite a few internal links to these pages. How would you find all the internal links that point to these pages? There are hundreds of them.
Technical SEO | | Santaur0 -
Will an XML sitemap override a robots.txt
I have a client that has a robots.txt file that is blocking an entire subdomain, entirely by accident. Their original solution, not realizing the robots.txt error, was to submit an XML sitemap to get their pages indexed. I did not think this tactic would work, as the robots.txt would take precedence over the XML sitemap. But it worked... I have no explanation as to how or why. Does anyone have an answer to this, or any experience with a website that has had a clear Disallow: / for months and somehow has pages in the index?
Technical SEO | | KCBackofen0 -
Questions about root domain setup
Hi There, I'm a recent addition to SEOmoz, and over the past few weeks I've been trying to figure things out. This whole SEO process has been a bit of a brain burner, but it's slowly becoming a little clearer. For a while I noticed that I was unable to get Open Site Explorer to display information about my site. It mentioned that there was not enough data for the URL: too recent a site, no links, etc. Eventually I changed the URL to include "www." and it pulled up results. I also noticed that a few of my page warnings are because of duplicate page content. One page will be listed as http://enbphotos.com. The other will be listed as http://www.enbphotos.com. I guess I'm not sure what this all means and how to change it. I'm also not really sure what the terminology even is; something regarding root domain seemed appropriate, but I'm not sure if it is accurate. Any help/suggestions/links would be appreciated! Thanks, Chris
Technical SEO | | enbphotos0 -
Question about an older more obsolete site
I have a website that I don't use much anymore, but it ranks on the first page for one of my main keywords. I am using a few other websites in different niches right now that are doing better and are more functional. It may cost around 1,300 or so to get the website that I don't use anymore to look and function in the new ways of the internet. Would you suggest that I do a site redesign (which is more difficult, because to make the site do what I want it needs to be moved off its WordPress theme) or 301 redirect the site to another one of my sites? Would it make sense to do a 301? The domain is 5 years old but doesn't bring in any leads anymore, because it would take a redesign for that to happen. How can I still benefit from the SEO that I have done on that site? Thanks, and sorry if this message is hard to follow. If I need to clear anything up, please let me know.
Technical SEO | | blake-766240 -
Confused about robots.txt
There is a lot of conflicting and/or unclear information about robots.txt out there. Somehow, I can't work out the best way to use robots.txt, even after visiting the official robots website. For example, I have the following format for my robots.txt:
User-agent: *
Disallow: javascript.js
Disallow: /images/
Disallow: /embedconfig
Disallow: /playerconfig
Disallow: /spotlightmedia
Disallow: /EventVideos
Disallow: /playEpisode
Allow: /
Sitemap: http://www.example.tv/sitemapindex.xml
Sitemap: http://www.example.tv/sitemapindex-videos.xml
Sitemap: http://www.example.tv/news-sitemap.xml
Is this correct and/or recommended? If so, then how come I see a list of over 200 or so links blocked by robots when I'm checking Google Webmaster Tools? Help, someone, anyone! I can't seem to understand this robotic business! Regards,
Technical SEO | | Netpace0 -
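One way to see what the robots.txt in the "Confused about robots.txt" question actually blocks is to feed it to a parser and test a few paths. The sketch below uses Python's standard-library parser; the sample paths are invented for illustration, and this only shows how that one parser reads the file. Search engines use their own matching logic and may differ in edge cases.

```python
from urllib.robotparser import RobotFileParser

# The file from the question, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: javascript.js
Disallow: /images/
Disallow: /embedconfig
Disallow: /playerconfig
Disallow: /spotlightmedia
Disallow: /EventVideos
Disallow: /playEpisode
Allow: /
Sitemap: http://www.example.tv/sitemapindex.xml
Sitemap: http://www.example.tv/sitemapindex-videos.xml
Sitemap: http://www.example.tv/news-sitemap.xml
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths chosen to exercise the rules above.
sample_paths = [
    "/images/logo.png",
    "/playEpisode/123",
    "/javascript.js",
    "/news/today",
]

# Map each path to True (crawlable) or False (blocked) for a generic crawler.
results = {
    path: rules.can_fetch("*", "http://www.example.tv" + path)
    for path in sample_paths
}

for path, allowed in results.items():
    print(("allowed" if allowed else "blocked") + ": " + path)
```

In this parser the prefix rules block everything under /images/ and every URL starting with /playEpisode, which could plausibly account for a long list of blocked links. The line "Disallow: javascript.js" never matches here because URL paths begin with a slash; writing it as "/javascript.js" would be the conventional form.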
Home Page Canonical Question
I have an online store through the hosting service Volusion. I have asked them about this and was told that this is normal. I would like to confirm this with you guys because I'm not convinced of the quality of their customer service and I'm not an expert. When I check Analytics, the landing page that is visited most often is www....../default.asp and the second most visited is www........./ . These are, of course, both my home page. Volusion has a radio button that allows the admin to "enable canonical links", which I have enabled, and they told me that it is normal to see this in Google Analytics regardless. When I type in either of those addresses, the homepage comes up under the address that I typed. In other words, it doesn't redirect so that the address is always the same. Am I right to be concerned about this?
Technical SEO | | berglin0