Robots.txt question
-
What is this robots.txt telling the search engines?
User-agent: * Disallow: /stats/
-
Oh - and it's affect the domain negatively.. when cleaning up your site directories via robots.txt. Its actually better as I explained below
-
Hey Mark,
It's good practice to disallow access to any folder/content you don't want indexed as well as anything that has any security involved (login's, databases etc).
It will also keep the most important pages from the domain in front of the search spiders eyes, while keeping poor content out of the indes. This helps the domain on a site authority level provide valuable content and information to users.
Lower ranking pages, can cause the domain to be pulled down by serarch engines (Google and Bing have attested to this already) as they want businesses to focus on high value content - which leads to better user experience.
Cheers!
-
Thanks- wanted to make sure all was copacetic there. I'm assuming that it's good practice to disallow access to stats and won't impact the site negatively?
-
Assuming that this is the entire contents of this file: It says that no robot (search engine spider, other crawler, etc.) should visit or index anything in the /stats/ directory or any directories inside of it.
More info available here: http://www.robotstxt.org/robotstxt.html
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Questions about the DA,PA of website
I am counting on some more ads on the site https://gogoanime.city/, is it a problem if I add some ads about sex, porn ..., to make a little more money. So does it affect the PA DA score. Thank you!
Technical SEO | | gogoanimetp0 -
Should I add my html sitemap to Robots?
I have already added the .xml to Robots. But should I also add the html version?
Technical SEO | | Trazo0 -
Site Migration Questions
Hello everyone, We are in the process of going from a .net to a .com and we have also done a complete site redesign as well as refreshed all of our content. I know it is generally ideal to not do all of this at once but I have no control over that part. I have a few questions and would like any input on avoiding losing rankings and traffic. One of my first concerns is that we have done away with some of our higher ranking pages and combined them into one parallax scrolling page. Basically, instead of having a product page for each product they are now all on one page. This of course has made some difficulty because search terms we were using for the individual pages no longer apply. My next concern is that we are adding keywords to the ends of our urls in attempt to raise rankings. So an example: website.com/product/product-name/keywords-for-product if a customer deletes keywords-for-product they end up being re-directed back to the page again. Since the keywords cannot be removed is a redirect the best way to handle this? Would a canonical tag be better? I'm trying to avoid duplicate content since my request to remove the keywords in urls was denied. Also when a customer deletes everything but website.com/product/ it goes to the home page and the url turns to website.com/product/#. Will those pages with # at the end be indexed separately or does google ignore that? Lastly, how can I determine what kind of loss in traffic we are looking at upon launch? I know some is to be expected but I want to avoid it as much as I can so any advice for this migration would be greatly appreciated.
Technical SEO | | Sika220 -
Googlebot does not obey robots.txt disallow
Hi Mozzers! We are trying to get Googlebot to steer away from our internal search results pages by adding a parameter "nocrawl=1" to facet/filter links and then robots.txt disallow all URLs containing that parameter. We implemented this late august and since that, the GWMT message "Googlebot found an extremely high number of URLs on your site", stopped coming. But today we received yet another. The weird thing is that Google gives many of our nowadays robots.txt disallowed URLs as examples of URLs that may cause us problems. What could be the reason? Best regards, Martin
Technical SEO | | TalkInThePark0 -
Google Knowledge Graph related question
I have a client who is facing age discrimination in the film industry. (Big surprise there.) The problem is, when you type in his name, Google's new Knowledge Graph displays a brief bio about him to the right of the search results. This bio snippet includes his year of birth. Wikipedia is credited as the source for the bio information about him, and yet, his Wikipedia entry doesn't include his age or birth date. Neither does his iMDb bio. So the question is, How can he figure out where Google is getting that birthdate from? He wants to try and remove it, not falsify it. Thanks for any help you can offer.
Technical SEO | | JamesAMartin0 -
Allow or Disallow First in Robots.txt
If I want to override a Disallow directive in robots.txt with an Allow command, do I have the Allow command before or after the Disallow command? example: Allow: /models/ford///page* Disallow: /models////page
Technical SEO | | irvingw0 -
Robots.txt versus sitemap
Hi everyone, Lets say we have a robots.txt that disallows specific folders on our website, but a sitemap submitted in Google Webmaster Tools that lists content in those folders. Who wins? Will the sitemap content get indexed even if it's blocked by robots.txt? I know content that is blocked by robot.txt can still get indexed and display a URL if Google discovers it via a link so I'm wondering if that would happen in this scenario too. Thanks!
Technical SEO | | anthematic0 -
What are your thoughts on security of placing CMS-related folders in a robots.txt file?
So I was just about to add a whole heap of CMS-related folders to my robots.txt file to exclude them from search, and thought "hey, I'm publicly telling people where my admin folders are"...surely that's not right?! Should I leave them out of the robots.txt file, and hope for the best that they never get indexed? Should I use noindex meta data on every page? What are people's thoughts? Thanks, James PS. I know this is similar to lots of other discussions around meta noindex vs. robots.txt, but I'm after specific thoughts around the security aspect of listing your admin folders in a robots.txt file...
Technical SEO | | James-Distinction0