Best way to create robots.txt for my website
-
How can I create a robots.txt file for my website, guitarcontrol.com?
It has a user login and guitar lessons.
-
Hi,
First you need to understand what your website needs: decide which parts of it should not be crawled by search engine bots. Your website provides a user login and user areas, so if you offer a private dashboard for your users, it should be blocked in robots.txt. Alternatively, you can use a robots meta tag on a particular page to keep that page out of the index; in that case the page must stay crawlable, because a bot can only see the tag on a page it is allowed to fetch. You can learn more about robots.txt here: https://moz.com/learn/seo/robotstxt
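As a rough sketch of that idea, assuming the login page and the private dashboard live under `/login/` and `/dashboard/` (hypothetical paths, the real ones on your site may differ), you could check a draft set of rules with Python's standard `urllib.robotparser` before uploading them:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules: /login/ and /dashboard/ are assumed paths,
# not taken from the real site.
RULES = """\
User-agent: *
Disallow: /login/
Disallow: /dashboard/
"""


def is_allowed(rules: str, url: str, agent: str = "*") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # /lessons/ is also just an example path for a public section.
    for path in ("/", "/lessons/", "/login/", "/dashboard/settings"):
        url = "https://guitarcontrol.com" + path
        print(path, "allowed" if is_allowed(RULES, url) else "blocked")
```

Keep in mind that robots.txt only asks well-behaved bots not to crawl; it is not access control, so the dashboard still needs a real login in front of it.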
Hope it helps
-
I see that you're on WordPress.
This CMS creates a "virtual" robots.txt. You can read about it here:
https://codex.wordpress.org/Search_Engine_Optimization_for_WordPress#Robots.txt_Optimization
But on your website there is an error with robots.txt, and you should check the web server log files (access and error) to see why this is happening. You may also need to look at .htaccess, because something is preventing this text file from being accessed.
There is an alternative way to use robots.txt in WordPress. All you need to do is create a new, blank robots.txt in your site's root folder and put this in it:
User-agent: *
Disallow:
Then save the file and that's all. Now the bad news: because a physical file overrides the virtual one, WP can no longer control crawling through robots.txt.
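To illustrate what those two lines do: an empty `Disallow:` value blocks nothing, so every crawler may fetch every URL. Here is a minimal sketch using Python's standard `urllib.robotparser`; the `Googlebot` agent name, the `/wp-admin/` path, and the `BLOCK_ALL` comparison are example inputs added for illustration, not something taken from the site:

```python
from urllib.robotparser import RobotFileParser

# The two-line file from the answer above: an empty Disallow allows everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""

# For contrast only: a single slash blocks the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def can_crawl(rules: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may crawl `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for name, rules in (("ALLOW_ALL", ALLOW_ALL), ("BLOCK_ALL", BLOCK_ALL)):
        for path in ("/", "/wp-admin/"):
            verdict = can_crawl(rules, "https://guitarcontrol.com" + path)
            print(name, path, "allowed" if verdict else "blocked")
```

The difference between `Disallow:` and `Disallow: /` is a single character, so it is worth double-checking the file after saving it.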