Is our robots.txt file correct?
-
Could you please review our robots.txt file and let me know if it is correct?
Thank you!
-
What's the end goal here?
Are you actively trying to block all bots? If so, I would still suggest "Disallow: /".
The other syntax may also work, but if Google suggests using the slash, you should probably use it.
-
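For reference, the file that asks all compliant crawlers to stay off the entire site is just two lines (note that robots.txt is a request, not enforcement; badly behaved bots can ignore it):

```text
User-agent: *
Disallow: /
```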
Hi, it seems correct to me, but try the robots.txt checker tool in Google Webmaster Tools. You can enter a couple of your URLs and see whether Google can crawl them.
The only rule I find redundant is the following:
User-agent: Mediapartners-Google
If you have already set up a disallow rule covering all bots, with only rogerbot blocked from the community folder, why create a new rule stating the same thing for Mediapartners-Google?
Likewise, why tell all bots they can access the entire site when that is already the default behavior? Drop those lines, keep just the rogerbot rule and the Sitemap line, and you're done.
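Putting that advice together, a minimal file for this goal could look like the following (using the sitemap URL mentioned later in this thread; crawlers not matched by a more specific group fall back to their default allow-all behavior):

```text
User-agent: rogerbot
Disallow: /community/

Sitemap: http://www.faithology.com/sitemap.xml
```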
-
Thank you for the reply. We want to allow all crawling, except for rogerbot in the community folder.
I have updated the robots.txt to the following. Does this look right?

User-agent: *
Disallow:

User-agent: rogerbot
Disallow: /community/

User-agent: Mediapartners-Google
Disallow:

Sitemap: http://www.faithology.com/sitemap.xml

You can view the robots.txt here: http://www.faithology.com/robots.txt
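As a quick sanity check, Python's standard-library robots.txt parser can confirm what rules like these allow and block. This is just a sketch; the non-community URLs are made-up examples:

```python
from urllib.robotparser import RobotFileParser

# The rules under discussion: everything allowed, except
# rogerbot is kept out of /community/.
rules = """\
User-agent: *
Disallow:

User-agent: rogerbot
Disallow: /community/

User-agent: Mediapartners-Google
Disallow:
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# rogerbot is blocked from the community folder but nothing else
print(rp.can_fetch("rogerbot", "http://www.faithology.com/community/"))  # False
print(rp.can_fetch("rogerbot", "http://www.faithology.com/"))            # True

# every other crawler falls into the "*" group and is allowed everywhere
print(rp.can_fetch("Googlebot", "http://www.faithology.com/community/"))  # True
```

An empty `Disallow:` line means "nothing is disallowed," which is why the `*` and Mediapartners-Google groups permit everything.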
-
There are some errors, but since I'm not sure what you are trying to accomplish, I recommend checking it with a tool first. Here is a great tool to check your robots.txt file and give you information on errors - http://tool.motoricerca.info/robots-checker.phtml
If you still need assistance after running it through the tool, please reply and we can help you further.