Robot.txt File Not Appearing, but seems to be working?
-
Hi Mozzers,
I am conducting a site audit for a client, and I am confused about what they are doing with their robot.txt file. GWT shows that there is a file and that it is blocking about 12K URLs (image attached). GWT also shows that the file was downloaded successfully 10 hours ago. However, when I go to the robot.txt file link, the page is blank.
Could they be doing something advanced to block URLs while hiding the file from users? It appears to be blocking log-ins correctly, but I would like to know for sure that it is working as intended. Any advice on this would be most appreciated. Thanks!
Jared
-
There is an old WebmasterWorld thread that explains how to hide the robots.txt file from browsers. I'm not sure why one would do this, however.
http://www.webmasterworld.com/forum93/74.htm
Perhaps they are doing something like this?
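If the site really is serving robots.txt differently depending on who asks, one rough way to check is to request the file with a browser-style User-Agent and a crawler-style one and compare the responses. The sketch below is only an illustration: the User-Agent strings and the helper names (`build_request`, `fetch_body`, `differs_by_user_agent`) are my own assumptions, not anything taken from the linked thread, and a server that identifies crawlers by IP address rather than by User-Agent would not be detected this way.

```python
import urllib.error
import urllib.request

# Illustrative User-Agent strings (assumptions); swap in whatever you want to compare.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}


def build_request(url, user_agent):
    """Build a GET request that sends the given User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_body(url, user_agent, timeout=10):
    """Return (status, body) for the URL; HTTP errors are returned, not raised."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""


def differs_by_user_agent(results):
    """results maps a label to (status, body); True if the responses are not all identical."""
    normalized = {(status, body.strip()) for status, body in results.values()}
    return len(normalized) > 1


# Example usage (performs real network requests, so it is left commented out):
# url = "https://www.example.com/robots.txt"  # hypothetical site
# results = {name: fetch_body(url, ua) for name, ua in USER_AGENTS.items()}
# print(differs_by_user_agent(results))
```

If the browser-style request returns an empty body while the crawler-style request returns rules, that would be consistent with the file being hidden from ordinary visitors.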
-
I verified that I was checking /robots.txt. I had trouble verifying if it was under the non-www because everything redirects to the www. I also checked to see if it was being blocked, and it is not.
I went to Archive.org (Wayback Machine), and I can see the robot.txt file in previous versions of the site. I cannot, however, view it online, even though Google says they are downloading it successfully, and the robots.txt file is successfully blocking URLs from the search index.
-
Be sure you are visiting /robots.txt. In all of your copy above, you are referencing robot.txt.
Also, check whether it is showing up only on the www version of the site or only on the non-www version.
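To make the www versus non-www check systematic, you could list every host and scheme combination and see where each one ends up after redirects. This is a minimal sketch under my own assumptions: `candidate_robots_urls` and `resolve` are hypothetical helper names, and `example.com` stands in for the client's domain.

```python
import urllib.request


def candidate_robots_urls(host):
    """Return the robots.txt URL for each scheme and www/non-www combination."""
    bare = host.lower().strip()
    if bare.startswith("www."):
        bare = bare[4:]
    return [
        f"{scheme}://{prefix}{bare}/robots.txt"
        for scheme in ("http", "https")
        for prefix in ("", "www.")
    ]


def resolve(url, timeout=10):
    """Follow redirects and return (final_url, status, body_length)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.geturl(), resp.status, len(resp.read())


# Example usage (performs real network requests, so it is left commented out):
# for url in candidate_robots_urls("example.com"):
#     print(url, "->", resolve(url))
```

A body length of zero on the final URL would match the blank page described in the question.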
To confirm that it's working, you can test URLs from your website within Google Webmaster Tools. Go to Crawl->Blocked URLs and scroll down to the bottom.
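If you can get hold of the rules (for example from the Archive.org copy), you can also test URLs locally with Python's standard `urllib.robotparser`. The rules and URLs below are made up for illustration, and `blocked_urls` is a hypothetical helper. Treat the result as a rough check only: the standard-library parser does simple prefix matching and is not guaranteed to agree with Google's handling in every case, such as wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; paste in the site's real robots.txt text.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /login
"""


def blocked_urls(rules_text, urls, user_agent="Googlebot"):
    """Return the subset of urls that the given rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


urls_to_test = [
    "https://www.example.com/login",
    "https://www.example.com/products",
]
print(blocked_urls(EXAMPLE_RULES, urls_to_test))
```

With the example rules above, only the `/login` URL is reported as blocked.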