Is my robots.txt file working?
-
Greetings from medieval York UK
Everytime to you enter my name & Liz this page is returned in Google:
http://www.davidclick.com/web_page/al_liz.htmBut i have the following robots txt file which has been in place a few weeks
User-agent: * Disallow: /york_wedding_photographer_advice_pre_wedding_photoshoot.htm Disallow: /york_wedding_photographer_advice.htm Disallow: /york_wedding_photographer_advice_copyright_free_wedding_photography.htm Disallow: /web_page/prices.htm Disallow: /web_page/about_me.htm Disallow: /web_page/thumbnails4.htm Disallow: /web_page/thumbnails.html Disallow: /web_page/al_liz.htm Disallow: /web_page/york_wedding_photographer_advice.htm Allow: /
So my question is please...
"Why is this page appearing in the SERPS when its blocked in the robots txt file e.g.: Disallow: /web_page/al_liz.htm"
ANy insights welcome
-
Glad we could help
Fredrik
PS Dont forget to mark as answered
-
Brill answers guys thanks
-
Nightwing
Frederick gives some good pointers and here is a little trick to try: Fetch as Google from GWMT
- On the Webmaster Tools Home page, click the site you want.
- On the Dashboard, under Health, click Fetch as Google.
- In the text box, type the path to the page you want to check.
- In the dropdown list, select the type of fetch you want. To see what our web crawler Googlebot sees, select Web. To see what our mobile crawler Googlebot-Mobile sees, select cHTML (this is used mainly for Japanese web sites) or Mobile XHTML/WML.
- Click Fetch.
This will likely give you a quick re index and you will know whassup...
Best,
Robert
-
Hi David
How long have you had the robots.txt file? Preventeing Google from indexing the page would not automatically remove it if its already indexed. That would take some time.
You could try using the removal tool:
https://www.google.com/webmasters/tools/removals
If its urgent you could check the header and do a 301 redirect if the user comes from Google. But I think it should sort itself out within not too long.
Fredrik
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Unsolved Temporary redirect from 302 to 301 for PNG File?
#302HTTP #temporaryredirect
Technical SEO | | Damian_Ed 0
Hi everyone, Recently I have faced a crawl issue with my media images on website. For example this page url https://intreface.com/wp-content/uploads/2022/12/Horion-screen-side-2.png has 302 HTTP Status and the recommendation is to change it 301. I have read the article on temporary redirections here:
https://moz.com/learn/seo/redirection?_ga=2.45324708.1293586627.1702571936-916254120.1702571936
but its not written here how to redirect in my HTML 1 image url not the landing page.
Screenshot 2023-12-15 at 11.02.40.png
I have messaged to MOZ Support but they recommended to go for the MOZ Community!
Screenshot 2023-12-15 at 11.06.02.png Could you assist me wit this issue please? I can reach HTTML of the necessary page and change what I need for permanent redirection but firstly I need to understand how to do that correctly.0 -
Crawl solutions for landing pages that don't contain a robots.txt file?
My site (www.nomader.com) is currently built on Instapage, which does not offer the ability to add a robots.txt file. I plan to migrate to a Shopify site in the coming months, but for now the Instapage site is my primary website. In the interim, would you suggest that I manually request a Google crawl through the search console tool? If so, how often? Any other suggestions for countering this Meta Noindex issue?
Technical SEO | | Nomader1 -
How do I complete a reverse DNS check when completing log file analysis?
I'm doing some log file analysis and need to run a reverse DNS check to ensure that I'm analysing logs from Google and not any imposters. Is there a command I can use in terminal to do this? If not, whats the best way to verify Googlebot? Thanks
Technical SEO | | daniel-brooks0 -
Multiple robots.txt files on server
Hi! I have previously hired a developer to put up my site and noticed afterwards that he did not know much about SEO. This lead me to starting to learn myself and applying some changes step by step. One of the things I am currently doing is inserting sitemap reference in robots.txt file (which was not there before). But just now when I wanted to upload the file via FTP to my server I found multiple ones - in different sizes - and I dont know what to do with them? Can I remove them? I have downloaded and opened them and they seem to be 2 textfiles and 2 dupplicates. Names: robots.txt (original dupplicate)
Technical SEO | | mjukhud
robots.txt-Original (original)
robots.txt-NEW (other content)
robots.txt-Working (other content dupplicate) Would really appreciate help and expertise suggestions. Thanks!0 -
Robots.txt
Hi All Having a robots.txt looking like the below will this stop Google crawling the site User-agent: *
Technical SEO | | internetsalesdrive0 -
Adding directories to robots nofollow cause pages to have Blocked Resources
In order to eliminate duplicate/missing title tag errors for a directory (and sub-directories) under www that contain our third-party chat scripts, I added the parent directory to the robots disallow list. We are now receiving a blocked resource error (in Webmaster Tools) on all of the pages that have a link to a javascript (for live chat) in the parent directory. My host is suggesting that the warning is only a notice and we can leave things as is without worrying about the page being de-ranked/penalized. I am wondering if this is true or if we should remove the one directory that contains the js from the robots file and find another way to resolve the duplicate title tags?
Technical SEO | | miamiman1000 -
If Google's index contains multiple URLs for my homepage, does that mean the canonical tag is not working?
I have a site which is using canonical tags on all pages, however not all duplicate versions of the homepage are 301'd due to a limitation in the hosting platform. So some site visitors get www.example.com/default.aspx while others just get www.example.com. I can see the correct canonical tag on the source code of both versions of this homepage, but when I search Google for the specific URL "www.example.com/default.aspx" I see that they've indexed that specific URL as well as the "clean" one. Is this a concern... shouldn't Google only show me the clean URL?
Technical SEO | | JMagary0 -
Directory Naming & File Organization
We're redoing an entire site and are going to reorganize, and link to the site's pages by directory instead of page name. So instead of:xyz.com/services/fixingtvs.phpit will be:xyz.com/fixingtvsAt first I was thinking 1 index.php page per directory but that will make content management really confusing with a bunch of files with the same name.Anyone have a better idea?Thanks,Matt
Technical SEO | | mattloht0