Is my robots.txt file working?
-
Greetings from medieval York, UK.
Every time you enter my name & Liz, this page is returned in Google:
http://www.davidclick.com/web_page/al_liz.htm
But I have the following robots.txt file, which has been in place a few weeks:
User-agent: *
Disallow: /york_wedding_photographer_advice_pre_wedding_photoshoot.htm
Disallow: /york_wedding_photographer_advice.htm
Disallow: /york_wedding_photographer_advice_copyright_free_wedding_photography.htm
Disallow: /web_page/prices.htm
Disallow: /web_page/about_me.htm
Disallow: /web_page/thumbnails4.htm
Disallow: /web_page/thumbnails.html
Disallow: /web_page/al_liz.htm
Disallow: /web_page/york_wedding_photographer_advice.htm
Allow: /
So my question is, please...
"Why is this page appearing in the SERPs when it's blocked in the robots.txt file, e.g.: Disallow: /web_page/al_liz.htm"
Any insights welcome.
-
Glad we could help
Fredrik
PS Don't forget to mark as answered.
-
Brill answers, guys, thanks.
-
Nightwing
Fredrik gives some good pointers, and here is a little trick to try: Fetch as Google from Google Webmaster Tools (GWMT).
- On the Webmaster Tools Home page, click the site you want.
- On the Dashboard, under Health, click Fetch as Google.
- In the text box, type the path to the page you want to check.
- In the dropdown list, select the type of fetch you want. To see what our web crawler Googlebot sees, select Web. To see what our mobile crawler Googlebot-Mobile sees, select cHTML (this is used mainly for Japanese web sites) or Mobile XHTML/WML.
- Click Fetch.
This will likely give you a quick re-index, and you will know what's up...
Best,
Robert
-
Hi David
How long have you had the robots.txt file? Blocking Google from crawling the page in robots.txt does not automatically remove it if it's already indexed. That would take some time.
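To separate the two issues (is the rule matching at all, versus is the page simply still in the index), one quick local check is to run the rules from the question through Python's standard-library parser. This is only a sketch: the helper name `is_crawl_allowed` is made up for illustration, the rules are pasted in by hand rather than fetched from the live site, and `urllib.robotparser` is not Google's own matcher, so treat the result as a sanity check rather than proof of what Googlebot does.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the question, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /york_wedding_photographer_advice_pre_wedding_photoshoot.htm
Disallow: /york_wedding_photographer_advice.htm
Disallow: /york_wedding_photographer_advice_copyright_free_wedding_photography.htm
Disallow: /web_page/prices.htm
Disallow: /web_page/about_me.htm
Disallow: /web_page/thumbnails4.htm
Disallow: /web_page/thumbnails.html
Disallow: /web_page/al_liz.htm
Disallow: /web_page/york_wedding_photographer_advice.htm
Allow: /
"""


def is_crawl_allowed(url, robots_txt=ROBOTS_TXT, user_agent="*"):
    """Hypothetical helper: True if the rules let user_agent crawl url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.davidclick.com/web_page/al_liz.htm",
        "http://www.davidclick.com/",
    ):
        verdict = "allowed" if is_crawl_allowed(url) else "blocked"
        print(url, "->", verdict)
```

If the page comes back as blocked here, the Disallow line is doing its job for crawling, and the listing in the SERPs is the leftover index entry described above.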
You could try using the removal tool:
https://www.google.com/webmasters/tools/removals
If it's urgent, you could check the referrer header and do a 301 redirect when the visitor comes from Google. But I think it should sort itself out before too long.
Fredrik