Exclude root URL in robots.txt?
-
Hi,
I have the following setup:
www.example.com/nl
www.example.com/de
www.example.com/uk
etc
www.example.com is 301'ed to www.example.com/nl
But now www.example.com is ranking instead of www.example.com/nl
Should I block www.example.com in robots.txt so only the subfolders are being ranked?
Or will I lose my ranking by doing this?
-
Yes, when clicking the link in Google you get redirected.
I will wait some time. Thank you.
-
The site just launched? It sounds like I am right, you just need to give Google some time to drop the page from the index.
When you find the homepage in the index, and you click the link, do you get redirected? If so, Google will eventually drop it.
-
Thanks for answering, Philip.
Yes, I really used a 301.
I set it up in .htaccess.
I set this up before launching the site, so I should be good from the beginning.
The site was launched last Friday.
When searching for the brand name, it shows up as example.com
When searching for my main keywords, it shows example.com/ned/landing-page
-
If you put disallow: / in your robots.txt file, you will tell bots not to crawl the homepage plus ALL interior pages. You'd be shooting yourself in the foot (or head, really).
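To see why, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The rule set and the URL list are illustrative assumptions based on the setup described in the question. This parser follows the classic robots exclusion conventions, so it may not match Googlebot's behaviour in every detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: the blanket rule discussed above.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Illustrative URLs modelled on the site layout in the question.
URLS = [
    "https://www.example.com/",
    "https://www.example.com/nl",
    "https://www.example.com/de",
    "https://www.example.com/uk",
]


def crawl_permissions(robots_txt, urls, agent="*"):
    """Return {url: True/False} for whether `agent` may crawl each URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(agent, url) for url in urls}


if __name__ == "__main__":
    for url, allowed in crawl_permissions(BLOCK_ALL, URLS).items():
        print(f"{url}: {'allowed' if allowed else 'blocked'}")
```

Every URL is reported as blocked, including the language subfolders, because each path starts with `/`.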
Are you sure the redirect is set up properly? Is it definitely a 301 redirect, or maybe a 302 (temporary)? How long ago did you implement the redirect? If the 301 redirect is set up properly, and you're still seeing the homepage in the index, you might just need to wait for it to drop out.
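One way to check which status code the server actually returns is to request the root URL without following the redirect. The sketch below uses only Python's standard library. The function names `classify_redirect` and `check_redirect` are made up for this example, and it assumes the server answers `HEAD` requests the same way it answers `GET`.

```python
import http.client
from urllib.parse import urlsplit

PERMANENT = {301, 308}        # permanent redirects
TEMPORARY = {302, 303, 307}   # temporary redirects


def classify_redirect(status):
    """Label an HTTP status code as a permanent or temporary redirect."""
    if status in PERMANENT:
        return "permanent"
    if status in TEMPORARY:
        return "temporary"
    return "not a redirect"


def check_redirect(url, timeout=10.0):
    """Request `url` once, without following redirects.

    Returns (status, Location header or None, classification).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        # http.client never follows redirects, so this is the raw response.
        conn.request("HEAD", parts.path or "/")
        response = conn.getresponse()
        location = response.getheader("Location")
        return response.status, location, classify_redirect(response.status)
    finally:
        conn.close()
```

Calling `check_redirect("http://www.example.com/")` against the real site should return status 301 with a `Location` header pointing at the `/nl` URL if the .htaccess rule works as intended. A 302 would suggest the redirect is being treated as temporary.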