Robots.txt - Do I block bots from crawling the non-www version if I use www.site.com?
-
My site is set up at http://www.site.com, and I have the non-www version redirected to the www version in my .htaccess file. My question is: what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this? Or do you leave it blank?
User-agent: *
Disallow: /
Sitemap: http://www.morganlindsayphotography.com/sitemap.xml
Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
-
Hi there
If you configured the redirect properly, I wouldn't worry about this at all.
Check your internal links and sitemap to make sure the URLs listed there all use the www version.
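As a rough illustration of that check, here is a minimal Python sketch that flags any URL whose host is not the www one. The function name `off_host_urls` and the sample URLs are made up for the example; in practice the list would come from your sitemap or a crawl of your internal links.

```python
from urllib.parse import urlsplit


def off_host_urls(urls, canonical_host):
    """Return the URLs whose hostname is not exactly canonical_host."""
    return [u for u in urls if urlsplit(u).hostname != canonical_host]


# Hypothetical URLs, standing in for entries from a sitemap or internal links.
urls = [
    "http://www.site.com/",
    "http://site.com/about",
    "http://www.site.com/contact",
]

# Anything printed here still points at the non-www host and should be updated.
print(off_host_urls(urls, "www.site.com"))  # ['http://site.com/about']
```

An empty list means every listed URL already uses the www host.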
Beyond that, you're all good. There's no need to block the non-www version.
Hope this helps! Good luck!
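To see what the two options from the question actually do, here is a small sketch using Python's standard `urllib.robotparser`. The helper name `allowed` and the site.com URLs are placeholders for illustration. It shows that the `Disallow: /` rule blocks every URL for compliant crawlers, which would keep them from requesting the non-www pages and seeing the redirect, while a blank file blocks nothing.

```python
from urllib.robotparser import RobotFileParser

# The blocking rules from the question (sitemap lines left out for brevity).
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def allowed(robots_txt, url, agent="*"):
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# With "Disallow: /" nothing on the host may be crawled.
print(allowed(BLOCK_ALL, "http://site.com/"))        # False
print(allowed(BLOCK_ALL, "http://site.com/gallery"))  # False

# A blank robots.txt places no restrictions on crawlers.
print(allowed("", "http://site.com/gallery"))         # True
```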
Related Questions
-
Splitting One Site Into Two Sites Best Practices Needed
Okay, working with a large site that, for business reasons beyond organic search, wants to split an existing site in two. So, the old domain name stays and a new one is born with some of the content from the old site, along with some new content of its own. The general idea, for more than just search reasons, is that it makes both the old site and new sites more purely about their respective subject matter. The existing content on the old site that is becoming part of the new site will be 301'd to the new site's domain. So, the old site will have a lot of 301s and links to the new site. No links coming back from the new site to the old site anticipated at this time. Would like any and all insights into any potential pitfalls and best practices for this to come off as well as it can under the circumstances. For instance, should all those links from the old site to the new site be nofollowed, kind of like a non-editorial link to an affiliate or advertiser? Is there weirdness for Google in 301ing to a new domain from some, but not all, content of the old site. Would you individually submit requests to remove from index for the hundreds and hundreds of old site pages moving to the new site or just figure that the 301 will eventually take care of that? Is there substantial organic search risk of any kind to the old site, beyond the obvious of just not having those pages to produce any more? Anything else? Any ideas about how long the new site can expect to wander the wilderness of no organic search traffic? The old site has a 45 domain authority. Thanks!
Intermediate & Advanced SEO | | 945010 -
Not sure how we're blocking homepage in robots.txt; meta description not shown
Hi folks! We had a question come in from a client who needs assistance with their robots.txt file. Metadata for their homepage and select other pages isn't appearing in SERPs. Instead they get the usual message "A description for this result is not available because of this site's robots.txt – learn more". At first glance, we're not seeing the homepage or these other pages as being blocked by their robots.txt file: http://www.t2tea.com/robots.txt. Does anyone see what we can't? Any thoughts are massively appreciated! P.S. They used wildcards to ensure the rules were applied for all locale subdirectories, e.g. /en/au/, /en/us/, etc.
Intermediate & Advanced SEO | | SearchDeploy0 -
Meta robots or robots.txt file?
Hi Mozzers! For parametric URLs, would you recommend meta robots or the robots.txt file? For example: http://www.exmaple.com//category/product/cat no./quickView I want to stop indexing /quickView URLs. And what's the real difference between the two? Thanks again! Kay
Intermediate & Advanced SEO | | eLab_London0 -
Robots.txt - blocking JavaScript and CSS, best practice for Magento
Hi Mozzers, I'm looking for some feedback regarding best practices for setting up Robots.txt file in Magento. I'm concerned we are blocking bots from crawling essential information for page rank. My main concern comes with blocking JavaScript and CSS, are you supposed to block JavaScript and CSS or not? You can view our robots.txt file here Thanks, Blake
Intermediate & Advanced SEO | | LeapOfBelief0 -
Robots.txt issue for international websites
In Google.co.uk, our US-based site (abcd.com) is showing: "A description for this result is not available because of this site's robots.txt – learn more". But the UK website (uk.abcd.com) is working properly. We would like the .com result to disappear entirely, if possible. How do we fix it? Thanks in advance.
Intermediate & Advanced SEO | | JinnatUlHasan0 -
Splitting sites similar to Diapers.com
Our site (theoilhub.com) sells automotive products. We want to split it into multiple sites (autoparthub.com, motoparthub.com, and so on). Some products will be listed on multiple domains because they apply to more than one of them (lubricants, helmets...). I was wondering what the best solution is for avoiding any problems with Google detecting too much duplicate content across our sites. Would using rel=canonical help? Any suggestions would be appreciated.
Intermediate & Advanced SEO | | theoilhub0 -
Why are these results being shown as blocked by robots.txt?
If you perform this search, you'll see all m. results are blocked by robots.txt: http://goo.gl/PRrlI, but when I reviewed the robots.txt file: http://goo.gl/Hly28, I didn't see anything specifying to block crawlers from these pages. Any ideas why these are showing as blocked?
Intermediate & Advanced SEO | | nicole.healthline0 -
How do you prevent the mobile site becoming a duplicate of the full browser site?
We have a large site with 100k+ pages. We need to create a mobile site that gets indexed in the mobile search engines, but I am afraid that Googlebot will consider these pages duplicates of the normal site pages. I know I can block it in robots.txt, but I still need it to be indexed for mobile search engines, and I think Google has a mobile crawler as well. Feel free to give me any other tips that I should follow while trying to optimize the mobile version. Any help would be appreciated 🙂
Intermediate & Advanced SEO | | pulseseo0