Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
-
my site uses is set up at http://www.site.com I have my site redirected from non- www to the www in htacess file. My question is... what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this? Or do you leave it blank?
User-agent: *
Disallow: /
Sitemap: http://www.morganlindsayphotography.com/sitemap.xml
Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
-
Hi there
If you configured this properly, I wouldn't worry about this at all.
Check your internal links and sitemap to make sure that your URLs listed as a reflection of this www. version.
Beyond that, you're all good, no need to block non www.
Hope this helps! Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Should i stop non targeted countries from viewing my site to stop pogo sticking?
i recently Learned that if someone clicks on your site and goes back within a few seconds or maybe a minute and then goes to the next search result that tells Google that previous website was not so good In order to prevent that is it OK to block non-targeted countries from looking at my site that could stop user engagement a little bit I mean is it worth it what do you think?
Intermediate & Advanced SEO | | Sam09schulz0 -
Website Indexing Issues - Search Bots will only crawl Homepage of Website, Help!
Hello Moz World, I am stuck on a problem, and wanted to get some insight. When I attempt to use Screaming Spider or SEO Powersuite, the software is only crawling the homepage of my website. I have 17 pages associated with the main domain i.e. example.com/home, example.com/sevices, etc. I've done a bit of investigating, and I have found that my client's website does not have Robot.txt file or a site map. However, under Google Search Console, all of my client's website pages have been indexed. My questions, Why is my software not crawling all of the pages associated with the website? If I integrate a Robot.txt file & sitemap will that resolve the issue? Thanks ahead of time for all of the great responses. B/R Will H.
Intermediate & Advanced SEO | | MarketingChimp100 -
If I put a piece of content on an external site can I syndicate to my site later using a rel=canonical link?
Could someone help me with a 'what if ' scenario please? What happens if I publish a piece of content on an external website, but then later decide to also put this content on my website. I want my website to rank first for this content, even though the original location for the content was the external website. Would it be okay for me to put a rel=canonical tag on the external website's content pointing to the copy on my website? Or would this be seen as manipulative?
Intermediate & Advanced SEO | | RG_SEO1 -
How to make an AJAX site crawlable when PushState and #! can't be used?
Dear Mozzers, Does anyone know a solution to make an AJAX site crawlable if: 1. You can't make use of #! (with HTML snapshots) due to tracking in Analytics 2. PushState can't be implemented Could it be a solution to create two versions of each page (one without #!, so campaigns can be tracked in Analytics & one with #! which will be presented to Google)? Or is there another magical solution that works as well? Any input or advice is highly appreciated! Kind regards, Peter
Intermediate & Advanced SEO | | ConversionMob0 -
I run an (unusual) clothing company. And I'm about to set up a version of our existing site for kids. Should I use a different domain? Or keep the current root domain?
Hello. I have a burning question which I have been trying to answer for a while. I keep getting conflicting answers and I could really do with your help. I currently run an animal fancy dress (onesie) company in the UK called Kigu through the domain www.kigu.co.uk. We're the exclusive distributor for a supplier of Japanese animal costumes and we've been selling directly through this domain for about 3 years. We rank well across most of our key words and get about 2000 hits each day. We're about to start selling a Kids range - miniature versions of the same costumes. We're planning on doing this through a different domain which is currently live - www.kigu-kids.co.uk. It' been live for about 3-4 weeks. The idea behind keeping them on separate domains is that it is a different target market and we could promote the Kids site separately without having to bring people through the adult site. We want to keep the adult site (or at least the homepage) relatively free from anything kiddy as we promote fancy dress events in nightclubs and at festivals for over 18s (don't worry, nothing kinky) and we wouldn't want to confuse that message. I've since been advised by an expert in the field that that we should set up a redirect from www.kigu-kids.co.uk and house the kids website under www.kigu.co.uk/kids as this will be better from an SEO perspective and if we don't we'll only be competing with ourselves. Are we making a big mistake by not using the same root domain for both thus getting the most of the link juice for the kids site? And if we do decide to switch to have the domain as www.kigu.co.uk/kids, is it a mistake to still promote the www.kigu-kids.co.uk (redirecting) as our domain online? Would these be wasted links? Or would we still see the benefit? Is it better to combine or is two websites better than one? Any help and advice would be much appreciated. Tom.
Intermediate & Advanced SEO | | KIGUCREW0 -
What is the best way to allow content to be used on other sites for syndication without taking the chance of duplicate content filters
Cookstr appears to be syndicating content to shape.com and mensfitness.com a) They integrate their data into partner sites with an attribution back to their site and skinned it with the partners look. b) they link the image back to their image hosted on cookstr c) The page does not have microformats or as much data as their own page does so their own page is better SEO. Is this the best strategy or is there something better they could be doing to safely allow others to use our content, we don't want to share the content if we're going to get hit for a duplicate content filter or have another site out rank us with our own data. Thanks for your help in advance! their original content page: http://www.cookstr.com/recipes/sauteacuteed-escarole-with-pancetta their syndicated content pages: http://www.shape.com/healthy-eating/healthy-recipes/recipe/sauteacuteed-escarole-with-pancetta
Intermediate & Advanced SEO | | irvingw
http://www.mensfitness.com/nutrition/healthy-recipes/recipe/sauteacuteed-escarole-with-pancetta0 -
New Site: Use Aged Domain Name or Buy New Domain Name?
Hi,
Intermediate & Advanced SEO | | peterwhitewebdesign
I have the opportunity to build a new website and use a domain name that is older than 5 years or buy a new domain name. The aged domain name is a .net and includes a keyword.
The new domain would include the same keyword as well as the U.S. state abbreviation. Which one would you use and why? Thanks for your help!0 -
Blocking Dynamic URLs with Robots.txt
Background: My e-commerce site uses a lot of layered navigation and sorting links. While this is great for users, it ends up in a lot of URL variations of the same page being crawled by Google. For example, a standard category page: www.mysite.com/widgets.html ...which uses a "Price" layered navigation sidebar to filter products based on price also produces the following URLs which link to the same page: http://www.mysite.com/widgets.html?price=1%2C250 http://www.mysite.com/widgets.html?price=2%2C250 http://www.mysite.com/widgets.html?price=3%2C250 As there are literally thousands of these URL variations being indexed, so I'd like to use Robots.txt to disallow these variations. Question: Is this a wise thing to do? Or does Google take into account layered navigation links by default, and I don't need to worry. To implement, I was going to do the following in Robots.txt: User-agent: * Disallow: /*? Disallow: /*= ....which would prevent any dynamic URL with a '?" or '=' from being indexed. Is there a better way to do this, or is this a good solution? Thank you!
Intermediate & Advanced SEO | | AndrewY1