Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Robots.txt - Do I block Bots from crawling the non-www version if I use www.site.com ?
-
my site uses is set up at http://www.site.com I have my site redirected from non- www to the www in htacess file. My question is... what should my robots.txt file look like for the non-www site? Do you block robots from crawling the site like this? Or do you leave it blank?
User-agent: *
Disallow: /
Sitemap: http://www.morganlindsayphotography.com/sitemap.xml
Sitemap: http://www.morganlindsayphotography.com/video-sitemap.xml
-
Hi there
If you configured this properly, I wouldn't worry about this at all.
Check your internal links and sitemap to make sure that your URLs listed as a reflection of this www. version.
Beyond that, you're all good, no need to block non www.
Hope this helps! Good luck!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Block session id URLs with robots.txt
Hi, I would like to block all URLs with the parameter '?filter=' from being crawled by including them in the robots.txt. Which directive should I use: User-agent: *
Intermediate & Advanced SEO | | Mat_C
Disallow: ?filter= or User-agent: *
Disallow: /?filter= In other words, is the forward slash in the beginning of the disallow directive necessary? Thanks!1 -
Crawl Stats Decline After Site Launch (Pages Crawled Per Day, KB Downloaded Per Day)
Hi all, I have been looking into this for about a month and haven't been able to figure out what is going on with this situation. We recently did a website re-design and moved from a separate mobile site to responsive. After the launch, I immediately noticed a decline in pages crawled per day and KB downloaded per day in the crawl stats. I expected the opposite to happen as I figured Google would be crawling more pages for a while to figure out the new site. There was also an increase in time spent downloading a page. This has went back down but the pages crawled has never went back up. Some notes about the re-design: URLs did not change Mobile URLs were redirected Images were moved from a subdomain (images.sitename.com) to Amazon S3 Had an immediate decline in both organic and paid traffic (roughly 20-30% for each channel) I have not been able to find any glaring issues in search console as indexation looks good, no spike in 404s, or mobile usability issues. Just wondering if anyone has an idea or insight into what caused the drop in pages crawled? Here is the robots.txt and attaching a photo of the crawl stats. User-agent: ShopWiki Disallow: / User-agent: deepcrawl Disallow: / User-agent: Speedy Disallow: / User-agent: SLI_Systems_Indexer Disallow: / User-agent: Yandex Disallow: / User-agent: MJ12bot Disallow: / User-agent: BrightEdge Crawler/1.0 (crawler@brightedge.com) Disallow: / User-agent: * Crawl-delay: 5 Disallow: /cart/ Disallow: /compare/ ```[fSAOL0](https://ibb.co/fSAOL0)
Intermediate & Advanced SEO | | BandG0 -
301 redirect hops from non-https and www
It's best practice to minimize the amount of 301 redirect hops. Ideally only one redirect hop. It's also best practice to 301 redirect (or at least canonical) your non-https and/or your non-www (or www) to the canonical protocol/subdomain. The simplest (and possibly the most common) way to implement canonical protocol/subdomain redirects is through a load balancer or before your app processes the request. Both of which will just blanket 301 to the canonical domain/protocol regardless if the path exists or not In which case, you could have: Two hops. i.e. hop #1 http://example.com/foo to https://example.com/foo, hop #2 https://example.com/foo to https://example.com/bar 301 to a 404. Let's say https://example.com/dog never existed, but somebody for whatever reason linked to it (maybe a typo). If I request https://www.example.com/dog, the load balancer would 301 to a 404 page. Either scenario above should be fairly rare. However, you can't control how people link to you. Should I care about either above scenario? I could have my app attempt to check if the page exists before forwarding, but that code could be complicated.
Intermediate & Advanced SEO | | dsbud0 -
Using the same image across the site?
Hi just wondering i'm using the same image across 20 pages which are optimized for SEO purposes. I was wondering is there issues with this from SEO standpoint? Will Google devalue the page because the same image is being used? Cheers.
Intermediate & Advanced SEO | | seowork2140 -
Changing from .com to .com.au
Hi All, we are looking for some guidance please, if at all possible. We have .com domain (the domain is older than 10 years), we have been using it for 2 years. We also have .com.au version of the domain (the domain is 2 years old, pointing to the .com domain) and isn't being used. We are an Australian based company. Our question is, should we be using .com.au instead of .com and if so, how would you advise going about doing the change over without having huge SEO impact on our business (negatively). We are on the home page for most of the searches we have optimized for, but we are always below the .com.au's - which is why we are considering the possibility of the move? Any advice would be GREATLY appreciated 🙂
Intermediate & Advanced SEO | | creativeground0 -
New Site: Use Aged Domain Name or Buy New Domain Name?
Hi,
Intermediate & Advanced SEO | | peterwhitewebdesign
I have the opportunity to build a new website and use a domain name that is older than 5 years or buy a new domain name. The aged domain name is a .net and includes a keyword.
The new domain would include the same keyword as well as the U.S. state abbreviation. Which one would you use and why? Thanks for your help!0 -
Using 2 wildcards in the robots.txt file
I have a URL string which I don't want to be indexed. it includes the characters _Q1 ni the middle of the string. So in the robots.txt can I use 2 wildcards in the string to take out all of the URLs with that in it? So something like /_Q1. Will that pickup and block every URL with those characters in the string? Also, this is not directly of the root, but in a secondary directory, so .com/.../_Q1. So do I have to format the robots.txt as //_Q1* as it will be in the second folder or just using /_Q1 will pickup everything no matter what folder it is on? Thanks.
Intermediate & Advanced SEO | | seo1234560 -
Linking Sister-Sites - Diapers.com Example
Many of the big guns like 1800 Flowers, Diapers.com and others all have their sister sites in tabs at the top. Example: http://www.diapers.com/ with their 3 other properties. Since all properties link to one another on every page, it's really a wash, right? No real gain as engines know they are connected and it's the same link multiple times. No real problem either as it's natural for the user experience to have reciprocal links here between the brands. Any additional thoughts here?
Intermediate & Advanced SEO | | SEOPA0