Website URL, Robots.txt and Google Search Console (www. vs non www.)
-
Hi MOZ Community,
I would like to ask for your help with domain URLs: www vs. non-www.
My team recently moved to a new website, and a 301 redirect has been put in place.
- Original URL : https://www.example.com.my/ (with www.)
- New URL : https://example.com.my/ (without www.)
Our current robots.txt sitemap : https://www.example.com.my/sitemap.xml (with www.)
Our Google Search Console property : https://www.example.com.my/ (with www.)
Questions:
1. Should I standardize these, and if so how, so that Google's crawler can crawl my website effectively?
2. Do I have to change my website URLs back to the www version, or do I just need to update my robots.txt?
3. How can I update my Google Search Console property to reflect the non-www URL? I cannot see an option for this in the dashboard.
4. Are there any to-dos, such as canonicalization, or should I wait for Google to detect the change automatically, especially for the GSC property?
I really appreciate your kind assistance.
Thank you,
Badiuzz -
Hi there,
1. Run a technical site audit to check that all redirects point to the non-www version and that your canonical URLs do not include www. You also need to resubmit your sitemap to Google Search Console.
2. Update your robots.txt file with the new (non-www) sitemap URL.
3. Add another property, without www, to Google Search Console.
4. Resubmit your sitemap to Google Search Console, and run a technical audit to check your canonical tags.
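For point 2, one quick way to spot a stale sitemap reference is to read the Sitemap lines out of robots.txt and compare their host with the one you have standardized on. The following is a minimal Python sketch, not a full audit tool. The helper names (sitemap_urls, off_canonical) and the sample robots.txt body are made up for illustration; the non-www host is taken from the question.

```python
from urllib.parse import urlsplit

# Assumption: the non-www host from the question is the one being standardized on.
CANONICAL_HOST = "example.com.my"


def sitemap_urls(robots_txt: str) -> list[str]:
    """Return the URLs listed on 'Sitemap:' lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def off_canonical(urls: list[str], canonical_host: str = CANONICAL_HOST) -> list[str]:
    """Return the URLs whose host differs from the canonical host."""
    return [u for u in urls if (urlsplit(u).hostname or "") != canonical_host]


# Hypothetical robots.txt body; only the Sitemap line mirrors the question.
robots_txt = """\
User-agent: *
Disallow:
Sitemap: https://www.example.com.my/sitemap.xml
"""

stale = off_canonical(sitemap_urls(robots_txt))
print(stale)  # lists the sitemap URL that still uses the www host
```

Any URL that shows up in the result still points at the old www host and would need updating in robots.txt before the sitemap is resubmitted.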
Ross
Related Questions
-
Soft 404 in Search Console
Search console is showing quite a lot of soft 404 pages on my site, but when I click on the links, the pages are all there. Is there a reason for this? It's a pretty big site - I'm getting 141 soft 404s from about 20,000 pages
Technical SEO | | abisti20 -
The use of robots.txt
Could someone please confirm that if I do not want to block any pages from my URL, then I do not need a robots.txt file on my site? Thanks
Technical SEO | | ICON_Malta0 -
Google does not show my website anymore
Hi All, We developed a new website for the domain www.instral.com. Before we built the website, the domain was indexed by Google and was shown as the first result for the search "Instral". Without a website! Only a super simple hosting provider webpage. Now, with the new website, all the website pages are found in Google from page 7 onwards, and the home page is not even in the results. When I search for "instral.com" in Google, it shows my website as the first result, including the homepage. Is there something wrong with the website or DNS settings? Or maybe some other web hosting setting? Am I on a blacklist or something? Bing and Yahoo are showing better results (first page). I hope someone can help me out here...
Technical SEO | | extrememedia0 -
Robots.txt issue - site resubmission needed?
We recently had an issue when a load of new files were transferred from our dev server to the live site, which unfortunately included the dev site's robots.txt file which had a disallow:/ instruction. Bad! Luckily I spotted it quickly and the file has been replaced. The extent of the damage seems to be that some descriptions aren't displaying and we're getting a message about robots.txt in the SERPs for a few keywords. I've done a site: search and generally it seems to be OK for 99% of our pages. Our positions don't seem to be affected right now but obviously it's not great for the CTRs on those keywords affected. My question is whether there is anything I can do to bring the updated robots.txt file to Google's attention? Or should we just wait and sit it out? Thanks in advance for your answers!
Technical SEO | | GBC0 -
Google (GWT) says my homepage and posts are blocked by Robots.txt
Hi guys, I have a very annoying issue. My WordPress blog over at www.Trovatten.com has some indexation problems. Google Webmaster Tools (GWT) says the following: "Sitemap contains urls which are blocked by robots.txt." and shows me my homepage and my blog posts. This is my robots.txt (http://www.trovatten.com/robots.txt):
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
Do you have any idea why it says that the URLs are being blocked by robots.txt when the file looks the way it should?
I've read in a couple of places that it can be caused by a WordPress plugin that creates a virtual robots.txt, but I can't validate it.
1. I have set WP-Privacy to allow crawling of my site.
2. I have deactivated all WP plugins and I still get the same GWT warnings.
Looking forward to hearing if you have an idea that might work!
Technical SEO | | FrederikTrovatten220 -
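As a sanity check on the rules quoted in the post above, Python's standard-library robots.txt parser can be fed the same three lines. This is only a sketch: Python's parser is not Google's and may differ in edge cases, and the blog-post and admin paths below are hypothetical examples rather than real URLs from the site.

```python
from urllib.robotparser import RobotFileParser

# The rules as quoted in the post.
rules = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Hypothetical paths: the homepage, an example post, and an admin page.
for path in ("/", "/some-blog-post/", "/wp-admin/options.php"):
    url = "http://www.trovatten.com" + path
    print(path, parser.can_fetch("Googlebot", url))
```

Under this parser the homepage and the example post come out as fetchable and only the admin path is blocked, which is consistent with the suspicion that the file actually being served differs from the one quoted.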
How do I resolve Twin domains? redirect website.com to www.website.com?
I am new to this website. I tried to run a campaign and got a warning that website.com resolves to www.website.com, which hinders SERP performance by competing for keyword indexing. (website is my domain name.) I would appreciate help with this. Thanks. S.H. PS: here is the exact wording of the error: We have detected that the domain www.yfvaccine.com and the domain yfvaccine.com both respond to web requests and do not redirect. Having two "twin" domains that both resolve forces them to battle for SERP positions, making your SEO efforts less effective. We suggest redirecting one, then entering the other here.
Technical SEO | | sherohass0 -
Www v.s non www
The canonical URLs (and all our link building efforts) are on the www version of the site. However, the site is having a massive technical problem and we need to redirect some links (some of which are very important) from the www to the non-www version of the site (for these pages the canonical link is still the www version). How big an SEO problem is this? Can you please explain the exact SEO dangers? Thanks!
Technical SEO | | theLotter0 -
Robots.txt
My campaign hse24 (www.hse24.de) is not being crawled any more. Do you think this could be a problem with the robots.txt? I always thought that Google and friends were interpreting the file correctly, given that the site was still being crawled last week. Thanks a lot, Bernd
NB: Here is the robots.txt:
User-Agent: *
Disallow: /

User-agent: Googlebot
User-agent: Googlebot-Image
User-agent: Googlebot-Mobile
User-agent: MSNBot
User-agent: Slurp
User-agent: yahoo-mmcrawler
User-agent: psbot
Disallow: /is-bin/
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_Storefront-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_Storefront-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_Storefront-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-DE-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-AT-Site/de_DE/-/EUR/hse24_DisplayProductInformation-Start
Allow: /is-bin/INTERSHOP.enfinity/WFS/HSE24-CH-Site/de_DE/-/CHF/hse24_DisplayProductInformation-Start
Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/
Allow: /is-bin/intershop.static/WFS/HSE24-Site/-/Editions/Root%20Edition/units/HSE24/Beratung/
Technical SEO | | remino630