Blocking in Robots.txt and the re-indexing - DA effects?
-
I have two high-DA sites, one targeting the US (.com) and one targeting the UK (.co.uk). The .com ranks well but is commercially dormant; the .co.uk is the commercial focus and gets great traffic.
The issue is that the .com ranks for brand in the UK, and I want the .co.uk to rank for brand in the UK.
I can't 301 the .com as it will be used again in the near future. I want to block the .com in robots.txt with a view to unblocking it again when I need it.
I don't think the DA would be affected, as the links stay and the site's live (just not indexed), so when I unblock it should be fine. HOWEVER, my concern is that things like the organic CTR data that Google records, and other factors, won't contribute to its value while it is blocked.
Has anyone ever blocked and then unblocked a site, and what were the effects, please?
All answers gratefully received - cheers GB
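
For reference, the block-and-unblock toggle described above comes down to two small robots.txt variants. The sketch below is only an illustration, assuming a hypothetical host (example.com) since the real domains are not given. It uses Python's standard-library urllib.robotparser, whose matching is simpler than Googlebot's, so treat it as a sanity check rather than a statement of how Google behaves. Note also that robots.txt controls crawling: a blocked URL that is linked from elsewhere can still appear in the index.

```python
from urllib.robotparser import RobotFileParser

# Variant served while the .com is "parked": every crawler is disallowed everywhere.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Variant served once the .com is needed again: an empty Disallow allows everything.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host, not one of the sites in the question.
    print(is_allowed(BLOCK_ALL, "https://example.com/"))
    print(is_allowed(ALLOW_ALL, "https://example.com/"))
```

Swapping the served file from the first variant to the second is the whole "unblock" step; how quickly crawlers pick up the change is outside what this sketch can show.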
-
Blocking in robots.txt doesn't affect your website's DA. Instead, you can use it to help your website's ranking:
* Deal with duplicate content - you can use it to hide a specific page from search engines when it causes duplicate-content issues.
* Hide the theme templates of your website that you don't want listed in search results.
-
@Bush_JSM, it depends on what you allow. When you unblock, it is best to simply resubmit the updated sitemap in Google Search Console.
-
I don't think it affects your website's DA and PA.
Robots.txt helps you block the posts and pages that you don't want Google to crawl.
It applies to the URLs on your own site; it has no connection with the external links pointing to your site.
So, in my opinion, it doesn't affect your website's DA or PA.
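
The answers above describe using robots.txt to block specific posts, pages, or theme templates rather than the whole site. A minimal sketch of that idea follows, assuming hypothetical paths (/print/ for duplicate printable copies, /theme-templates/ for template files) and a placeholder host. It relies on Python's urllib.robotparser, which does plain prefix matching and is not Google's parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the two paths are illustrative, not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /theme-templates/
"""


def blocked_urls(robots_txt: str, urls: list[str], agent: str = "*") -> list[str]:
    """Return the subset of `urls` that `agent` is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    candidates = [
        "https://example.com/print/post-1",
        "https://example.com/blog/post-1",
        "https://example.com/theme-templates/header",
    ]
    for url in blocked_urls(ROBOTS_TXT, candidates):
        print("blocked:", url)
```

Rules like these only restrict the site's own URLs, which matches the point made above that external links to the site are untouched.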
Related Questions
-
Unsolved Crawler was not able to access the robots.txt
I'm trying to set up a campaign for jessicamoraninteriors.com and I keep getting messages that Moz can't crawl the site because it can't access the robots.txt. I'm not sure why: other crawlers don't seem to have a problem, and I can access the robots.txt file from my browser. For some additional info, it's a Squarespace site and my DNS is handled through Cloudflare. Here are the contents of my robots.txt file:
    # Squarespace Robots Txt
    User-agent: GPTBot
    User-agent: ChatGPT-User
    User-agent: CCBot
    User-agent: anthropic-ai
    User-agent: Google-Extended
    User-agent: FacebookBot
    User-agent: Claude-Web
    User-agent: cohere-ai
    User-agent: PerplexityBot
    User-agent: Applebot-Extended
    User-agent: AdsBot-Google
    User-agent: AdsBot-Google-Mobile
    User-agent: AdsBot-Google-Mobile-Apps
    User-agent: *
    Disallow: /config
    Disallow: /search
    Disallow: /account$
    Disallow: /account/
    Disallow: /commerce/digital-download/
    Disallow: /api/
    Allow: /api/ui-extensions/
    Disallow: /static/
    Disallow:/*?author=*
    Disallow:/*&author=*
    Disallow:/*?tag=*
    Disallow:/*&tag=*
    Disallow:/*?month=*
    Disallow:/*&month=*
    Disallow:/*?view=*
    Disallow:/*&view=*
    Disallow:/*?format=json
    Disallow:/*&format=json
    Disallow:/*?format=page-context
    Disallow:/*&format=page-context
    Disallow:/*?format=main-content
    Disallow:/*&format=main-content
    Disallow:/*?format=json-pretty
    Disallow:/*&format=json-pretty
    Disallow:/*?format=ical
    Disallow:/*&format=ical
    Disallow:/*?reversePaginate=*
    Disallow:/*&reversePaginate=*
Any ideas?
Getting Started | | andrewrench0 -
GoogleBot still crawling HTTP/1.1 years after website moved to HTTP/2
The whole website moved to the https://www. HTTP/2 version 3 years ago. When we review log files, it is clear that, for the home page, GoogleBot continues to access only via the HTTP/1.1 protocol.
* Robots file is correct (simply allowing all and referring to the https://www. sitemap)
* Sitemap is referencing https://www. pages, including the homepage
* Hosting provider has confirmed the server is correctly configured to support HTTP/2 and provided evidence of access via HTTP/2 working
* 301 redirects are set up for the non-secure and non-www versions of the website, all to the https://www. version
* Not using a CDN or proxy
* GSC reports the home page as correctly indexed (with the https://www. version canonicalised) but still has the non-secure version of the website as the referring page in the Discovery section. GSC also reports the homepage as being crawled every day or so.
We totally understand it can take time to update the index, but we are at a complete loss to understand why GoogleBot continues to go only through HTTP/1.1 and not HTTP/2.
A possibly related issue, and of course what is causing concern, is that new pages of the site seem to index and perform well in the SERPs, except the home page. It never makes it to page 1 (other than for the brand name) despite rating multiples higher in terms of content, speed, etc. than other pages, which still get indexed in preference to the home page. Any thoughts, further tests, ideas, direction or anything will be much appreciated!
Technical SEO | | AKCAC1 -
How Can I influence the Google Selected Canonical
Our company recently rebranded and launched a new website. The website was developed by an overseas team and they created the test site on their subdomain. The only problem is that Google crawled and indexed their site and ours. I noticed Google indexed their subdomain ahead of our domain and, based on Search Console, it has deemed our content a duplicate of theirs and selected theirs as the canonical.
The website in question is https://www.spaziointerni.us
What would be the best course of action to get our content ranked and selected instead of being marked as the duplicate?
Not sure if I have to modify the content to make it more unique or have them submit a removal in their Search Console.
Our indexed pages continue to go down due to this issue.
Any help is greatly appreciated.
Community | | Spaziohouston1 -
Unsolved Moz crawler not crawling on my site
Hi all, I'm facing an issue where the Moz crawler is unable to crawl my site. The following error keeps showing: "Our crawler was banned by a page on your site, either through your robots.txt, the X-Robots-Tag HTTP header, or the meta robots tag." This is my robots.txt file: https://www.wearefutureheads.com/robots.txt I'm not sure what else I'm missing. Can anyone help?
Product Support | | teikh0 -
Can't get Google to index our site although all seems very good
Hi there, I am having issues getting our new site, https://vintners.co, indexed by Google, although it seems all technical and content requirements are well in place. In the past, I had far poorer websites, with very bad setups and performance, indexed faster. What concerns me, among other things, is that according to Google Search Console, Google's crawler visits from time to time but does not seem to make progress or even follow any links, and the progression does not match what Google describes in the GSC help. For instance, our sitemap.xml was submitted, and for a few days it seemed to have an impact, as many pages were then visible in the coverage report, shown as "detected but not yet indexed". Now they have disappeared from the coverage report, as if they were no longer detected. Does anybody have any advice to speed up the indexing of a new website like ours? It was launched almost two months ago and I was expecting, at least on some core keywords, to get indexed quickly.
Technical SEO | | rolandvintners1 -
Google Search Console - Excluded Pages and Multiple Properties
I have used Moz to identify keywords that are ideal for my website and then I optimized different pages for those keywords, but unfortunately rankings for some of the pages have declined. Since I am working with an ecommerce site, I read that having a lot of Excluded pages on the Google Search Console was to be expected so I initially ignored them. However, some of the pages I was trying to optimize are listed there, especially under the 'Crawled - currently not indexed' and the 'Discovered - currently not indexed' sections. I have read this page (link: https://moz.com/blog/crawled-currently-not-indexed-coverage-status ) and plan on focusing on Steps 5 & 7, but wanted to ask if anyone else has had experience with these issues. Also, does anyone know if having multiple properties (https vs http, www vs no www) can negatively affect a site? For example, could a sitemap from one property overwrite another? Would removing one property from the Console have any negative impact on the site? I plan on asking these questions on a Google forum, but I wanted to add it to this post in case anyone here had any insights. Thank you very much for your time,
Forest
SEO Tactics | | ForestGT0 -
Unsolved Why did I stop ranking on a keyword and how will I rank on it again?
I often see in my campaigns that keywords which ranked between spots 1 and 5 on the SERP for a given page stop ranking for that page, dropping the website to the 5th page of Google or worse. I also see that the keyword is no longer linked to a page. What causes this, and how can I prevent it from happening in the future?
Moz Pro | | Ginovdw0 -
Block or remove pages using a robots.txt
I want to use robots.txt to prevent Googlebot from accessing a specific folder on the server. Please tell me if the syntax below is correct:
    User-Agent: Googlebot
    Disallow: /folder/
I also want to use robots.txt to prevent Google Images from indexing the images on my website. Please tell me if the syntax below is correct:
    User-agent: Googlebot-Image
    Disallow: /
Technical SEO | | semer0
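
One rough way to sanity-check the two rule groups in the question above is to run each through a parser and ask what it would allow. The sketch below assumes a placeholder host (example.com) and hypothetical file names, and checks each group in isolation, because Python's urllib.robotparser chooses between user-agent groups differently from Google's crawler. It can show that the syntax parses as intended, not how Google will ultimately treat the site.

```python
from urllib.robotparser import RobotFileParser

# The two rule groups from the question, kept as separate snippets on purpose.
FOLDER_RULES = """\
User-Agent: Googlebot
Disallow: /folder/
"""

IMAGE_RULES = """\
User-agent: Googlebot-Image
Disallow: /
"""


def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Parse `robots_txt` and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Hypothetical URLs used only to exercise the rules.
    print(can_fetch(FOLDER_RULES, "Googlebot", "https://example.com/folder/page.html"))
    print(can_fetch(FOLDER_RULES, "Googlebot", "https://example.com/other.html"))
    print(can_fetch(IMAGE_RULES, "Googlebot-Image", "https://example.com/images/a.png"))
```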