Can I use a "no index, follow" command in a robot.txt file for a certain parameter on a domain?
-
I have a site that produces thousands of pages via file uploads. These pages are then linked to by users for others to download what they have uploaded.
Naturally, the client has blocked the parameter which precedes these pages in an attempt to keep them from being indexed. What they did not consider was that these pages are attracting hundreds of thousands of links that are not passing any authority to the main domain, because they're being blocked in robots.txt.
Can I allow Google to follow, but NOT index, these pages via the robots.txt file, or would this have to be done on a page-by-page basis?
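(For reference, a parameter block like the one described typically looks something like the sketch below; the /download path and the id parameter are hypothetical stand-ins, since the actual URLs aren't given in the question.)

```text
# Hypothetical example of the client's current robots.txt:
# every upload page is reached via a parameter, e.g. example.com/download?id=12345
User-agent: *
Disallow: /*?id=
```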
-
Since you have those pages blocked via robots.txt, the bots will, in theory, never even crawl them, which means a "noindex, follow" directive on those pages isn't helping.
Also, if you run a report on the domain in Open Site Explorer and dig in, you should be able to find tons of those links already showing up. So if my site links to a page on your site, that page may not be cached or indexed because of the robots.txt exclusion, but as long as the link on my site is followed, your domain is still getting credit for the link.
Does that make sense?
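To illustrate the general approach (this is a sketch, not something stated in the original thread): if the goal is for those pages to be crawled, so the links on them can be followed, while still staying out of the index, the robots.txt Disallow would be removed and the directive placed on the pages themselves.

```html
<!-- On each hypothetical upload page (e.g. example.com/download?id=12345),
     once the robots.txt block has been removed so crawlers can actually fetch it: -->
<meta name="robots" content="noindex, follow">
```

For the uploaded files themselves (PDFs, zips, and other non-HTML responses), the same directive can be sent as an X-Robots-Tag HTTP response header instead, since those files have no HTML head to carry a meta tag.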
-
Answered my own question.
Related Questions
-
Insane loss of traffic and indexed pages after the June Core Update, what can I do to bring it back?
Hello everybody! After the June Core Update was released, we saw an insane drop in traffic, revenue, and indexed pages in GSC (image attached below). The biggest problem: the pages that dropped out of the index are reported as "Blocked by robots.txt", and when we run the "Fetch as Google" tool it says "Crawl Anomaly", even though our robots.txt is completely clean (no disallow or noindex rules). So I strongly believe this error pattern is showing up because of the June Core Update. I've come up with some solutions, but none of them seems to work:
1. Add hreflang to the domain. We have other sites in other countries, and ours seems to be the only one without this tag. The June update was primarily meant to limit the SERPs to two results per domain (or more if Google thinks it's relevant), so maybe the other sites have "taken our spot" in the SERPs; our domain is considerably newer than the other countries' domains.
2. Manually index all the important pages that were lost. The idea was to refresh the content on each page (title, meta description, paragraphs, and so on) and use the manual GSC indexing tool. But none of that seems to work either; all it says is "Crawl Anomaly".
3. Create a new domain. If nothing works, this should. We would look for a new domain name and treat it as a whole new site. (Frankly, there should be some other way out; this is an extreme measure, only if nobody can help us.)
I'm open to ideas, and as the days go by, our organic revenue and traffic don't seem to be coming back. I'm desperate for a solution.
-
Can cross-domain canonicals help with international SEO when using ccTLDs?
Hello. My question is: can cross-domain canonicals help with international SEO when using ccTLDs and a gTLD, when the gTLD is much more authoritative to begin with? I appreciate this is a very nuanced subject, so below is a detailed explanation of my current approach, the problem, and the proposed solutions I am considering testing. Thanks for taking the time to read this far!
The current setup
We have multiple TLDs, such as mysite.com (US), mysite.fr (FR), and mysite.de (DE). Each TLD can have multiple languages; indeed, each site has content in English as well as the native language, so mysite.fr (which defaults to French) and mysite.fr/en-fr serve the same page, but in English. Mysite.com is an older and more established domain with existing organic traffic. Each language variant of each domain has a sitemap that is individually submitted to Google Search Console and linked from each page. So: mysite.fr/a-propos (about us) links to mysite.com/sitemap.xml, which contains URL blocks for every page of the ccTLD that exists in French; each of these URL blocks contains hreflang info for that content on every ccTLD in every language (en-us, en-fr, de-de, en-de, etc.). Likewise, mysite.fr/en-fr/about-us links to mysite.com/en-fr/sitemap.xml, which contains URL blocks for every page of the ccTLD that exists in English, again with hreflang info for every ccTLD and language. There is more English content on the site as a whole, so the English version of the sitemap is always bigger at the moment. Every page on every site has two lists of links in the footer. The first lists every other ccTLD available, so a user can easily switch between the French site and the German site; where possible this links directly to the corresponding piece of content on the alternative ccTLD, and where it isn't possible it just links to the homepage. The second is essentially just links to the same piece of content in the other languages available on that domain. Mysite.com has its international targeting in Google Search Console set to the US.
The problems
The biggest problem is that we didn't properly consider how we would need to start from scratch with each new ccTLD, so although each domain has a reasonable amount of content, they receive only a tiny proportion of the traffic that mysite.com achieves. Presumably this is because of a standing start with regards to domain authority. The second problem is that, despite hreflang, mysite.com still outranks the other ccTLDs for brand-name keywords. I guess this is understandable given the mismatch in DA. This is based on looking at search results via the Google AdWords Ad Preview tool and changing language, location, and domain.
Solutions
The first solution is probably the most obvious: move all the ccTLDs into a subfolder structure on mysite.com and 301 all the old ccTLD links. This isn't really an ideal solution for a number of reasons, so I'm trying to explore alternative routes that might help. The first thing that came to mind was to use cross-domain canonicals: essentially this would mean creating locale-specific subfolders on mysite.com and duplicating the ccTLD sites in there, but using a cross-domain canonical to tell Google to index the ccTLD URL instead of the locale-subfolder URL. For example: mysite.com/fr-fr has a canonical of mysite.fr, and mysite.com/fr-fr/a-propos has a canonical of mysite.fr/a-propos. Then I would change the links in the mysite.com footer so that they pointed not at the ccTLD URLs but at the sub-folder URLs, so that Google would crawl the content on the stronger domain before indexing the ccTLD version of each URL. Is this worth exploring with a test, or am I mad for even considering it? The alternative that came to mind was to do essentially the same thing but use a 301 to redirect from mysite.com/fr-fr to mysite.fr. My question is whether either of these suggestions might be worth testing, or am I completely barking up the wrong tree and liable to do more harm than good?
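(To visualize the cross-domain canonical idea with the URLs from the question, the duplicated subfolder copies would carry something like the sketch below. This is only an illustration of the proposal, not a recommendation, and the https protocol is assumed.)

```html
<!-- On the duplicated subfolder page mysite.com/fr-fr/a-propos -->
<link rel="canonical" href="https://mysite.fr/a-propos">

<!-- On the duplicated locale root mysite.com/fr-fr -->
<link rel="canonical" href="https://mysite.fr/">
```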
Can adding "noindex" help with quality penalizations?
Hello Moz fellows, I have another question about content quality and Panda-related penalization. If I have an entire section of my site that has been penalized due to thin content, can adding "noindex, follow" to all pages in that section help de-penalize the rest of the site in the short term, while we work on improving those penalized pages (which is going to take a long time)? In other words, can that be considered a short-term solution to improve the overall site's standing in Google's index, removing the "noindex" tag once the pages are ready? I am eager to know your thoughts on this possible strategy. Thank you in advance to everyone!
-
URLs with parameters + canonicals + meta robots
Hi Moz community! I'm posting a new question here as I couldn't find a specific answer to the case I'm facing. Along with canonical tags, we are implementing meta robots on our pages (an e-commerce website with thousands of pages). Most cases have been covered, but I still have one unanswered: our products are linked from list pages (mostly categories), but those links almost always include a tracking parameter (i.e. /my-product.html?ref=xxx). Product URLs are secured with a canonical tag (referring only to the clean URL /my-product.html), but what would be the best solution regarding the meta robots? For now we opted for a meta robots 'noindex, follow' on the non-canonical URLs (the ones unfortunately linked from our category/list pages), but I'm afraid that it could hurt our SEO (apparently no juice is passed from URLs carrying a noindex), and maybe even prevent bots from crawling our website properly... Would it be best to have no meta robots at all on these product URLs with parameters? (We obviously can't have 'index, follow' when the canonical points to another URL!) Thanks for your help!
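(To make the setup concrete, the parametered product URL described above would carry markup along these lines. This is a sketch using the paths from the question; the domain is a hypothetical placeholder, and whether to keep the noindex line is exactly what is being asked.)

```html
<!-- On /my-product.html?ref=xxx, as linked from the category/list pages -->
<link rel="canonical" href="https://www.example.com/my-product.html">
<!-- The directive currently in place on these non-canonical URLs: -->
<meta name="robots" content="noindex, follow">
```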
-
Can I Use Multiple rel="alternate" Tags on Multiple Domains With the Same Language?
Hoping someone can answer this for me, as I have spent a ton of time researching with no luck... Is there anything misleading or wrong with using multiple rel="alternate" tags on a single webpage to reference multiple alternate versions? We currently use this tag to specify a mobile-equivalent page (the mobile site is served on an m. domain), but we would like to expand so that we can cover another domain for desktop (and possibly mobile in the future). In essence, the main domain would get the rel="alternate" tags, and the "other domain" would then use a canonical to point back to the main site. To clarify, this implementation idea is for an e-commerce site that maintains the same product line across two domains: one is homogeneous with furniture and home decor, which is a subset of the products on our "main" domain that includes lighting, furniture, and home decor. Any feedback or guidance is greatly appreciated! Thanks!
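(As a sketch of the markup being described: the domains, paths, and the 640px breakpoint are hypothetical placeholders, since the actual sites aren't named, and the bare desktop-to-desktop alternate is shown only because it is what the question proposes.)

```html
<!-- On the main-domain desktop page -->
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="https://m.main-domain.com/table-lamps/">
<!-- The additional alternate pointing at the second desktop domain: -->
<link rel="alternate" href="https://www.other-domain.com/table-lamps/">

<!-- On the corresponding other-domain page, pointing back -->
<link rel="canonical" href="https://www.main-domain.com/table-lamps/">
```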
-
Use Canonical or Robots.txt for Map View URL without Backlink Potential
I have a Page X with lots of unique content. This page has a "Map view" option, which displays some of the info from Page X, but a lot is omitted. Questions: Should I add a canonical tag even though the Map View URL does not display much of the info from Page X, or should I add it to robots.txt, or use noindex, follow? (I don't see any backlinks coming to the Map View URL.) Should the Map View page have a unique H1, title tag, and meta description?
-
Block in robots.txt instead of using canonical?
When I use a canonical tag for pages that are variations of the same page, it basically means that I don't want Google to index this page. But at the same time, spiders will go ahead and crawl the page. Isn't this a waste of my crawl budget? Wouldn't it be better to just disallow the page in robots.txt and let Google focus on crawling the pages that I do want indexed? In other words, why should I ever use rel=canonical as opposed to simply disallowing in robots.txt?
-
Can a domain rank for a competitive term with no links?
Hi, I know that this topic has received a lot of attention recently (not all of it good), and I am not normally one to re-open a can of worms, but the whole 'Camper Mens Shoes' fiasco has got me thinking. If you're not familiar with the story, you can get the highlights of it here: http://martinmacdonald.net/the-curios-case-of-camper-shoes/ My question is this: say you had a domain (Domain A) that was ranking well for a competitive keyword and had a good backlink profile. If you used rel="canonical" on every page of Domain A to point to a duplicate site on a different domain (Domain B), would Domain B then rank well in place of Domain A? I know this probably doesn't have much practical use, but I am trying to get a better understanding of the effect of using rel="canonical". Would doing the above mean that Domain B would rank well without having any links pointing directly to it?