Restricted by robots.txt: does this cause problems?
-
According to Webmaster Tools, I have around 1,500 links restricted by robots.txt. These are links to retailers' websites and affiliate links.
Is this the right approach? I thought it would affect the link juice. Or should I take the nofollow off the links that are restricted by the robots.txt file?
-
Hello Ocelot,
I am assuming you have a site that has affiliate links and you want to keep Google from crawling those affiliate links. If I am wrong, please let me know. Going forward with that assumption then...
That is one way to do it. So perhaps you first send all of those links through a redirect via a folder called /out/ or /links/ or whatever, and you have blocked that folder in the robots.txt file. Correct? If so, this is how many affiliate sites handle the situation.
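To check that a block like this behaves as intended, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `/out/` folder name and the example URLs are assumptions for illustration, not taken from the asker's site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block the redirect folder for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /out/
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules in ROBOTS_TXT allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one redirect link inside /out/, one normal content page.
redirect_allowed = is_crawlable("http://www.yoursite.com/out/retailer-123")
content_allowed = is_crawlable("http://www.yoursite.com/reviews/some-product")
```

With these rules, the redirect URL should come back as not crawlable while ordinary pages stay crawlable. This only shows how a compliant parser reads the file; it says nothing about how any particular search engine treats the blocked links.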
I would not rely on rel nofollow alone, though I would use that in addition to the robots.txt block.
There are many other ways to handle this. For instance, you could make all affiliate links JavaScript links instead of href links. Then you could put the JavaScript into a folder called /js/ or something like that, and block that in the robots.txt file. This works less and less now that Google Preview Bot seems to be ignoring the disallow statement in those situations.
You could make it all the same URL with a unique identifier of some sort that tells your database where to redirect the click. For example:
www.yoursite.com/outlink/mylink#123
or
www.yoursite.com/mylink?link-id=123
In which case you could then block /mylink in the robots.txt file and tell Google to ignore the link-ID parameter via Webmaster Tools.
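As a rough sketch of the query-string variant, the lookup behind such a URL could work like this in Python. The `DESTINATIONS` table stands in for the database, and the retailer URL and the `resolve_outlink` name are hypothetical.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hypothetical lookup table standing in for the database of affiliate links.
DESTINATIONS = {
    "123": "http://www.example-retailer.com/product?aff=yoursite",
}


def resolve_outlink(request_url: str) -> Optional[str]:
    """Return the redirect target for the link-id in request_url, or None."""
    query = parse_qs(urlparse(request_url).query)
    link_ids = query.get("link-id", [])
    if not link_ids:
        return None  # no link-id parameter supplied
    return DESTINATIONS.get(link_ids[0])  # None if the id is unknown


target = resolve_outlink("http://www.yoursite.com/mylink?link-id=123")
```

A real handler would answer with an HTTP redirect to `target` and fall back to an error page when the result is `None`. The sketch covers only the `?link-id=` form: a `#123` fragment is not sent to the server with the request, so that variant would have to be resolved in the browser instead.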
As you can see, there is more than one way to skin this cat. The problem is always going to be doing it without looking like you're trying to "fool" Google - because they WILL catch up with any tactic like that eventually.
Good luck!
Everett
-
From a coding perspective, applying the nofollow to the links is the best way to go.
With the robots.txt file, only the top-tier search engines respect the information it contains. Lesser-known bots or spammers might check your robots.txt file to see what you don't want listed, and that information gives them a starting point to look deeper into your site.
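For the nofollow approach, a minimal Python sketch of rendering an affiliate link with the attribute applied might look like this. The `affiliate_link` helper and the example URL are hypothetical, and it assumes links are generated in code rather than written by hand.

```python
from html import escape


def affiliate_link(href: str, text: str) -> str:
    """Build an anchor tag for an affiliate URL with rel="nofollow" set."""
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(href, quote=True),  # escape quotes and ampersands in the URL
        escape(text),              # escape the visible link text
    )


html_link = affiliate_link("/out/retailer-123", "Buy now")
```

Generating the links through one helper keeps the attribute consistent across all 1,500 or so links instead of relying on each template to remember it.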