Robots.txt - What is the correct syntax?

teleman

Hello everyone

I have the following link:

http://mywebshop.dk/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167

I want to prevent Google from indexing everything that is related to "view=send_friend".

The problem is that it's giving me duplicate content, and the content of the links has no SEO value of any sort.

My problem is how to disallow it correctly via robots.txt.

I tried this syntax:

Disallow: /view=send_friend/

However, after requesting a crawl, the 200+ duplicate links that contain view=send_friend are still present in the CSV crawl report.

What is the correct syntax if I want to prevent Google from indexing everything that is related to this kind of link?

teleman

I added your suggestion to robots.txt and requested a crawl again.

I only have 3 pages with duplicate page content now.

So your suggestion seems to have worked.

Thanks for your reply. It worked!

JarnoNijzing

You are right. I misinterpreted the explanation. Apologies.

Martijn_Scheijbeler

Jarno,

The $ would mean this parameter is always at the end of a URL. In Henrik's example, it sits somewhere in the middle of the URL.
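
To illustrate the point about $, here is a minimal Python sketch of robots.txt wildcard matching as Google documents it: rules are matched from the start of the path, * matches any run of characters, and a trailing $ anchors the end of the URL. The helper names are hypothetical, and this only approximates the matching rules; it is not Googlebot's actual implementation.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a Google-style robots.txt path pattern into a regex (sketch)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, path_and_query: str) -> bool:
    # re.match anchors at the start, mirroring the prefix-matching rule.
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


# Path and query string of the link from the original question.
URL_PATH = (
    "/index.php?option=com_redshop&view=send_friend"
    "&pid=39&tmpl=component&Itemid=167"
)

for rule in ("/view=send_friend/", "/*view=send_friend$", "/*view=send_friend"):
    print(rule, "->", is_blocked(rule, URL_PATH))
```

Under these assumptions, only the last rule matches the link: the first fails because the path starts with /index.php, and the second fails because the URL continues after view=send_friend.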

JarnoNijzing

Henrik,

I think you should be looking into something like this:

User-agent: Googlebot
Disallow: /*view=send_friend$

Hope this helps.

Kind regards

Jarno

Martijn_Scheijbeler

Hi Henrik,

I would suggest trying: Disallow: &view=send_friend
Optionally, you could try this without the &, as I'm not sure the parameter always starts with one.
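
Whether the & belongs in the rule depends on where the parameter sits in the query string. The following Python sketch checks which separator precedes view=send_friend; the function name and the second URL are hypothetical, added only for illustration.

```python
from urllib.parse import parse_qsl, urlsplit


def separator_before_view(url: str):
    """Return '?' if view=send_friend is the first query parameter,
    '&' if it comes later, or None if it is absent."""
    params = parse_qsl(urlsplit(url).query)
    for position, (key, value) in enumerate(params):
        if key == "view" and value == "send_friend":
            return "?" if position == 0 else "&"
    return None


# The link from the original question: the parameter comes second.
original = (
    "http://mywebshop.dk/index.php?option=com_redshop&view=send_friend"
    "&pid=39&tmpl=component&Itemid=167"
)
# Hypothetical variant where the parameter comes first.
variant = "http://mywebshop.dk/index.php?view=send_friend&pid=39"

print(separator_before_view(original))
print(separator_before_view(variant))
```

A rule that includes the & would miss URLs like the hypothetical variant, where the parameter follows the ? directly, which is the reason for also trying the rule without it.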

Hope this helps!
