Robots.txt - What is the correct syntax?

teleman

Hello everyone

I have the following link:

http://mywebshop.dk/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167

I want to prevent Google from indexing everything that is related to "view=send_friend".

The problem is that it is giving me duplicate content, and the content behind these links has no SEO value of any sort.

My problem is how to disallow it correctly via robots.txt.

I tried this syntax:

Disallow: /view=send_friend/

However, after requesting a new crawl, the 200+ duplicate links that contain view=send_friend are still present in the CSV crawl report.

What is the correct syntax if I want to prevent Google from indexing everything that is related to this kind of link?
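
One likely reason the rule above had no effect: a Disallow value without wildcards is compared against the start of the URL path, and this URL's path starts with /index.php, not /view=send_friend/. The sketch below illustrates that with Python's standard urllib.robotparser, which does plain prefix matching. It is not Google's parser and does not understand * or $. The "User-agent: *" line is an assumption, since the post shows only the Disallow line.

```python
from urllib.robotparser import RobotFileParser

URL = ("http://mywebshop.dk/index.php?option=com_redshop&view=send_friend"
       "&pid=39&tmpl=component&Itemid=167")

# The rule from the post, wrapped in a catch-all group (the group is assumed).
rules = [
    "User-agent: *",
    "Disallow: /view=send_friend/",
]

parser = RobotFileParser()
parser.parse(rules)

# True means "allowed to fetch": the rule never applies, because the path
# begins with /index.php rather than /view=send_friend/.
still_allowed = parser.can_fetch("Googlebot", URL)
print(still_allowed)
```

So under prefix matching the send_friend URLs stay crawlable, which would be consistent with the crawl report described above.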

teleman

I added your suggestion to robots.txt and requested a crawl again.

I only have 3 pages with duplicate page content now.

So your suggestion seems to have worked.

Thanks for your reply, it worked!

JarnoNijzing

You are right. I misinterpreted the explanation. Apologies.

Martijn_Scheijbeler

Jarno,

The $ would suggest that this parameter is always at the end of a URL, but in Henrik's example it is somewhere in the middle of the URL.
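
The effect of the trailing $ can be checked with a small matcher. This is a rough sketch, assuming the commonly documented Google-style semantics (a rule matches from the start of the path, * stands for any run of characters, and a trailing $ anchors the end of the URL). The function name rule_matches is hypothetical, and this is not Google's actual implementation.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Approximate Google-style matching of a robots.txt path pattern."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start, so an unanchored rule is a prefix match.
    return re.match(regex, path) is not None

# Path and query string of the URL from the original question.
PATH = ("/index.php?option=com_redshop&view=send_friend"
        "&pid=39&tmpl=component&Itemid=167")

with_dollar = rule_matches("/*view=send_friend$", PATH)
without_dollar = rule_matches("/*view=send_friend", PATH)
print(with_dollar, without_dollar)
```

Under these assumptions the rule ending in $ does not match, because the URL continues with &pid=39 and further parameters, while the same rule without $ does.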

JarnoNijzing

Henrik,

I think you should be looking into something like this:

User-agent: Googlebot
Disallow: /*view=send_friend$

Hope this helps.

Kind regards

Jarno

Martijn_Scheijbeler

Hi Henrik,

I would suggest trying: Disallow: &view=send_friend
Optionally, you could try this without the &, as I'm not sure it always appears at the start of this parameter.
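
The caveat about the & can be illustrated with a short sketch. When a parameter comes first in the query string it is preceded by ? rather than &, so a fragment that includes the & would miss it. The code assumes the rule is written with a leading wildcard, for example Disallow: /*view=send_friend, since Google's documentation expects rule paths to start with /. It approximates such a rule as a substring test. The second URL and the function name are hypothetical.

```python
def contains_fragment(path_and_query: str, fragment: str) -> bool:
    # Rough stand-in for a rule of the form /*<fragment> with no trailing $:
    # it matches when the fragment occurs anywhere after the leading slash.
    return fragment in path_and_query[1:]

# Shortened form of the URL from the question: parameter in the middle.
MIDDLE = "/index.php?option=com_redshop&view=send_friend&pid=39"
# Hypothetical URL where the parameter comes first in the query string.
FIRST = "/index.php?view=send_friend&pid=39"

amp_middle = contains_fragment(MIDDLE, "&view=send_friend")
amp_first = contains_fragment(FIRST, "&view=send_friend")
plain_middle = contains_fragment(MIDDLE, "view=send_friend")
plain_first = contains_fragment(FIRST, "view=send_friend")
print(amp_middle, amp_first, plain_middle, plain_first)
```

The version without the & covers both URLs, at the cost of also matching any longer parameter name that happens to end in view=send_friend.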

Hope this helps!
