Robots.txt - What is the correct syntax?

teleman

Hello everyone

I have the following link:

http://mywebshop.dk/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167

I want to prevent Google from indexing everything related to "view=send_friend".

The problem is that it's giving me duplicate content, and the content behind these links has no SEO value of any sort.

My problem is how to disallow it correctly via robots.txt.

I tried this syntax:

Disallow: /view=send_friend/

However, after requesting a new crawl, the 200+ duplicate links that contain view=send_friend are still present in the CSV crawl report.

What is the correct syntax if I want to prevent Google from indexing everything related to this kind of link?

teleman

I added your suggestion to robots.txt and requested a crawl again.

I only have 3 pages with duplicate page content now, so your suggestion seems to have worked.

Thanks for your reply!

JarnoNijzing

You are right. I misinterpreted the explanation. Apologies.

Martijn_Scheijbeler

Jarno,

The $ anchors the rule to the end of the URL, so it would only match when this parameter is the very last thing in the URL. In Henrik's example the parameter sits in the middle of the URL, so the rule would not match.
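
To make that concrete, here is a rough Python sketch of the matching. It assumes Google's documented pattern semantics (a rule is matched from the start of the path, * matches any run of characters, and a trailing $ anchors the end of the URL). The rule_matches helper is made up for this post and is not part of any crawler or library, so treat it as an illustration only.

```python
import re
from urllib.parse import urlsplit


def rule_matches(pattern: str, url: str) -> bool:
    """Hypothetical helper: test a Disallow pattern against a URL.

    Assumes Google-style semantics: match from the start of the path,
    '*' is a wildcard, a trailing '$' anchors the end of the URL.
    """
    parts = urlsplit(url)
    # Rules are compared against the path plus the query string.
    target = parts.path + ("?" + parts.query if parts.query else "")

    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]

    # Escape the literal pieces and turn each '*' into '.*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"

    # re.match only matches at the start of the string (prefix match).
    return re.match(regex, target) is not None


url = ("http://mywebshop.dk/index.php?option=com_redshop"
       "&view=send_friend&pid=39&tmpl=component&Itemid=167")

print(rule_matches("/view=send_friend/", url))   # rule originally tried
print(rule_matches("/*view=send_friend$", url))  # anchored to the end
print(rule_matches("/*view=send_friend", url))   # wildcard, no anchor
```

Under these assumptions only the last rule matches Henrik's URL: the first fails because the path starts with /index.php, and the second fails because the URL continues after view=send_friend.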

JarnoNijzing

Henrik,

I think you should look into something like this:

User-agent: Googlebot
Disallow: /*view=send_friend$

Hope this helps.

Kind regards

Jarno

Martijn_Scheijbeler

Hi Henrik,

I would suggest trying: Disallow: &view=send_friend
Optionally, you could try this without the &, as I'm not sure the parameter is always preceded by one.
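
A quick sketch of why the & can matter. As far as I know, Google's documentation describes rule paths as starting with /, so the sketch uses the wildcard forms /*&view=send_friend and /*view=send_friend rather than the bare parameter. The helper name and the second URL (with view as the first parameter) are assumptions made up for illustration.

```python
from urllib.parse import urlsplit


def blocked_by_contains_rule(rule: str, url: str) -> bool:
    """Hypothetical helper for rules of the form '/*<text>' only.

    Such a rule matches any URL whose path + query starts with '/'
    and contains <text> somewhere after that first slash.
    """
    if not rule.startswith("/*") or "*" in rule[2:] or rule.endswith("$"):
        raise ValueError("this sketch only supports rules like '/*text'")
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return target.startswith("/") and rule[2:] in target[1:]


# URL from the question: the parameter is in the middle, preceded by '&'.
middle = ("http://mywebshop.dk/index.php?option=com_redshop"
          "&view=send_friend&pid=39&tmpl=component&Itemid=167")
# Made-up URL where the parameter comes first, preceded by '?'.
first = "http://mywebshop.dk/index.php?view=send_friend&pid=39"

for rule in ("/*&view=send_friend", "/*view=send_friend"):
    print(rule, blocked_by_contains_rule(rule, middle),
          blocked_by_contains_rule(rule, first))
```

In this sketch the variant with & misses the URL where view is the first parameter, while the variant without & catches both.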

Hope this helps!
