Robots.txt - What is the correct syntax?
-
Hello everyone
I have the following link:
http://mywebshop.dk/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167
I want to prevent Google from indexing everything that is related to "view=send_friend".
The problem is that it is giving me duplicate content, and the content of these links has no SEO value of any sort.
My problem is how to disallow it correctly via robots.txt.
I tried this syntax:
Disallow: /view=send_friend/
However, after requesting a new crawl, the 200+ duplicate links that contain view=send_friend are still present in the CSV crawl report.
What is the correct syntax if I want to prevent Google from indexing everything that is related to this kind of link?
-
I added your suggestion to robots.txt and requested a crawl again.
I only have 3 pages with duplicate page content now.
So your suggestion seems to have worked.
Thanks for your reply. It worked!
-
You are right. I misinterpreted the explanation. Apologies.
-
Jarno,
The $ would suggest this parameter is always at the end of a URL, but in Henrik's example it is somewhere in the middle of the URL.
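The point about $ can be checked with a small script. This is a minimal sketch, assuming Google's documented wildcard semantics for robots.txt (* matches any run of characters, a trailing $ anchors the end of the URL, and a rule otherwise matches as a prefix of the path plus query string). The helper names rule_to_regex and is_blocked are made up for illustration and are not part of any library.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a regex (illustrative helper)."""
    anchored = rule.endswith("$")          # trailing '$' means "URL must end here"
    body = rule[:-1] if anchored else rule
    # '*' matches any sequence of characters; everything else is literal.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


def is_blocked(rule: str, path: str) -> bool:
    """True if the Disallow rule would match the given path + query string."""
    return rule_to_regex(rule).match(path) is not None


# Path and query string of the URL from the question.
path = "/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167"

for rule in ("/view=send_friend/", "/*view=send_friend$", "/*view=send_friend"):
    print(f"Disallow: {rule:<22} blocked={is_blocked(rule, path)}")
```

Under these assumptions only the last rule matches: the first requires the path to start with /view=send_friend/, and the second requires the URL to end right after send_friend.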
-
Henrik,
I think you should be looking into something like this:
User-agent: Googlebot
Disallow: /*view=send_friend$
Hope this helps.
Kind regards
Jarno
-
Hi Henrik,
I would suggest trying: Disallow: &view=send_friend
Optionally, you could try this without the &, as I'm not sure it always appears at the start of this parameter. Hope this helps!
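The difference the leading & makes can be sketched as follows. Both rules are written here in the wildcard form (/*...), which is an assumption and not the exact line suggested above, and the second URL is a hypothetical one in which view= is the first query parameter. The helper wildcard_rule_matches is made up for illustration.

```python
def wildcard_rule_matches(fragment: str, path: str) -> bool:
    """Emulate a rule of the form '/*<fragment>' with no trailing '$'."""
    # The rule needs a leading '/', then the fragment anywhere after it.
    return path.startswith("/") and fragment in path[1:]


# URL from the question: view= sits in the middle of the query string.
middle = "/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167"
# Hypothetical URL where view= is the first parameter, so no '&' precedes it.
first = "/index.php?view=send_friend&pid=39"

results = {
    (fragment, path): wildcard_rule_matches(fragment, path)
    for fragment in ("&view=send_friend", "view=send_friend")
    for path in (middle, first)
}

for (fragment, path), hit in results.items():
    print(f"/*{fragment:<18} {'blocks' if hit else 'misses'} {path}")
```

In this sketch the variant without the & covers both URLs, while the variant with the & misses the URL where view= comes directly after the ?.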