Robots.txt - What is the correct syntax?
-
Hello everyone
I have the following link:
http://mywebshop.dk/index.php?option=com_redshop&view=send_friend&pid=39&tmpl=component&Itemid=167
I want to prevent google from indiexing everything that is related to "view=send_friend"
The problem is that its giving me dublicate content, and the content of the links has no SEO value of any sort.
My problem is how i disallow it correctly via robots.txt
I tried this syntax:
Disallow: /view=send_friend/
However after doing a crawl on request the 200+ dublicate links that contains view=send_friend is still present in the CSV crawl report.
What is the correct syntax if i want to prevent google from indexing everything that is related to this kind of link?
-
I added your suggestion to robots.txt and requested a crawl again.
I only have 3 pages with dublicate page content now
So your suggestion seemes to have worked.
Thanks for your reply.. it worked!
-
you are right. misinterpreted the explanation. Apologies
-
Jarno,
The $ would suggest this parameter is always on the end of a URL. And within Henrik's example it's already somewhere in the middle of the URL.
-
Henrik,
i think you should be looking into something like this:
User-agent: Googlebot
Disallow: /*view=send_friend$hope this helps
Kind regards
Jarno
-
Hi Henrik,
I would suggest trying: Disallow: &view=send_friend
Optional you could try this without the & as I'm not sure this is always at the start of this parameter.Hope this helps!
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Blocking in Robots.txt and the re-indexing - DA effects?
I have two good high level DA sites that target the US (.com) and UK (.co.uk). The .com ranks well but is dormant from a commercial aspect - the .co.uk is the commercial focus and gets great traffic. Issue is the .com ranks for brand in the UK - I want the .co.uk to rank for brand in the UK. I can't 301 the .com as it will be used again in the near future. I want to block the .com in Robots.txt with a view to un-block it again when I need it. I don't think the DA would be affected as the links stay and the sites live (just not indexed) so when I unblock it should be fine - HOWEVER - my query is things like organic CTR data that Google records and other factors won't contribute to its value. Has anyone ever blocked and un-blocked and whats the affects pls? All answers greatly received - cheers GB
Technical SEO | | Bush_JSM0 -
Robots.txt blocking Addon Domains
I have this site as my primary domain: http://www.libertyresourcedirectory.com/ I don't want to give spiders access to the site at all so I tried to do a simple Disallow: / in the robots.txt. As a test I tried to crawl it with Screaming Frog afterwards and it didn't do anything. (Excellent.) However, there's a problem. In GWT, I got an alert that Google couldn't crawl ANY of my sites because of robots.txt issues. Changing the robots.txt on my primary domain, changed it for ALL my addon domains. (Ex. http://ethanglover.biz/ ) From a directory point of view, this makes sense, from a spider point of view, it doesn't. As a solution, I changed the robots.txt file back and added a robots meta tag to the primary domain. (noindex, nofollow). But this doesn't seem to be having any effect. As I understand it, the robots.txt takes priority. How can I separate all this out to allow domains to have different rules? I've tried uploading a separate robots.txt to the addon domain folders, but it's completely ignored. Even going to ethanglover.biz/robots.txt gave me the primary domain version of the file. (SERIOUSLY! I've tested this 100 times in many ways.) Has anyone experienced this? Am I in the twilight zone? Any known fixes? Thanks. Proof I'm not crazy in attached video. robotstxt_addon_domain.mp4
Technical SEO | | eglove0 -
HTTP Status showing up in opensiteexplorer top pages as blocked by robot.txt file
I am trying to find an answer to this question it has alot of url on this page with no data when i go into the data source and search for noindex or robot.txt but the site is visible in the search engines ?
Technical SEO | | ReSEOlve0 -
404 not found page appears as 200 success in Google Fetch. What to do to correct?
We have received messages in Google webmaster tools that there is an increase in soft 404 errors. When we check the URLs they send to the 404 not found page:
Technical SEO | | Madlena
For example, http://www.geographics.com/images/01904_S.jpg
redirects to http://www.geographics.com/404.shtml.
When we used fetch as Google, here is what we got: .
#1 Server Response: http://www.geographics.com/404.shtml
HTTP/1.1 200 OK Date: Thu, 26 Sep 2013 14:26:59 GMT
What is wrong and what shall we do? The soft 404 errors are mainly for images that no longer exist in the server. Thanks!0 -
Robots.txt
www.mywebsite.com**/details/**home-to-mome-4596 www.mywebsite.com**/details/**home-moving-4599 www.mywebsite.com**/details/**1-bedroom-apartment-4601 www.mywebsite.com**/details/**4-bedroom-apartment-4612 We have so many pages like this, we do not want to Google crawl this pages So we added the following code to Robots.txt User-agent: Googlebot Disallow: /details/ This code is correct?
Technical SEO | | iskq0 -
Robots.txt Showing in SERP Results
Currently doing a technical audit for a website and when I search "Site:website.com -www" the only result is website.com/robots.txt I was wondering if anyone else has come across this before -- or what this may mean from a technical audit standpoint. Thank you!
Technical SEO | | vectormedia0 -
Need Help With Robots.txt on Magento eCommerce Site
Hello, I am having difficulty getting my robots.txt file to be configured properly. I am getting error emails from Google products stating they can't view our products because they are being blocked, and this past week, in my SEO dashboard, the URL's receiving search traffic dropped by almost 40%. Is there anyone that can offer assistance on a good template robots.txt file I can use for a Magento eCommerce website? The one I am currently using was found at this site here: e-commercewebdesign.co.uk/blog/magento-seo/magento-robots-txt-seo.php - However, I am getting problems from Google now because of it. I searched and found this thread here: http://www.magentocommerce.com/wiki/multi-store_set_up/multiple_website_setup_with_different_document_roots#the_root_folder_robots.txt_file - But I felt like maybe I should get some additional help on properly configuring a robots for a Magento site. Thanks in advance for any help. Please, let me know if you need more info to provide assistance.
Technical SEO | | JerDoggMckoy0