Is robots.txt case sensitive? Please suggest
-
Hi, I have seen a few URLs with duplicate titles in the HTML Improvements report.
Can I disallow one of the below URLs in robots.txt?
/store/Solar-Home-UPS-1KV-System/75652
/store/solar-home-ups-1kv-system/75652
If I disallow this:
Disallow: /store/Solar-Home-UPS-1KV-System/75652
will search engines still crawl /store/solar-home-ups-1kv-system/75652?
I'm a little confused about case sensitivity. Please suggest whether or not to go ahead with this in the robots.txt.
-
Hi, there is already some link equity on the duplicate URLs; what is going to happen to it?
-
Actually, you have just one option to keep them out of the index - the second one (the noindex meta tag). The first will still keep them in the index if Google can find them. I currently have roughly 27k URLs indexed that were blocked via robots.txt from the start (generated with a time-based parameter; yeah: ouch).
Those results do not usually appear in "normal" search but can be forced (currently you may try site:grimoires.de inurl:fakechecknr and show the omitted results to see the effect of that). So basically I'd advise against using robots.txt - it does not prevent indexing, only the crawling/reading of the page.
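For reference, the noindex option is a robots meta tag placed in the <head> of each duplicate page; a minimal example only (the 'follow' value is optional and simply tells bots they may still follow the links on the page):
<meta name="robots" content="noindex, follow">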
Regards
Nico
-
Hi Abdul,
Yes, it is case sensitive.
Bear in mind that you shouldn't have many pages like that.
The first thing you should do is eliminate those duplicate pages. If you can't eliminate them, you have two ways to ask Googlebot not to index them:
1- By robots.txt with a 'Disallow:' instruction (see the sketch below)
2- By a robots meta tag with 'noindex' in the <head> of the page
Hope it helps.
GR
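A minimal sketch of the robots.txt option, using the exact URL from the question:
User-agent: *
Disallow: /store/Solar-Home-UPS-1KV-System/75652
Because matching is case sensitive, this blocks crawling of the uppercase path only; /store/solar-home-ups-1kv-system/75652 would still be crawled.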
Related Questions
-
Block session id URLs with robots.txt
Hi, I would like to block all URLs with the parameter '?filter=' from being crawled by including them in the robots.txt. Which directive should I use:
User-agent: *
Disallow: ?filter=
or
User-agent: *
Disallow: /?filter=
In other words, is the forward slash in the beginning of the disallow directive necessary? Thanks!
Intermediate & Advanced SEO | Mat_C
-
What does Disallow: /french-wines/?* actually do - robots.txt
Hello Mozzers - Just wondering what this robots.txt instruction means: Disallow: /french-wines/?* Does it stop Googlebot crawling and indexing URLs in that "French Wines" folder - specifically the URLs that include a question mark? Would it stop the crawling of deeper folders - e.g. /french-wines/rhone-region/ that include a question mark in their URL? I think this has been done to block URLs containing query strings. Thanks, Luke
Intermediate & Advanced SEO | McTaggart
-
Robots.txt Allowed
Hello all, We want to block something that has the following at the end: http://www.domain.com/category/product/some+demo+-text-+example--writing+here So I was wondering if doing: /*example--writing+here would work?
Intermediate & Advanced SEO | ThomasHarvey
-
Ranking Suggestions
Hi there, I am currently working on this site: completesteamclean.ca, trying to rank as high as possible for "carpet cleaning windsor" on Google.ca, and I can't seem to make any headway against the other ranking websites. I've optimized the page, created social media accounts for activity, and even invested in Google AdWords and Facebook ads to try to boost my presence. Does anyone have any suggestions on tactics/revisions that may help boost my site?
Intermediate & Advanced SEO | MainstreamMktg
-
"noindex, follow" or "robots.txt" for thin content pages
Does anyone have any testing evidence what is better to use for pages with thin content, yet important pages to keep on a website? I am referring to content shared across multiple websites (such as e-commerce, real estate etc). Imagine a website with 300 high quality pages indexed and 5,000 thin product type pages, which are pages that would not generate relevant search traffic. Question goes: Does the interlinking value achieved by "noindex, follow" outweigh the negative of Google having to crawl all those "noindex" pages? With robots.txt one has Google's crawling focus on just the important pages that are indexed and that may give ranking a boost. Any experiments with insight to this would be great. I do get the story about "make the pages unique", "get customer reviews and comments" etc....but the above question is the important question here.
Intermediate & Advanced SEO | khi5
-
Can't find X-Robots tag!
Hi all. I've been checking out http://www.unthankbooks.com/ as it seems to have some indexing problems. I ran a server header check and got a 200 response. However, it also shows the following: X-Robots-Tag: noindex, nofollow. It's not in the page HTML though. Could it be being picked up from somewhere else?
Intermediate & Advanced SEO | Blink-SEO
-
XML Sitemap instruction in robots.txt = Worth doing?
Hi fellow SEO's, Just a quick one, I was reading a few guides on Bing Webmaster tools and found that you can use the robots.txt file to point crawlers/bots to your XML sitemap (they don't look for it by default). I was just wondering if it would be worth creating a robots.txt file purely for the purpose of pointing bots to the XML sitemap? I've submitted it manually to Google and Bing webmaster tools but I was thinking more for the other bots (I.e. Mozbot, the SEOmoz bot?). Any thoughts would be appreciated! 🙂 Regards, Ash
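For reference, the robots.txt line that points crawlers to an XML sitemap generally looks like the following (the URL here is only a placeholder):
Sitemap: https://www.example.com/sitemap.xml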
Intermediate & Advanced SEO | AshSEO2011
-
10,000 New Pages of New Content - Should I Block in Robots.txt?
I'm almost ready to launch a redesign of a client's website. The new site has over 10,000 new product pages, which contain unique product descriptions, but do feature some similar text to other products throughout the site. An example of the page similarities would be the following two products: Brown leather 2 seat sofa Brown leather 4 seat corner sofa Obviously, the products are different, but the pages feature very similar terms and phrases. I'm worried that the Panda update will mean that these pages are sand-boxed and/or penalised. Would you block the new pages? Add them gradually? What would you recommend in this situation?
Intermediate & Advanced SEO | cmaddison