Search Engine blocked by robots.txt
-
I do that because I am using Joomla. Is that bad? Thanks.
-
Yeah, the structure of the site can get confusing.
-
Oh yeah, this is good for query strings; crawlers will not crawl the non-SEF URLs.
So you are good.
-
Yes, I have that.
Then I put:
# Block all query strings
Disallow: /*?
If I don't put that, the crawler indexes every page of the site twice.
Example: I now have 400 pages indexed. If I remove that rule, it will index about 800.
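To make the effect of that rule concrete, here is a minimal Python sketch of how a pattern such as `/*?` is matched against a URL. It assumes the crawler supports the `*` wildcard and the trailing `$` anchor the way Google documents them. The helper names and the sample paths are made up for illustration; they are not taken from the poster's site.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL, and anything else is a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(pattern: str, url_path: str) -> bool:
    """Return True if url_path (path plus query string) matches the pattern."""
    return robots_pattern_to_regex(pattern).match(url_path) is not None


if __name__ == "__main__":
    rule = "/*?"
    # Hypothetical paths: one non-SEF URL with a query string, two SEF URLs.
    for path in [
        "/index.php?option=com_content&view=article&id=12",
        "/about-us",
        "/blog/my-article",
    ]:
        print(path, "->", "blocked" if is_blocked(rule, path) else "allowed")
```

Under these assumptions only the URL containing `?` is blocked, which is why the duplicate non-SEF copies drop out while the SEF versions stay crawlable.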
-
This is a sample robots.txt from a Joomla site that I have:
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /libraries/
Disallow: /media/
Disallow: /plugins/
Disallow: /templates/
Disallow: /tmp/
Disallow: /xmlrpc/
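One way to check what a file like this actually blocks is to run it through Python's standard-library `urllib.robotparser`. The sketch below is only an illustration: the `www.example.com` URLs and the `can_crawl` helper are hypothetical. The standard-library parser matches plain path prefixes, so this example sticks to the directory rules and leaves wildcard rules out.

```python
from urllib.robotparser import RobotFileParser

# A shortened copy of the sample Joomla rules from the post above.
JOOMLA_ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /cache/
Disallow: /includes/
Disallow: /installation/
Disallow: /tmp/
"""


def can_crawl(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Parse robots_txt and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, used only to exercise the rules.
    for url in [
        "https://www.example.com/",
        "https://www.example.com/contact",
        "https://www.example.com/administrator/index.php",
        "https://www.example.com/tmp/file.txt",
    ]:
        verdict = "allowed" if can_crawl(JOOMLA_ROBOTS_TXT, url) else "blocked"
        print(url, "->", verdict)
```

With these rules the home page and ordinary content pages remain crawlable; only the listed system directories are blocked.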
-
I just put this in robots.txt:
# Block all query strings
Disallow: /*?
It tells crawlers not to index pages whose URL contains this string, so it doesn't generate duplicate pages.
Is that bad too?
Thanks.
Regards,
Gabo
-
If you need your website to be indexed and to rank, then yes, this is bad for your website.
It means that you don't want any search engine to index your website, so people won't find you in the search engines.