Robots.txt help
-
Hi Moz Community,
Google is indexing some developer pages from a previous website where I currently work:
ddcblog.dev.examplewebsite.com/categories/sub-categories
How do I block these in a robots.txt file so they no longer appear on Google? Can I do it under our homepage's Google Webmaster Tools (GWT) account, or do I need a separate account set up for these URL types?
As always, your expertise is greatly appreciated,
-Reed
-
Once the robots.txt is in place, the OP can go back into GWT and request removal of the dev site from the index. Password-protecting a dev site is usually a pretty good idea, too.
-
Can you not just add an .htaccess password to the directory to keep the dev site up but keep bots out?
-
You'll want a separate account for that subdomain, and the robots.txt that blocks the subdomain needs to sit at the root of that subdomain itself.
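As a rough sketch of what that could look like: the snippet below holds the two-line robots.txt that would be served from the root of the dev subdomain, then checks it with Python's standard `urllib.robotparser`. The helper name `is_blocked` and the constant `DEV_ROBOTS_TXT` are made up for illustration, and the hostname is the one from the question. Note that the parser only looks at the path, so this checks the rules themselves, not where the file is hosted.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents of the robots.txt served at the root of the dev subdomain.
# "Disallow: /" asks every compliant crawler to stay out of the whole host.
DEV_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows `url` for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# URL taken from the original question.
dev_url = "http://ddcblog.dev.examplewebsite.com/categories/sub-categories"
print(is_blocked(DEV_ROBOTS_TXT, dev_url))  # True
```

The file only covers the host it is served from, which is why it has to live on the dev subdomain rather than on the main site.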
Related Questions
-
Teaser Content Help!!
I'm in the process of a redesign and upgrade to Drupal 8 and have used Drupal's taxonomy feature to add a fairly large database of Points of Interest, Services, etc. Initially this was just for a Map/Filter for site users. The developer also wants to use teasers from these content types (such as a scenic vista description) as a way to display the content on relevant pages (such as the scenic vistas page, as well as other relevant pages). Along with the content, it shows GPS coordinates and icons related to the description. In short, it looks cool, can be used in multiple relevant locations, and creates a great UX. However, many of these teasers would basically be pieces of content from pages with a lot of SEO value, like descriptive paragraphs about scenic viewpoints from the scenic viewpoints page. Below is an example of how the descriptions of the scenic viewpoints would be displayed on the scenic viewpoints pages, as well as other potential relevant pages. HOW WILL THIS AFFECT THE SEO VALUE OF THE CONTENT?? Thanks in advance for any help, I can't find an answer anywhere. About 250 words' worth of content about a scenic vista. There are about 8 scenic vista descriptions like this from the scenic vistas page, so a good chunk of valuable content. There are numerous long-form content pages like this that have descriptions and information about sites and points of interest that don't warrant having their own page. For more specific content with a dedicated page, I can just use the intro paragraph as a teaser and link to that specific page of content. Not sure what to do here.
Intermediate & Advanced SEO | | talltrees0 -
Divi Help!
I've added our phone number and email address in the header settings in Divi. For whatever reason, I can see them when I'm editing the header elements, but they don't show when I view the website. I cannot figure out what the issue is. I've never run into it before. Also, the menu on the live site looks different from what the header elements edit area shows.
Intermediate & Advanced SEO | | LindsayE0 -
Please help - Duplicate Content
Hi, I am really struggling to understand why my site has a lot of duplicate content issues. It's flagging up as ridiculously high and I have no idea how to fix this, can anyone help me, please? Website is www.firstcapitol.co.uk
Intermediate & Advanced SEO | | Alix_SEO1 -
My direct traffic went up and my organic traffic went down. Help!
So on Oct. 21, our direct traffic increased 3x and our organic traffic decreased 3x. And it has been that way ever since. Almost like they flip flopped. Additionally, that was the same day I started retargeting to our site. I have tagged all the links from the ads and they're being counted as google paid clicks in GA. And our accounts are linked. I am just dumbfounded as to how this could happen.
Intermediate & Advanced SEO | | Eric_OWPP1 -
Need help with Robots.txt
An eCommerce site built with Modx CMS. I found a lot of auto-generated duplicate pages on that site. Now I need to disallow some pages from that category. Here is what the actual product page URL looks like:
product_listing.php?cat=6857
And here is the auto-generated URL structure:
product_listing.php?cat=6857&cPath=dropship&size=19
Can anyone suggest how to disallow this specific category through robots.txt? I am not very familiar with Modx or this kind of link structure. Your help will be appreciated. Thanks
Intermediate & Advanced SEO | | Nahid1 -
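One possible rule for the two URLs above, offered only as a sketch: because robots.txt rules match by prefix, a `Disallow` line ending in `&` would block the auto-generated variants while leaving the plain category URL crawlable. This assumes `cat=6857` is always the first query parameter and that the script sits at the site root; neither is confirmed in the question. The helper `can_crawl` is a made-up name, and the check uses Python's standard `urllib.robotparser`, not Google's own matcher.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: block anything that starts with the category URL followed by "&",
# i.e. the auto-generated variants that carry extra parameters.
ROBOTS_TXT = """\
User-agent: *
Disallow: /product_listing.php?cat=6857&
"""


def can_crawl(robots_txt: str, path: str, user_agent: str = "Googlebot") -> bool:
    """Return True if `path` is allowed for `user_agent` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# The real product listing page stays crawlable.
print(can_crawl(ROBOTS_TXT, "/product_listing.php?cat=6857"))  # True
# The auto-generated duplicate is disallowed.
print(can_crawl(ROBOTS_TXT, "/product_listing.php?cat=6857&cPath=dropship&size=19"))  # False
```

Ending the rule with `&` also keeps it from catching other categories whose IDs merely start with the same digits, such as `cat=68570`.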
"noindex, follow" or "robots.txt" for thin content pages
Does anyone have any testing evidence of which is better to use for pages that have thin content but are important to keep on a website? I am referring to content shared across multiple websites (such as e-commerce, real estate, etc.). Imagine a website with 300 high-quality pages indexed and 5,000 thin product-type pages, which are pages that would not generate relevant search traffic. The question is: does the interlinking value achieved by "noindex, follow" outweigh the negative of Google having to crawl all those "noindex" pages? With robots.txt, Google's crawling focuses on just the important pages that are indexed, and that may give rankings a boost. Any experiments with insight into this would be great. I do get the story about "make the pages unique", "get customer reviews and comments", etc., but the above question is the important one here.
Intermediate & Advanced SEO | | khi50 -
.com ranked where .co.uk site should After Manual Penalty Revoked - Help!!!
Hi all, I wondered if someone could help me, as I am at my wits' end. Our website www.domain.co.uk was hit with a manual penalty on April 26th, 2012 for over-optimizing our inbound links. Nine reconsideration requests, over a year, and many removed links later, the penalty was revoked. Yay, I hear you cry! During the year .co.uk was banned, we built .com but did not build any links to it. The purpose of the .com site was to attract an American audience for our products. .com was hosted on a US server, with geo-targeting set to United States in WMT. So here is my problem: after the ban was revoked, we expected .co.uk to spring back to some reasonable positions. Nope, that is not the case. Google is now ranking our .com site where our .co.uk should be, in positions 1 to 10 for powerful keywords. .com has zero link equity and .co.uk's is very reasonable. So how can I rectify this balls-up and get .co.uk listed back where it should be? I am not bothered where .com ranks. Note: to the best of my knowledge there are NO cross-domain 301s or the like, only an image link between the two sites. I have posted this on the WMT forum and it has fallen on deaf ears! Help me, Moz members, you're my only hope! Thanks in advance, Richard. PS: If anyone would like the URLs in question, PM me and I will let you know.
Intermediate & Advanced SEO | | Tricky-400 -
Infinite Redirect Loop without trailing slash, please help
I've been searching for an answer all day and can't seem to figure this out. When I Fetch my blog as Google (http://www.mysite.com/blog) WITHOUT a trailing slash at the end, I get this error:
The page seems to redirect to itself. This may result in an infinite redirect loop
HTTP/1.1 301 Moved Permanently
When I Fetch my blog as Google WITH the trailing slash at the end (http://www.mysite.com/blog/), it is fine, without errors. When I pull it up in a browser, it comes up fine both with and without the trailing slash.
My .htaccess file in the root directory contains this:
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index.htm\ HTTP/
RewriteRule ^index.htm$ http://www.mysite.com/ [R=301,L]
RewriteCond %{HTTP_HOST} ^mysite.com$
RewriteRule ^(.*)$ http://www.mysite.com/$1 [R=301,L]
My .htaccess file in the blog directory contains this:
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /blog/
RewriteCond %{REQUEST_URI} ^.*/index.php/.* [NC]
RewriteRule ^index.php/(.*)$ http://www.mysite.com/blog/$1 [R=301,L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>
# END WordPress
Do I have something incorrectly coded in these .htaccess files that could be causing this? Or is there something else I should look at? Thank you for any help!!
Intermediate & Advanced SEO | | debc0