Should I robots block this directory?
-
There's about 43k pages indexed in this directory, and while helpful to end users, I don't see it being a great source of unique content for search engines.
Would you robots block or meta noindex nofollow these pages in the /blissindex/ directory?
ie.
http://www.careerbliss.com/blissindex/petsmart-index-980481/
http://www.careerbliss.com/blissindex/att-index-1043730/
http://www.careerbliss.com/blissindex/facebook-index-996632/
-
Totally agree with Ryan Kent. You should write a paragraph of content that is unique to the company featured. The chart is not unique enough and you will get flagged as having a high ratio of duplicate content. You should also look at all the other SEO elements on this page, understand what keyphrases you are targeting and modify the title, meta and H1 tags.
-
Should I robots block this directory?
I wouldn't.
Robots.txt in general should only be used when there is no other alternate means available to block content. An example is when your site is created by a CMS or e-commerce platform which does not offer the flexibility to noindex individual pages.
By blocking your site's content, you are preventing search engines not only from indexing the pages, but from following any links on those pages. You are restricting the way a crawler can travel on your site, which is generally a bad idea.
Additionally, I would suggest those pages offer value. "Petco salary comparison", "Target wages" and other search queries could generate results for those pages. Those pages contain helpful information which is otherwise not easily found on the internet. If that was my site, I would work to improve the optimization of those pages, not block them.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
SEO Implications of firewalls that block "foreign connections"
Hello! A client's IT security team has firewalls on the site with GEO blocking enabled. This is to prevent foreign connections to applications as part of a contractual agreements with their own clients. Does anyone have any experience with workarounds for this? Thank you!
Intermediate & Advanced SEO | | SimpleSearch0 -
Robots.txt, Disallow & Indexed-Pages..
Hi guys, hope you're well. I have a problem with my new website. I have 3 pages with the same content: http://example.examples.com/brand/brand1 (good page) http://example.examples.com/brand/brand1?show=false http://example.examples.com/brand/brand1?show=true The good page has rel=canonical & it is the only page should be appear in Search results but Google has indexed 3 pages... I don't know how should do now, but, i am thinking 2 posibilites: Remove filters (true, false) and leave only the good page and show 404 page for others pages. Update robots.txt with disallow for these parameters & remove those URL's manually Thank you so much!
Intermediate & Advanced SEO | | thekiller990 -
Not sure how we're blocking homepage in robots.txt; meta description not shown
Hi folks! We had a question come in from a client who needs assistance with their robots.txt file. Metadata for their homepage and select other pages isn't appearing in SERPs. Instead they get the usual message "A description for this result is not available because of this site's robots.txt – learn more". At first glance, we're not seeing the homepage or these other pages as being blocked by their robots.txt file: http://www.t2tea.com/robots.txt. Does anyone see what we can't? Any thoughts are massively appreciated! P.S. They used wildcards to ensure the rules were applied for all locale subdirectories, e.g. /en/au/, /en/us/, etc.
Intermediate & Advanced SEO | | SearchDeploy0 -
Robots.txt for Facet Results
Hi Does anyone know how to properly add facets URL's to Robots txt? E.g. of our facets URL - http://www.key.co.uk/en/key/platform-trolleys-trucks#facet:-10028265807368&productBeginIndex:0&orderBy:5&pageView:list& Everything after the # will need to be blocked on all pages with a facet. Thank you
Intermediate & Advanced SEO | | BeckyKey0 -
Robots.txt: Can you put a /* wildcard in the middle of a URL?
We have noticed that Google is indexing the language/country directory versions of directories we have disallowed in our robots.txt. For example: Disallow: /images/ is blocked just fine However, once you add our /en/uk/ directory in front of it, there are dozens of pages indexed. The question is: Can I put a wildcard in the middle of the string, ex. /en/*/images/, or do I need to list out every single country for every language in the robots file. Anyone know of any workarounds?
Intermediate & Advanced SEO | | IHSwebsite0 -
Googlebot Can't Access My Sites After I Repair My Robots File
Hello Mozzers, A colleague and I have been collectively managing about 12 brands for the past several months and we have recently received a number of messages in the sites' webmaster tools instructing us that 'Googlebot was not able to access our site due to some errors with our robots.txt file' My colleague and I, in turn, created new robots.txt files with the intention of preventing the spider from crawling our 'cgi-bin' directory as follows: User-agent: * Disallow: /cgi-bin/ After creating the robots and manually re-submitting it in Webmaster Tools (and receiving the green checkbox), I received the same message about Googlebot not being able to access the site, only difference being that this time it was for a different site that I manage. I repeated the process and everything, aesthetically looked correct, however, I continued receiving these messages for each of the other sites I manage on a daily-basis for roughly a 10-day period. Do any of you know why I may be receiving this error? is it not possible for me to block the Googlebot from crawling the 'cgi-bin'? Any and all advice/insight is very much welcome, I hope I'm being descriptive enough!
Intermediate & Advanced SEO | | NiallSmith1 -
How can I block unwanted urls being indexed on google?
Hi, I have to block unwanted urls (not that page) from being indexed on google. I have to block urls like example.com/entertainment not the exact page example.com/entertainment.aspx . Is there any other ways other than robot.txt? If i add this to robot.txt will that block my other url too? Or should I make a 301 redirection from example.com/entertainment to example.com/entertainment.aspx. Because some of the unwanted urls are linked from other sites. thanks in advance.
Intermediate & Advanced SEO | | VipinLouka780 -
Large Site SEO - Dev Issue Forcing URL Change - 301, 302, Block, What To Do?
Hola, Thanks in advance for reading and trying to help me out. A client of mine recently created a large scale company directory (500k+ pages) in Drupal v6 while the "marketing" type pages of their site was still in manual hard-coded HTML. They redesigned their "marketing" pages, but used Drual v7. They're now experiencing server conflicts with both instances of Drupal not allowing them to communicate/be on the same server. Eventually the directory will be upgraded to Drupal v7, but could take weeks to months the client does not want to wait for the re-launch. The client wants to push the new marketing site live, but also does not want to ruin the overall SEO value of the directory and have a few options, but I'm looking to help guide them down the path of least resistance: Option 1: Move the company directory onto a subdomain and the "marketing site" on the www. subdomain. Client gets to push their redesign live, but large scale 301s to the directory cause major issues in terms of shaking up the structure of the site causing ripple effects into getting pulled out of the index for days to weeks. Rankings and traffic drop, subdomain authority gets lost and the company directory health looks bad for weeks to months. However, 301 maintains partial SEO value and some long tail traffic still exists. Once the directory gets moved to Drupal v7, the directory will then cancel the 301 to the subdomain and revert back to original www. subdomain URLs Option 2: Block the company directory from search engines with robots.txt and meta instructions, essentially cutting off the floodgates from the established marketing pages. No major scaling 301 ripple effect, directory takes a few weeks to filter out of the index, traffic is completely lost, however once drupal v7 gets upgraded and the directory is then re-opened, directory will then slowly gain back SEO value to get close to old rankings, traffic, etc. Option 3: 302 redirect? Lose all accumulate SEO value temporarily... hmm Option 4: Something else? As you can see, this is not an ideal situation. However, a decision has to be made and I'm looking to chose the lesser of evils. Any help is greatly appreciated. Thanks again -Chris
Intermediate & Advanced SEO | | Bacon0