Application & understanding of robots.txt
-
Hello Moz World!
I have been reading up on robots.txt files, and I understand the basics. I am looking for a deeper understanding of when to deploy particular tags, and when a page should be disallowed because it will affect SEO. I have been working with a software company that has a News & Events page which I don't think should be indexed. It changes every week, and is only relevant to potential customers who want to book a demo or attend an event, not so much to search engines. My initial thinking was that I should use a noindex/follow tag on that page, so the page would not be indexed, but all of its links would still be crawled.
I decided to look at some of our competitors' robots.txt files: Smartbear (https://smartbear.com/robots.txt), b2wsoftware (http://www.b2wsoftware.com/robots.txt) & labtech (http://www.labtechsoftware.com/robots.txt).
I am still confused about what type of tags I should use, and how to gauge which set of tags is best for certain pages. I figured a static page is pretty much always good to index and follow, as long as it's public. And I should always include a sitemap file. But what about a dynamic page? What about pages that are out of date? Will this help with soft 404s?
This is a long one, but I appreciate all of the expert insight. Thanks ahead of time for all of the awesome responses.
Best Regards,
Will H.
-
Yup. Also, don't forget that robots.txt is just a "recommendation" for robots; they are not forced to obey it.
Basically, Google does whatever it wants to.
Also, if you block a folder so that its inner content won't be accessed, a page inside it can still be indexed if any link points to it, even a link coming from outside your domain. The content of the page won't be shown in search results, but the URL can show up with a notice stating that the site's content is blocked due to the site's robots.txt. Best of luck!
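To make the difference between "blocked from crawling" and "kept out of the index" a bit more concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `/news-events/` path and the `example.com` domain are assumptions made up for illustration; they are not taken from any site mentioned in this thread. The sketch only shows what a rule-following crawler would be allowed to fetch and says nothing about whether a URL ends up indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /news-events/

Sitemap: https://www.example.com/sitemap.xml
"""


def build_parser(robots_txt):
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A crawler that follows the rules may not fetch the disallowed folder...
news_allowed = parser.can_fetch("Googlebot", "https://www.example.com/news-events/")
# ...but everything else stays crawlable.
products_allowed = parser.can_fetch("Googlebot", "https://www.example.com/products/")
# Sitemap lines are read independently of the user-agent groups (Python 3.8+).
sitemaps = parser.site_maps()

print("news-events crawlable:", news_allowed)
print("products crawlable:", products_allowed)
print("sitemaps:", sitemaps)
```

Note that a `Disallow` rule only answers the question "may this URL be fetched?". A disallowed URL that is linked from elsewhere can still appear in results without a description, which is the situation described above.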
-
Great Advice Yossi & Chris. Thanks for taking the time to reply. I will have to dig into the Google Guidelines for additional information, but both of your points are valid. I think I was looking at robots.txt the wrong way. Thanks Again Guys!
-
I completely agree with Yossi here; no need to go blocking that page at all.
I can't really add any further value to the points he has covered, but one other part of your question suggested that perhaps you're looking at this the wrong way (and it's very common, don't worry!). Rather than having your site stay as-is and just obscuring the bad parts of it from search engines, the thought process should really be around creating a great website instead.
If you're ever considering blocking a page from search engines, the first step should always be "why am I blocking this page(s); could I just fix the issue instead?".
For example, you asked if this might help with soft 404s. Rather than trying to find a way to hide these soft 404s, spend that time fixing them instead!
-
Hi Will
There are some concerns that you have which I do not understand.
Why do you want to block the News & Events page? If it has unique content, and on top of that it is updated regularly, you have no reason to block access to the page. If it is "relevant to potential customers who want to book a demo", that's great. I would definitely keep it indexed and followed. Google explicitly states that you should not block access to a page if you simply want to de-index or remove it. If the page should not be publicly indexed, you should remove it or password protect it (a Google suggestion).
About tags, I assume you are talking about meta tags, correct?
There is no need to use any kind of meta tag to signal to search engines that they should index or follow a page. You use these tags only when you want to stop search engines from taking certain actions.
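As a rough illustration of that point, the sketch below reads the robots meta tag from a page's HTML with Python's standard `html.parser`. The two sample pages and the helper names (`RobotsMetaParser`, `robots_directives`, `is_indexable`) are hypothetical and written only for this example. A page with no robots meta tag yields no directives, which search engines treat as the default of index and follow.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in attr_map.get("content", "").split(","):
            directive = part.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def is_indexable(html):
    """True unless the page asks not to be indexed ("none" implies noindex)."""
    directives = robots_directives(html)
    return "noindex" not in directives and "none" not in directives


# Hypothetical pages, for illustration only.
PAGE_DEFAULT = "<html><head><title>About us</title></head><body></body></html>"
PAGE_NOINDEX = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body></body></html>"
)

print(is_indexable(PAGE_DEFAULT), robots_directives(PAGE_DEFAULT))
print(is_indexable(PAGE_NOINDEX), robots_directives(PAGE_NOINDEX))
```

One design note: a crawler can only see a noindex tag if it is allowed to fetch the page, so a page that carries this tag should not also be disallowed in robots.txt.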
Also, there is no difference between a static and a dynamic page when it comes to tag usage. There are no rules for that. A page can perfectly well be static for years and still get indexed and rank very well (but, well, we all know that updating the site is a ranking signal).
If you believe that a certain page should be tagged "noindex", it should not be because the page has not been updated within the last month or year. Just as an example: contact us pages, about us pages and terms of use pages. These are super static pages that in many cases probably won't be changed for years, yet there is no reason to noindex them.
Best,
Yossi