Is it hurting my SEO ranking if robots.txt is forbidden?
-
robots.txt is forbidden - I have read up on what the robots.txt file does and how to configure it, but what happens if the file cannot be accessed at all (for example, if the server returns a 403 Forbidden)?
-
Yes, excluding certain pages can benefit your rankings if the excluded pages could be considered duplicate content, either with your marketing pages or with each other.
This is usually the case for blogs (think WordPress categories) or webshops (pagination, as well as single product pages reachable by different paths and thus having different URLs). As Ryan pointed out, control that at the page level via noindex,follow to allow PageRank to flow. Use noindex,nofollow for "internal" pages you don't want to see crawled.
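As a sketch, those page-level controls would look like this in each page's `<head>` (hypothetical snippets; which variant you use depends on whether you want PageRank to flow through the page's links):

```html
<!-- Duplicate-ish pages (categories, pagination): keep them out of the
     index, but let PageRank flow through their links -->
<meta name="robots" content="noindex,follow">

<!-- Purely internal pages you don't want followed onward at all -->
<meta name="robots" content="noindex,nofollow">
```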
I am not sure, but having 9,950 pages indexed that are considered duplicate content might hurt rankings for the other pages on that domain; Google might consider the domain spammy.
If you need a specific hint for your domain, send me a PM and I will have a look if time permits.
-
In general, I do not use robots.txt. It is a better practice to use "noindex" for the pages you do not wish to have indexed.
If I had a 10k-page site with 50 marketing pages, I would either want to index the entire site, or question why the other 99% of the site exists if it does not help market the products. There are numerous challenges your scenario presents. If you block 99% of your site with robots.txt or the noindex meta tag, you are severely disrupting the flow of PageRank throughout your site. Also, you are either blocking content which should be indexed, or you are wasting time and resources creating junk pages on your site.
If the content truly should not be indexed, it likely should be moved to another site. I would need a lot more details about the site, its purpose, and the pages involved. Whatever the proper solution is, it is not likely to involve using robots.txt to block 99% of the site.
-
So in regards to increasing ranking: is there a benefit to using the robots.txt file to index only certain "marketing" pages and exclude other content that may dilute your site? For example, let's say I have 10,000 pages but only about 50 or so are my marketing pages. Would using robots.txt to have only my main marketing pages crawled help place emphasis on that content?
-
Sebes is correct. To add a bit more: it is not necessary to provide a robots.txt file. Actually, in most cases it is preferable not to use the file, but it becomes necessary if you do not have direct control over the code used in every page of your site. For example, if you have a CMS- or ecommerce-based site, you likely do not have control over many of the pages that are automatically generated by the software. In these cases, the only ways you can control how crawlers treat your site's pages are either to pay for custom modifications to your site's code or to use a robots.txt file.
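For illustration, a hypothetical robots.txt for a CMS-based site might block only the software-generated paths while leaving everything else crawlable (the paths here are made-up examples, not taken from any real site):

```
User-agent: *
Disallow: /search/
Disallow: /tag/
Disallow: /print/

Sitemap: https://www.example.com/sitemap.xml
```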
-
If the robots.txt file cannot be read by Google or Bing, they generally assume they can crawl as much as they want: per Google's documentation, a robots.txt that returns a 4xx error (including 403 Forbidden) is treated as if the file simply did not exist, so crawling is unrestricted, while 5xx server errors can cause Google to temporarily treat the whole site as disallowed. Check Google Webmaster Tools to see whether Google can "see" and access your robots.txt file.
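As an aside, you can sketch how a crawler interprets a readable robots.txt using Python's standard library; the rules and URLs below are made-up examples, and the empty-file case mirrors the "crawl everything" fallback described above:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (not from the asker's site)
robots_txt = """\
User-agent: *
Disallow: /internal/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Paths outside the disallowed prefix are crawlable; /internal/ is not.
allowed = parser.can_fetch("*", "https://example.com/marketing-page")
blocked = parser.can_fetch("*", "https://example.com/internal/report")

# An empty rule set (what crawlers fall back to when robots.txt is
# missing or returns a 4xx error) leaves everything crawlable:
empty = RobotFileParser()
empty.parse([])
default_allowed = empty.can_fetch("*", "https://example.com/anything")
```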