Easy Question: regarding no index meta tag vs robot.txt
-
This seems like a dumb question, but I'm not sure what the answer is. I have an ecommerce client who has a couple of subdirectories "gallery" and "blog". Neither directory gets a lot of traffic or really turns into much conversions, so I want to remove the pages so they don't drain my page rank from more important pages. Does this sound like a good idea?
I was thinking of either disallowing the folders via robot.txt file or add a "no index" tag or 301redirect or delete them. Can you help me determine which is best.
**DEINDEX: **As I understand it, the no index meta tag is going to allow the robots to still crawl the pages, but they won't be indexed. The supposed good news is that it still allows link juice to be passed through. This seems like a bad thing to me because I don't want to waste my link juice passing to these pages. The idea is to keep my page rank from being dilluted on these pages. Kind of similar question, if page rank is finite, does google still treat these pages as part of the site even if it's not indexing them?
If I do deindex these pages, I think there are quite a few internal links to these pages. Even those these pages are deindexed, they still exist, so it's not as if the site would return a 404 right?
ROBOTS.TXT As I understand it, this will keep the robots from crawling the page, so it won't be indexed and the link juice won't pass. I don't want to waste page rank which links to these pages, so is this a bad option?
**301 redirect: **What if I just 301 redirect all these pages back to the homepage? Is this an easy answer? Part of the problem with this solution is that I'm not sure if it's permanent, but even more importantly is that currently 80% of the site is made up of blog and gallery pages and I think it would be strange to have the vast majority of the site 301 redirecting to the home page. What do you think?
DELETE PAGES: Maybe I could just delete all the pages. This will keep the pages from taking link juice and will deindex, but I think there's quite a few internal links to these pages. How would you find all the internal links that point to these pages. There's hundreds of them.
-
Hello Santaur,
I'm afraid this question isn't as easy as you may have thought at first. It really depends on what is on the pages in those two directories, what they're being used for, who visits them, etc... Certainly removing them altogether wouldn't be as terrible as some people might think IF those pages are of poor quality, have no external links, and very few - if any - visitors. It sounds to me that you might need a "Content Audit" wherein the entire site is crawled, using a tool like Screaming Frog, and then relevant metrics are pulled for those pages (e.g. Google Analytics visits, Moz Page Authority and external links...) so you can look at them and make informed decisions about which pages to improve, remove or leave as-is.
Any page that gets "removed" will leave you with another choice: Allow to 404/410 or 301 redirect. That decision should be easy to make on a page-by-page basis after the content audit because you will be able to see which ones have external links and/or visitors within the time period specified (e.g. 90 days). Pages that you have decided to "Remove" which have no external links and no visits in 90 days can probably just be deleted. The others can be 301 redirected to a more appropriate page, such as the blog home page, top level category page, similar page or - if all else fails - the site home page.
Of course any page that gets removed, whether it redirects or 404s/410s should have all internal links updated as soon as possible. The scan you did with Screaming Frog during the content audit will provide you with all internal links pointing to each URL, which should speed up that process for you considerably.
Good luck!
-
I would certainly think twice about removing those pages as they're in most cases of value for both your SEO as your users. If you would decide to go this way and to have them removed I would redirect all the pages belonging to these subdirectories to another page (let's say the homepage). Although you have a limited amount of traffic there you still want to make sure that the people who land on these pages get redirected to a page that does exist.
-
Are you sure you want to do this? You say 80% of the site consists of gallery and blog pages. You also say there are a lot of internal links to those pages. Are you perhaps under estimating the value of long- tail traffic
To answer your specific question, yes link juice will still pass thru to the pages that are no indexed. They just won't ever show up in search results. Using robots noindex gets you the same result. 301 redirects will pass all your link juice back to the home page, but makes for a lousy user experience. Same for deleting pages.
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Utilizing one robots.txt for two sites
I have two sites that are facilitated hosting in similar CMS. Maybe than having two separate robots.txt records (one for every space), my web office has made one which records the sitemaps for the two sites, similar to this:
Technical SEO | | eulabrant0 -
JSON-LD meta data: Do you have any rules/recommendations for using BlogPosting vs Article?
Dear Moz Community. I'm looking at moving from in-line Microdata in the HTML to JSON-LD on the web pages that I manage. Seems a far simpler solution having all the meta data in one place - especially for trouble shooting! With this in mind I've started to change the page templates on my personal site before I tackle the ones for my eCommerce site. I've made a start, and I'm still working on the templates producing some default values (like if a page doesn't have an associated image) but have been wondering if any of you have any rules/recommendations for using BlogPosting vs Article? I'd call this type of page an Article:
Technical SEO | | andystorey
https://cycling-jersey-collection.com/browse-collection/selle-italia-chinol-seb-bennotto-1982-team-jersey Whereas this page is from the /blog so that should probably be a BlogPosting:
https://cycling-jersey-collection.com/blog/2017-worldtour-team-jerseys I've used the following resources but it would be great to get a discussion on here.
https://yoast.com/structured-data-schema-ultimate-guide/
https://developers.google.com/search/docs/data-types/data-type-selector
https://search.google.com/structured-data/testing-tool/u/0/ I'm keen to get this 100% right as once this is done I'm going to drive through some further changes to get some progress on things like this: https://moz.com/blog/ranking-zero-seo-for-answers
https://moz.com/blog/what-we-learned-analyzing-featured-snippets Kind Regards andy moz-screenshot.jpg1 -
What are the negative implications of listing URLs in a sitemap that are then blocked in the robots.txt?
In running a crawl of a client's site I can see several URLs listed in the sitemap that are then blocked in the robots.txt file. Other than perhaps using up crawl budget, are there any other negative implications?
Technical SEO | | richdan0 -
"Extremely high number of URLs" warning for robots.txt blocked pages
I have a section of my site that is exclusively for tracking redirects for paid ads. All URLs under this path do a 302 redirect through our ad tracking system: http://www.mysite.com/trackingredirect/blue-widgets?ad_id=1234567 --302--> http://www.mysite.com/blue-widgets This path of the site is blocked by our robots.txt, and none of the pages show up for a site: search. User-agent: * Disallow: /trackingredirect However, I keep receiving messages in Google Webmaster Tools about an "extremely high number of URLs", and the URLs listed are in my redirect directory, which is ostensibly not indexed. If not by robots.txt, how can I keep Googlebot from wasting crawl time on these millions of /trackingredirect/ links?
Technical SEO | | EhrenReilly0 -
Meta Title Tags
Hi, Are Meta Title Tag deemed by google to be unique if I use the same phrases by in a different order. For example 3 different pages <colgroup><col width="475"></colgroup>
Technical SEO | | Studio33
| Online Invoicing Software | Online Invoicing | Invoicing Software |
| Online Invoicing | Invoicing Software | Online Invoicing Software |
| Invoicing Software | Online Invoicing Software | Online Invoicing | You will not it is the same keywords just in a different order. Is this unique enough or will google not be happy about it. Thanks Andrew0 -
SEOMoz Crawler vs Googlebot Question
I read somewhere that SEOMoz’s crawler marks a page in its Crawl Diagnostics as duplicate content if it doesn’t have more than 5% unique content.(I can’t find that statistic anywhere on SEOMoz to confirm though). We are an eCommerce site, so many of our pages share the same sidebar, header, and footer links. The pages flagged by SEOMoz as duplicates have these same links, but they have unique URLs and category names. Because they’re not actual duplicates of each other, canonical tags aren’t the answer. Also because inventory might automatically come back in stock, we can’t use 301 redirects on these “duplicate” pages. It seems like it’s the sidebar, header, and footer links that are what’s causing these pages to be flagged as duplicates. Does the SEOMoz crawler mimic the way Googlebot works? Also, is Googlebot smart enough not to count the sidebar and header/footer links when looking for duplicate content?
Technical SEO | | ElDude0 -
Robots.txt
should I add anything else besides User-Agent: * to my robots.txt file? http://melo4.melotec.com:4010/
Technical SEO | | Romancing0 -
Is blocking RSS Feeds with robots.txt necessary?
Is it necessary to block an rss feed with robots.txt? It seems they are automatically not indexed (http://googlewebmastercentral.blogspot.com/2007/12/taking-feeds-out-of-our-web-search.html) And, google says here that it's important not to block RSS feeds (http://googlewebmastercentral.blogspot.com/2009/10/using-rssatom-feeds-to-discover-new.html) I'm just checking!
Technical SEO | | nicole.healthline0