Why are my https pages indexed even though they are noindexed?
-
I have some tag pages on one of my sites that I noindexed with a meta robots tag. This worked for the http version, which they are canonicalized to, but now the https:// version is being indexed.
The https version is both noindexed and has a canonical pointing to the http version, but the pages still show up! I even have WordPress set up to redirect all https: requests to http! For some reason these pages are STILL showing in the SERPs. Any experience or advice would be greatly appreciated.
Example page: https://www.michaelpadway.com/tag/insurance-coverage/
Thanks all!
-
That is true, but I also have them 301'd to the http version and canonicalized! That is pretty much every possible signal to tell Google that those URLs aren't real pages and shouldn't be indexed.
I suppose we can submit the URLs; unfortunately, there are a LOT of tag pages.
Thanks for the advice Dana!
-
Hi Spencer,
I am an in-house SEO for a fairly large e-commerce site (4,000 SKUs) that has the exact same problem. As I am sure you are aware, in my experience the meta robots noindex tag behaves more like a suggestion to Googlebot than a guarantee. Pages carrying it can still end up in the index, and that happens more often than you would expect.
I would suggest submitting the individual URLs that you want removed from Google to the "Remove URLs" tool in Google Webmaster Tools. It's not instantaneous, but it does work.
I hope that helps. I know it's frustrating. We have tons of content that's indexed that we'd rather wasn't. It takes time, patience and intelligent work to get the job done.
Dana
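
For anyone debugging a similar setup, one way to see which of the signals discussed above (301 redirect, meta robots noindex, rel=canonical) a given URL actually returns is a small script like the one below. This is only a rough sketch, not an official tool: the names `parse_signals` and `fetch_signals` and the User-Agent string are made up for illustration, and the script only reports what the server sends without following redirects. It says nothing about what Google has actually crawled or indexed. Note that when a URL answers with a 301, the response carries no page HTML, so any meta tags on that page would not be visible in that response.

```python
from html.parser import HTMLParser
import http.client
from urllib.parse import urlsplit


class IndexSignalParser(HTMLParser):
    """Collects meta robots directives and the rel=canonical target from HTML."""

    def __init__(self):
        super().__init__()
        self.robots = []       # e.g. ["noindex", "follow"]
        self.canonical = None  # href of <link rel="canonical">, if present

    def handle_starttag(self, tag, attrs):
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attrs.get("name", "").lower() in ("robots", "googlebot"):
            content = attrs.get("content", "")
            self.robots += [d.strip().lower() for d in content.split(",") if d.strip()]
        elif tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href") or None


def parse_signals(html):
    """Return the indexing signals found in an HTML document (hypothetical helper)."""
    parser = IndexSignalParser()
    parser.feed(html)
    return {"robots": parser.robots, "canonical": parser.canonical}


def fetch_signals(url):
    """Request url WITHOUT following redirects and report what the server sends."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    try:
        # The User-Agent value is an arbitrary placeholder.
        conn.request("GET", path, headers={"User-Agent": "signal-check-sketch"})
        resp = conn.getresponse()
        body = resp.read().decode("utf-8", errors="replace")
        result = {
            "status": resp.status,
            "location": resp.getheader("Location"),
            "x_robots_tag": resp.getheader("X-Robots-Tag"),
        }
        # Only a 200 response carries page HTML whose meta tags can be read.
        if resp.status == 200:
            result.update(parse_signals(body))
        return result
    finally:
        conn.close()
```

Calling `fetch_signals()` once on the https URL and once on its http counterpart shows whether the https version really answers with a 301 (and where it points) or with a 200 page carrying the noindex and canonical tags. `parse_signals()` can also be run on saved HTML without any network access.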