Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies.
WordPress - How to stop both http:// and https:// pages being indexed?
-
I published a static page 2 days ago on a WordPress site and noticed that Google has indexed both the http:// and https:// URLs. Usually only the http:// version gets indexed.
Could anyone please explain why this may have happened and how I can fix it? Thanks!
-
Just one adjustment to this, although I think David's right that the canonical tag can be a good solution. Google can index https: fine; the issue is whether you're creating duplicates. If you have duplicates, then it's possible that the https: version is the one you want as canonical. In this case, it doesn't sound like it, but I just wanted to point that out.
Of course, long-term, you should sort out why these are being created. A desktop crawler like Xenu or Screaming Frog may be the best bet, but I'd hit the WordPress forums, too. Odds are it's a common issue. Typically, it happens when some deeper page (like a shopping cart) on a site is secure, and then the links are all relative ("/about.php", for example). Then, those links get crawled as both secure and non-secure.
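To make the relative-link point concrete, here is a minimal Python sketch of how a crawler resolves links. The page and link URLs are hypothetical examples, not taken from the asker's site. It shows that a relative link inherits the scheme of the page it was found on, while an absolute link does not.

```python
from urllib.parse import urljoin


def resolve_links(page_url, hrefs):
    """Resolve each href against the URL of the page it appears on,
    the way a crawler would before queueing it."""
    return [urljoin(page_url, href) for href in hrefs]


# Hypothetical links found in a shared template: one relative, one absolute.
links = ["/about.php", "http://www.example.com/contact.php"]

# The same template reached over http and over https (for example, a secure cart page).
print(resolve_links("http://www.example.com/cart.php", links))
print(resolve_links("https://www.example.com/cart.php", links))
```

The relative "/about.php" comes back as an https:// URL when the cart page is crawled over https, which is how a second, secure copy of an ordinary page can end up being crawled.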
Unfortunately, I'm not a WordPress expert, so I can only speak in generalities.
-
Thanks David, I feel like going out to buy some Swedish Fish for some reason now.
-
I actually just did a wealth of research on this topic a few days ago. Without going into the nitty-gritty details: if https is site-wide, Google recommends a rel="canonical" link (http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394) pointing to the non-secure http version. Google claims it can index https fine, but Matt Cutts said he would "lean towards pointing the canonical to the http version." Also, on the rel="canonical" page Google says:
If you publish content on both http://www.example.com/product.php?item=swedish-fish and https://www.example.com/product.php?item=swedish-fish, you can specify the canonical version of the page. Create the <link> element:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
Add this link to the <head> section of https://www.example.com/product.php?item=swedish-fish.
Make sure the canonical is on every page of your site.
Not sure why this may have happened, but it is creating duplicate content, which is why the canonical is necessary.
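As a rough illustration of the canonical advice above, this Python sketch builds the tag for any page URL by forcing the scheme to http and keeping the host, path, and query string. The function name canonical_link and its scheme parameter are made up for this example. On a real WordPress site the tag would normally come from the theme or an SEO plugin rather than a script like this.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(url, scheme="http"):
    """Return a rel="canonical" link element pointing at the preferred-scheme
    version of `url`. Host, path and query are kept; any fragment is dropped."""
    parts = urlsplit(url)
    canonical = urlunsplit((scheme, parts.netloc, parts.path, parts.query, ""))
    # Escape the URL so characters such as & are valid inside an HTML attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(canonical, quote=True))


# Both versions of the page produce the same tag, so they point at one canonical URL.
print(canonical_link("https://www.example.com/product.php?item=swedish-fish"))
print(canonical_link("http://www.example.com/product.php?item=swedish-fish"))
```

If the https version turns out to be the one you want indexed, passing scheme="https" flips the direction.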
Hope that helps!
Thanks