WordPress - How to stop both http:// and https:// pages being indexed?
-
I published a static page 2 days ago on a WordPress site and noticed that Google has indexed both the http:// and https:// URLs. Usually only the http:// version gets indexed.
Could anyone please explain why this may have happened and how I can fix it? Thanks!
-
Just one adjustment to this, although I think David's right that the canonical tag can be a good solution. Google can index https: fine; the issue is whether you're creating duplicates. If you have duplicates, then it's possible that the https: version is the one you want as canonical. In this case, it doesn't sound like it, but I just wanted to point that out.
Of course, long-term, you should sort out why these are being created. A desktop crawler like Xenu or Screaming Frog may be the best bet, but I'd hit the WordPress forums, too. Odds are it's a common issue. Typically, it happens when some deeper page (like a shopping cart) on a site is secure, and then the links are all relative ("/about.php", for example). Then, those links get crawled as both secure and non-secure.
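To illustrate the relative-link point, here is a minimal Python sketch of how a crawler resolves the same relative href differently depending on which page it was found on. The page URLs and the resolve_links helper are made up for the example; they are not from any particular site or tool.

```python
from urllib.parse import urljoin


def resolve_links(page_url, hrefs):
    """Resolve each href against the page it was found on, as a crawler would."""
    return [urljoin(page_url, href) for href in hrefs]


# Hypothetical pages: one secure (e.g. a cart), one non-secure.
secure_page = "https://www.example.com/cart.php"
plain_page = "http://www.example.com/index.php"

# The same relative link inherits the scheme of the page that contains it.
print(resolve_links(secure_page, ["/about.php"]))
print(resolve_links(plain_page, ["/about.php"]))
```

The first call yields an https:// URL and the second an http:// URL for the same /about.php page, which is how both versions end up in the crawl.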
Unfortunately, I'm not a WordPress expert, so I can only speak in generalities.
-
Thanks David, I feel like going out to buy some Swedish Fish for some reason now.
-
I actually did a lot of research on this topic a few days ago. Without going into the nitty-gritty details: if the https is site-wide, Google recommends a rel="canonical" attribute (http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394) pointing to the non-secure http version. Google claims it can index https fine, but Matt Cutts said he would "lean towards pointing the canonical to the http version." Also, on the rel="canonical" page Google says:
If you publish content on both http://www.example.com/product.php?item=swedish-fish and https://www.example.com/product.php?item=swedish-fish, you can specify the canonical version of the page. Create the <link> element:
<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />
Add this link to the <head> section of https://www.example.com/product.php?item=swedish-fish.
Make sure the canonical is on every page of your site.
Not sure why this may have happened, but it is creating duplicate content, which is why the canonical is necessary.
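As a rough sketch of what "the same canonical on every page" means in practice, the Python snippet below builds the tag for a given URL by forcing the scheme to http. The canonical_link function is a hypothetical helper for illustration only; on a real WordPress site the tag would normally come from the theme or an SEO plugin, not from code like this.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(url, scheme="http"):
    """Build a rel="canonical" tag that points at the chosen scheme (assumed: http)."""
    parts = urlsplit(url)
    # Keep host, path and query; replace the scheme and drop any #fragment.
    canonical = urlunsplit((scheme, parts.netloc, parts.path, parts.query, ""))
    return '<link rel="canonical" href="{}" />'.format(escape(canonical, quote=True))


# Both versions of the page should emit the identical tag.
print(canonical_link("https://www.example.com/product.php?item=swedish-fish"))
print(canonical_link("http://www.example.com/product.php?item=swedish-fish"))
```

If you decide the https version should be the canonical one instead, pass scheme="https"; the point is only that both versions of a page agree on a single URL.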
Hope that helps!
Thanks