Page missing from Google index
-
Hi all,
One of our most important pages seems to be missing from the Google index.
A number of our collections pages (e.g., http://perfectlinens.com/collections/size-king) are thin, so we've included a canonical reference in all of them to the main collection page (http://perfectlinens.com/collections/all).
However, I don't see the main collection page in any Google search result. When I search using "info:http://perfectlinens.com/collections/all", the page displayed is our homepage. Why is this happening?
The main collection page has a rel=canonical reference to itself (auto-generated by Shopify so I can't control that).
Thanks!
-
In general, for link value to transfer through either 301s or canonicals, the content of the two pages needs to be nearly identical. See Cyrus' post for more. Canonicals are also not always followed by Google; they are only a "hint", so it's unlikely you'll pass much value that way.
-
Dan, thanks for that response! I wasn't aware that our homepage had a canonical reference to our category page. On closer examination, I found that our category page in return had a canonical reference to our homepage. Messed up!
I've fixed that, and now resubmitted that page to Google using Search Console. Hopefully that will fix our issues.
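A loop like the one described above (homepage pointing to the category page, and the category page pointing back) can be caught by following each page's canonical until it settles. This is only a minimal sketch: the `canonical_of` mapping is hand-written for illustration, the homepage URL and the helper name `resolve_canonical` are assumptions, and it says nothing about how Google itself resolves canonicals.

```python
def resolve_canonical(url, canonical_of, max_hops=10):
    """Follow canonical references starting at url.

    canonical_of maps a page URL to the URL in its rel=canonical tag
    (hypothetical data, filled in by hand or by a crawler).
    Returns (final_url, path, is_loop).
    """
    path = [url]
    seen = {url}
    current = url
    for _ in range(max_hops):
        target = canonical_of.get(current)
        # No canonical, or a self-reference: this page is the final target.
        if target is None or target == current:
            return current, path, False
        path.append(target)
        # Pointing back to a page already visited means a loop.
        if target in seen:
            return None, path, True
        seen.add(target)
        current = target
    # Chain longer than max_hops: give up without a final target.
    return None, path, False


HOME = "http://perfectlinens.com/"  # assumed homepage URL
ALL = "http://perfectlinens.com/collections/all"
KING = "http://perfectlinens.com/collections/size-king"

# The broken setup: homepage and category page point at each other.
broken = {HOME: ALL, ALL: HOME, KING: ALL}
# The intended setup: both are self-referencing, thin pages point to "all".
fixed = {HOME: HOME, ALL: ALL, KING: ALL}

print(resolve_canonical(KING, broken))
print(resolve_canonical(KING, fixed))
```

With the broken mapping the third element of the result is `True`, which is the signal to fix the tags before resubmitting the page.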
Just one last question - why do you prefer noindex over canonical? If I had some backlinks to a thin category page (e.g., /collections/twin), wouldn't it be better to 'transfer' those benefits to our main category page (/collections/all) using canonical references?
Thanks again
-
Hello
Ahh ok, missed that detail.
I created a quick video for you ---> http://screencast.com/t/IKkEikyr
I think this is a bit of a complicated situation which will be tough to diagnose and fix in a Q&A thread. I would suggest cataloging the different settings of your site in a spreadsheet like I show in the video.
Essentially, the canonical settings are just "suggestions" for Google and not "directives" so they will ignore them if they think they have been set in error.
I would start by clearly defining the end result you want (what pages should be crawled, and what should be indexed) and work backwards from there to apply the right settings.
I would probably try to use noindex, robots.txt etc before resorting to a canonical.
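For the spreadsheet idea, one row per URL with its canonical target and its meta robots setting is usually enough to spot conflicts. The sketch below pulls those two signals out of a page's HTML using only the Python standard library. The sample HTML is made up for illustration (it is not the live markup of the site), and the names `IndexingSignalParser` and `indexing_signals` are hypothetical.

```python
from html.parser import HTMLParser


class IndexingSignalParser(HTMLParser):
    """Collects the rel=canonical target and the meta robots content."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attr = {name.lower(): (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attr.get("rel", "").lower().split():
            self.canonical = attr.get("href")
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            self.robots = attr.get("content", "").lower()


def indexing_signals(url, html):
    """Return one spreadsheet row (as a dict) for the given page."""
    parser = IndexingSignalParser()
    parser.feed(html)
    return {
        "url": url,
        "canonical": parser.canonical,
        "self_canonical": parser.canonical == url,
        "noindex": parser.robots is not None and "noindex" in parser.robots,
    }


# Hypothetical markup for a thin collection page, for illustration only.
SAMPLE_HTML = """
<html><head>
<link rel="canonical" href="http://perfectlinens.com/collections/all">
<meta name="robots" content="noindex, follow">
</head><body></body></html>
"""

row = indexing_signals(
    "http://perfectlinens.com/collections/size-king", SAMPLE_HTML
)
print(row)
```

A row that has both a canonical to another page and `noindex` set is sending two different signals, which is the kind of conflict worth resolving first.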
-
Hi Dan,
Thanks for your response. The page that you see when you type in our category page is, in fact, our home page. For example, when I do info:page A or cache:page A, the result is for page B. Why is this happening if page A does not have a canonical reference or a redirect of any kind to B?
Thanks.
-
FYI - to check if a page is indexed try typing site:http://perfectlinens.com/collections/all into the Google search bar, or cache:http://perfectlinens.com/collections/all into your browser.
-
Hi There!
That page is in fact indexed and cached for me! Can you check again and let me know?
-Dan
-
Patrick, thank you for your response.
1. The reason we're using canonical references on those pages is because they are almost identical copies of each other. In the future, we'll create some content on them and they can then stand by themselves.
2. But the original question remains - why is the main page (http://perfectlinens.com/collections/all) missing from the Google index? It's been on the site for a long time, it's one of our most important pages, it's in our sitemap, and robots.txt is not blocking it.
Thank you for your other tips though - I appreciate them, and will put them on our to-do list.
-
Hi there
First, those pages (size-king) should be canonicalized to their own URLs, not canonicalized back to the "all" page. Pointing them at "all" could be a bad customer experience, and you could be missing out on a LOT of organic traffic if some of those product pages target high-volume, low-competition keywords or variations.
I would work on expanding the content on those product pages and implementing Schema. You have a lot of opportunities to implement this markup, which will also help your search visibility.
Lastly, depending on when you implemented these canonical tags and your sitemap, Google and other search engines could still be indexing them. When did you upload your sitemap / implement canonical tags? Also, have you submitted these sitemaps to Google and Bing? I recommend you do so if you didn't!
And always make sure your robots.txt and meta tags aren't inadvertently blocking key pages from search! This is an often overlooked area in SEO!
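A quick way to run the robots.txt part of that check is Python's standard `urllib.robotparser`. The rules below are a made-up sample, not the site's real robots.txt, and `blocked_urls` is a hypothetical helper name; in practice you would paste in the contents of your own file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /admin
Disallow: /cart
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


key_pages = [
    "http://perfectlinens.com/collections/all",
    "http://perfectlinens.com/collections/size-king",
    "http://perfectlinens.com/admin/settings",  # assumed path, to show a block
]

print(blocked_urls(SAMPLE_ROBOTS_TXT, key_pages))
```

This only covers robots.txt. A page can still be kept out of the index by a meta robots noindex tag, so both need checking.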
But more than anything, work on the content for your product pages, point their canonical tags at their own URLs, and add schema. It will make a world of difference!
Hope this helps! Good luck!
Patrick