URL with query string being indexed over its parent page?
-
I noticed earlier this week that this page - https://www.ihasco.co.uk/courses/detail/bomb-threats-and-suspicious-packages?channel=care - was being indexed instead of this page - https://www.ihasco.co.uk/courses/detail/bomb-threats-and-suspicious-packages - for its various keywords.
We have rel=canonical tags correctly set up and all internal links to these pages with query strings are nofollow, so why is this page being indexed?
Any help would be appreciated.
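One way to double-check the canonical setup described above is to compare the canonical href in the page markup with the page URL stripped of its query string. This is only a minimal sketch: the `sample_html` markup is hypothetical (the real page's HTML is not shown here), and `CanonicalFinder`, `strip_query` and `canonical_points_to_parent` are illustrative names, not part of any SEO tool.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, urlunsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr_map.get("href")


def strip_query(url):
    """Return the URL without its query string or fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_points_to_parent(page_url, html):
    """True if the page's canonical tag names the query-free parent URL."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical == strip_query(page_url)


# Hypothetical markup, assumed for illustration only.
sample_html = """<html><head>
<link rel="canonical" href="https://www.ihasco.co.uk/courses/detail/bomb-threats-and-suspicious-packages">
</head><body></body></html>"""

page_url = (
    "https://www.ihasco.co.uk/courses/detail/"
    "bomb-threats-and-suspicious-packages?channel=care"
)
print(canonical_points_to_parent(page_url, sample_html))
```

A passing check only confirms that the tag is present and points at the parent page. Search engines treat rel=canonical as a hint rather than a directive, so the parameterised URL can still be chosen.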
-
If the Disallowed URLs are linked to externally, or internally from a page that isn't Disallowed, they'll appear in the index with the snippet text you've quoted. If you want to ... Typically, that page will already be populated with parameters the crawler has discovered, though you can specify them manually too.
-
I would suggest adding some URL parameters to your Google Search Console for the domain; it will help tell Google how to crawl and index your site: https://support.google.com/webmasters/answer/6080550?hl=en
https://www.shoutmeloud.com/google-webmaster-tool-added-url-parameter-option-seo.html - a useful read.
https://www.hallaminternet.com/avoiding-the-seo-pitfalls-of-url-parameters/ - another useful read.
Adding canonicals, as you have already mentioned, would have been my second bit of advice. I am not sure why you have nofollowed the links.
Cheers
Tim
Related Questions
-
Hi! I'm wondering whether for keyword SEO - a url should be www.salshoes.com/shoes/mens/day-wear (so with a few parent categories) or www.salshoes.com/shoes-mens-day-wear is ok for on page optimization?
Technical SEO | | SalSantaCruz0 -
Should We Index These Category Pages?
Currently we have marked category pages like http://www.yournextshoes.com/celebrities/kim-kardashian/ as follow/noindex as they essentially do not include any original content. On the other hand, for someone searching for Kim Kardashian shoes, it's a highly relevant page as we provide links to all the Kim Kardashian shoe sightings that we have covered. Should we index the category pages or leave them unindexed?
Technical SEO | | Jantaro0 -
How Does Google's "index" find the location of pages in the "page directory" to return?
This is my understanding of how Google's search works, and I am unsure about one thing in particular:
1. Google continuously crawls websites and stores each page it finds (let's call it the "page directory").
2. Google's "page directory" is a cache, so it isn't the "live" version of the page.
3. Google has separate storage called "the index", which contains all the keywords searched. These keywords in "the index" point to the pages in the "page directory" that contain the same keywords.
4. When someone searches a keyword, that keyword is accessed in the "index" and returns all relevant pages in the "page directory".
5. These returned pages are given ranks based on the algorithm.
The one part I'm unsure of is how Google's "index" knows the location of relevant pages in the "page directory". The keyword entries in the "index" point to the "page directory" somehow. I'm thinking each page has a URL in the "page directory", and the entries in the "index" contain these URLs. Since Google's "page directory" is a cache, would the URLs be the same as the live website (and would the keywords in the "index" point to these URLs)? For example, if a webpage is found at www.website.com/page1, would the "page directory" store this page under that URL in Google's cache?
The reason I want to discuss this is to understand the search process better and so know the effects of changing a page's URL.
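The model described above can be sketched as a toy inverted index. This is purely illustrative and makes no claim about how Google actually stores pages: `page_store` stands in for the "page directory", `index` for "the index", and the page texts are invented.

```python
from collections import defaultdict

# Toy "page directory": cached page text keyed by the URL it was crawled from.
page_store = {
    "www.website.com/page1": "red shoes for sale",
    "www.website.com/page2": "blue shoes and red hats",
}


def build_index(pages):
    """Map each keyword to the set of URLs whose cached text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, pages, keyword):
    """Look the keyword up in the index, then fetch matching cached pages."""
    urls = sorted(index.get(keyword.lower(), set()))
    return [(url, pages[url]) for url in urls]


index = build_index(page_store)
for url, cached_text in search(index, page_store, "red"):
    print(url, "->", cached_text)
```

In this toy model the URL is the key that joins the two structures, so a page that moves to a new URL becomes a new key that has to be crawled and indexed again unless something (such as a redirect) tells the system the two keys are the same page.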
Technical SEO | | reidsteven750 -
301ing 404's
Hey guys, I am currently in the process of redirecting some of my 404 pages to pages like my home page. Before I do that, I am assessing the link value of the 404 pages. My question is: what do you do with the 404 pages that appear to have low-quality links? Do you really want to redirect them to an important page on your site? What should I do with these 404 pages? Cheers, Adam
Technical SEO | | Adamshowbiz0 -
Writing of URL query strings to be SEO friendly
I understand the basic concepts of URL rewriting and creating inbound and outbound rules. I understand creating rules to rewrite URL query strings so that they are readable and SEO friendly. It’s simple when dealing with a small number of pages and database records. (Microsoft Server, ASP.NET 4.0, IIS 7)
However, I need to understand how to handle the following: We have a database of 10,000+ establishments, 650+ cities, 400+ suburbs. Each establishment can be searched for by country, province, city and suburb. The search results show establishments that match the search criteria. Each establishment has its own unique id. Each establishment in the search results table has a link to the establishment's detailed profile aspx page. The link is a query string such as http://www.ubuntustay.com/detailed.aspx?id=4 which opens the establishment's profile. We need to rewrite the URL to be something like: http://www.ubuntustay.com/detailed.aspx/capetown/westerncape/capetown/campsbay/diamondhouse which should still open the same establishment profile as the above query string.
I can manually create a rule for this one example above without a problem. But there are over 10,000 establishments, all in different provinces, cities and suburbs. Surely we don’t manually generate a rewrite rule for each establishment? The resulting .htaccess will be rather large(?!)
Therefore my questions are: How do I create URL rewrite rules for dynamic query strings that originate from a large dataset? How do I translate the id number into the equivalent <country>/<province>/<city>/<suburb>/<establishment> syntax? Do I have to wire up the global.asax so that every incoming request extracts the country, province, city and suburb based on the establishment id, which seems a bit cumbersome(?).
If you’re wondering how I currently do it (it works but it’s not very portable or efficient): For each establishment which is included on the search results I simply construct the link url as: http://www.ubuntustay.com/detailed.aspx/4/Diamond%20House/Camps%20Bay/Cape%20Town On the detailed.aspx page load I simply extract the record id (4 in the example above) from the querystring and select that record from the db. Claude, what I’m looking for is advice on the best approach on how to create these rewrite rules and would be grateful if you can have one of your SEO friends lend their advice and experience. Any web resources that show the above techniques would be great. I’m not really looking for simple web links to url rewriting overviews…I have plenty of those. It’s the detail on the specific requirement above that I need please.
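One common pattern for this kind of requirement is a single generic route plus a database lookup, rather than one rewrite rule per record. The sketch below is written in Python only to show the logic; it is not ASP.NET or IIS configuration. The `ESTABLISHMENTS` dictionary is a hypothetical stand-in for the establishments table, the country value is assumed, and `slugify`, `build_path` and `resolve` are illustrative names.

```python
import re

# Hypothetical in-memory stand-in for the establishments table.
ESTABLISHMENTS = {
    4: {
        "country": "South Africa",  # assumed value, not given in the question
        "province": "Western Cape",
        "city": "Cape Town",
        "suburb": "Camps Bay",
        "name": "Diamond House",
    },
}


def slugify(text):
    """Lowercase the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_path(est_id):
    """Build the readable path for a record, keeping the id as the last segment."""
    est = ESTABLISHMENTS[est_id]
    keys = ("country", "province", "city", "suburb", "name")
    return "/" + "/".join(slugify(est[k]) for k in keys) + f"/{est_id}"


# One generic rule: five slug segments followed by a numeric id.
ROUTE = re.compile(r"^/(?:[a-z0-9-]+/){5}(\d+)$")


def resolve(path):
    """Return the establishment id for a readable path, or None if no match."""
    match = ROUTE.match(path)
    if not match:
        return None
    est_id = int(match.group(1))
    return est_id if est_id in ESTABLISHMENTS else None


path = build_path(4)
print(path)
print(resolve(path))
```

Keeping the id in the path means the slugs are decorative and one rule covers every record. If the id must not appear, the usual alternative is to store a unique slug per establishment and look the record up by that slug instead.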
Technical SEO | | claudeSteyn0 -
Duplicate pages in Google index despite canonical tag and URL Parameter in GWMT
Good morning Moz... This is a weird one. It seems to be a "bug" with Google, honest... We migrated our site www.three-clearance.co.uk to a Drupal platform over the new year. The old site used URL-based tracking for heat map purposes, so for instance www.three-clearance.co.uk/apple-phones.html could be reached via www.three-clearance.co.uk/apple-phones.html?ref=menu or www.three-clearance.co.uk/apple-phones.html?ref=sidebar and so on. GWMT was told of the ref parameter and the canonical meta tag was used to indicate our preference. As expected, we encountered no duplicate content issues and everything was good.
This is the chain of events:
1. Site migrated to new platform following best practice, as far as I can attest to. Only known issue was that the verification for both Google Analytics (meta tag) and GWMT (HTML file) didn't transfer as expected, so between relaunch on the 22nd Dec and the fix on 2nd Jan we have no GA data, and presumably there was a period where GWMT became unverified.
2. URL structure and URIs were maintained 100% (which may be a problem, now).
3. Yesterday I discovered 200-ish 'duplicate meta titles' and 'duplicate meta descriptions' in GWMT. Uh oh, thought I. Expand the report out and the duplicates are in fact ?ref= versions of the same root URL. Double uh oh, thought I.
4. Run, not walk, to Google and do some Fu: http://is.gd/yJ3U24 (9 versions of the same page, in the index, the only variation being the ?ref= URI).
5. Checked BING and it has indexed each root URL once, as it should.
Situation now:
1. Site no longer uses the ?ref= parameter, although of course there still exist some external backlinks that use it. This was intentional and happened when we migrated.
2. I 'reset' the URL parameter in GWMT yesterday, given that there's no "delete" option. The "URLs monitored" count went from 900 to 0, but today is at over 1,000 (another wtf moment).
3. I also resubmitted the XML sitemap and fetched 5 'hub' pages as Google, including the homepage and HTML site-map page.
The ?ref= URLs in the index have the disadvantage of actually working, given that we transferred the URL structure and of course the webserver just ignores the nonsense arguments and serves the page. So I assume Google assumes the pages still exist, and won't drop them from the index but will instead apply a dupe content penalty. Or maybe call us a spam farm. Who knows.
Options that occurred to me (other than maybe making our canonical tags bold or locating a Google bug submission form 😄 ) include:
A) robots.txt-ing .?ref=. but to me this says "you can't see these pages", not "these pages don't exist", so isn't correct
B) Hand-removing the URLs from the index through a page removal request per indexed URL
C) Apply 301 to each indexed URL (hello BING dirty sitemap penalty)
D) Post on SEOMoz because I genuinely can't understand this.
Even if the gap in verification caused GWMT to forget that we had set ?ref= as a URL parameter, the parameter was no longer in use because the verification only went missing when we relaunched the site without this tracking. Google is seemingly 100% ignoring our canonical tags as well as the GWMT URL setting - I have no idea why and can't think of the best way to correct the situation. Do you? 🙂
Edited To Add: As of this morning the "edit/reset" buttons have disappeared from the GWMT URL Parameters page, along with the option to add a new one. There are no messages explaining why and of course the Google help page doesn't mention disappearing buttons (it doesn't even explain what 'reset' does, or why there's no 'remove' option).
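Option C above (a 301 for each ?ref= URL) does not need one rule per URL; a single check that strips the retired parameter is enough. The following is a rough sketch of that logic only, not Drupal or web server configuration. `TRACKING_PARAMS` and `redirect_target` are illustrative names, and the `http://` scheme on the example URLs is an assumption.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameters that no longer mean anything to the site (assumed list).
TRACKING_PARAMS = {"ref"}


def redirect_target(url):
    """Return the URL to 301 to, or None if the request needs no redirect."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(key, value) for key, value in pairs if key not in TRACKING_PARAMS]
    if len(kept) == len(pairs):
        return None  # nothing was stripped, so serve the page as normal
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(redirect_target("http://www.three-clearance.co.uk/apple-phones.html?ref=menu"))
print(redirect_target("http://www.three-clearance.co.uk/apple-phones.html"))
```

Other query parameters are preserved, so only the retired tracking parameter triggers a redirect.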
Technical SEO | | Tinhat0 -
Best way to handle indexed pages you don't want indexed
We've had a lot of pages indexed by Google which we didn't want indexed. They relate to an AJAX category filter module that works OK for front-end customers, but under the bonnet Google has been following all of the links. I've put a rule in the robots.txt file to stop Google from following any dynamic pages (with a ?) and also any AJAX pages, but the pages are still indexed on Google. At the moment there are over 5,000 pages which have been indexed which I don't want on there, and I'm worried they are causing issues with my rankings. Would a redirect rule work or could someone offer any advice? https://www.google.co.uk/search?q=site:outdoormegastore.co.uk+inurl:default&num=100&hl=en&safe=off&prmd=imvnsl&filter=0&biw=1600&bih=809#hl=en&safe=off&sclient=psy-ab&q=site:outdoormegastore.co.uk+inurl%3Aajax&oq=site:outdoormegastore.co.uk+inurl%3Aajax&gs_l=serp.3...194108.194626.0.194891.4.4.0.0.0.0.100.305.3j1.4.0.les%3B..0.0...1c.1.SDhuslImrLY&pbx=1&bav=on.2,or.r_gc.r_pw.r_qf.&fp=ff301ef4d48490c5&biw=1920&bih=860
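A robots.txt rule stops compliant crawlers from fetching a URL, which also means they never get to read a noindex tag or follow a redirect on it; that may be why already-indexed pages linger. The sketch below uses Python's standard `urllib.robotparser` to show the fetch decision only. The `/ajax/` path and the example URLs are assumptions, and this parser does plain prefix matching, so it does not model Google's wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt; the site's real rules are not shown in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ajax/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawler_may_fetch(url):
    """True if a crawler obeying ROBOTS_TXT is allowed to request the URL."""
    return parser.can_fetch("*", url)


blocked = "https://www.outdoormegastore.co.uk/ajax/filter"   # assumed path
allowed = "https://www.outdoormegastore.co.uk/tents"         # assumed path

# A blocked URL is never downloaded, so a noindex tag on it is never seen.
print(blocked, crawler_may_fetch(blocked))
print(allowed, crawler_may_fetch(allowed))
```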
Technical SEO | | gavinhoman0 -
Index page 404 error
Crawl results show a 404 error for a page named index.htmk, which is under my root: http://mydomain.com/index.htmk. I have checked my index page on the server, and it is index.HTML, not index.HTMK. Please help me fix it.
Technical SEO | | semer0