Moz Q&A is closed.
After more than 13 years, and tens of thousands of questions, Moz Q&A closed on 12th December 2024. Whilst we’re not completely removing the content - many posts will still be possible to view - we have locked both new posts and new replies. More details here.
Reducing Booking Engine Indexation
-
Hi Mozzers,
I am working on a site with a very useful room booking engine. Helpful as it may be, all the variations (2 bedrooms, 3 bedrooms, room with a view, etc, etc,) are indexed by Google. Section 13 on Search Pagination in Dr. Pete's great post on Panda http://www.seomoz.org/blog/duplicate-content-in-a-post-panda-world speaks to our issue, but I was wondering since 2 (!) years have gone by, if there are any additional solutions y'all might recommend. We want to cut down on the duplicate titles and content and get the useful but not useful for SERPs online booking pages out of the index. Any thoughts?
Thanks for your help.
-
I love public Q&A because everyone gets to chip in, but nobody wants to share the domain in question (which is understandable) so that makes the job of answering a question really difficult.
Can you hide the actual domain name but provide some examples of URLs? For instance:
ourdomain.com/honolulu/four-seasons?rooms=4&view=0&page=1
Did you try any of Dr. Pete's suggestions? If not, I would implement one of those first, as they are still as relevant today as they were when he wrote them. Rel next/prev has received a bit more attention since then, but it only solves part of the problem if you're dealing with parameters beyond simple pagination (e.g. rooms, views, etc..).
From the information provided above I would probably go with a rel canonical tag to fix this issue.
I would not rely on a rel nofollow tag on links pointing to variants, as was suggested by Smarties, because Google is going to find those URLs regardless and a no follow tag on a link doesn't tell them not to index it.
Smarties #2 suggestion sounds good but I'd allow them to be followed. i.e. robots meta noindex,follow as opposed to noindex,nofollow. This allows pagerank from external links to flow through non-indexable URLs.
Good luck!
-
-
You could use rel=nofollow on links pointing to pages variations.
-
If you can you could also dynamically add a meta noindex, no follow, when a variant of the initial page is generated.
-
You could also add a link rel=canonical pointing to the initial page, this will tell bots that this page is the original page.
In other word, you have to tell crawlers when it is a page variant and that you don't want him to index them.
-
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
New Subdomain & Best Way To Index
We have an ecommerce site, we'll say at https://example.com. We have created a series of brand new landing pages, mainly for PPC and Social at https://sub.example.com, but would also like for these to get indexed. These are built on Unbounce so there is an easy option to simply uncheck the box that says "block page from search engines", however I am trying to speed up this process but also do this the best/correct way. I've read a lot about how we should build landing pages as a sub-directory, but one of the main issues we are dealing with is long page load time on https://example.com, so I wanted a kind of fresh start. I was thinking a potential solution to index these quickly/correctly was to make a redirect such as https://example.com/forward-1 -> https:sub.example.com/forward-1 then submit https://example.com/forward-1 to Search Console but I am not sure if that will even work. Another possible solution was to put some of the subdomain links accessed on the root domain say right on the pages or in the navigation. Also, will I definitely be hurt by 'starting over' with a new website? Even though my MozBar on my subdomain https://sub.example.com has the same domain authority (DA) as the root domain https://example.com? Recommendations and steps to be taken are welcome!
Intermediate & Advanced SEO | | Markbwc0 -
Can you index a Google doc?
We have updated and added completely new content to our state pages. Our old state content is sitting in a our Google drive. Can I make these public to get them indexed and provide a link back to our state pages? In theory it sounds like a great link building strategy... TIA!
Intermediate & Advanced SEO | | LindsayE1 -
Password Protected Page(s) Indexed
Hi, I am wondering if my website can get a penalty if some password protected pages are showing up when I search on google: site:www.example.com/sub-group/pass-word-protected-page That shows that my password protected page was indexed either before or after adding the password protection. I've seen people suggest no indexing the page. Is that the best method to take care of this? What if we are planning on pushing the page live later on? All of these pages have no title tag, meta description, image alt text, etc. Should I add them for each page? I am wondering what is the best step, especially if we are planning on pushing the page(s) live. Thanks for any help!
Intermediate & Advanced SEO | | aua0 -
E-Commerce Site Collection Pages Not Being Indexed
Hello Everyone, So this is not really my strong suit but I’m going to do my best to explain the full scope of the issue and really hope someone has any insight. We have an e-commerce client (can't really share the domain) that uses Shopify; they have a large number of products categorized by Collections. The issue is when we do a site:search of our Collection Pages (site:Domain.com/Collections/) they don’t seem to be indexed. Also, not sure if it’s relevant but we also recently did an over-hall of our design. Because we haven’t been able to identify the issue here’s everything we know/have done so far: Moz Crawl Check and the Collection Pages came up. Checked Organic Landing Page Analytics (source/medium: Google) and the pages are getting traffic. Submitted the pages to Google Search Console. The URLs are listed on the sitemap.xml but when we tried to submit the Collections sitemap.xml to Google Search Console 99 were submitted but nothing came back as being indexed (like our other pages and products). We tested the URL in GSC’s robots.txt tester and it came up as being “allowed” but just in case below is the language used in our robots:
Intermediate & Advanced SEO | | Ben-R
User-agent: *
Disallow: /admin
Disallow: /cart
Disallow: /orders
Disallow: /checkout
Disallow: /9545580/checkouts
Disallow: /carts
Disallow: /account
Disallow: /collections/+
Disallow: /collections/%2B
Disallow: /collections/%2b
Disallow: /blogs/+
Disallow: /blogs/%2B
Disallow: /blogs/%2b
Disallow: /design_theme_id
Disallow: /preview_theme_id
Disallow: /preview_script_id
Disallow: /apple-app-site-association
Sitemap: https://domain.com/sitemap.xml A Google Cache:Search currently shows a collections/all page we have up that lists all of our products. Please let us know if there’s any other details we could provide that might help. Any insight or suggestions would be very much appreciated. Looking forward to hearing all of your thoughts! Thank you in advance. Best,0 -
How can I make a list of all URLs indexed by Google?
I started working for this eCommerce site 2 months ago, and my SEO site audit revealed a massive spider trap. The site should have been 3500-ish pages, but Google has over 30K pages in its index. I'm trying to find a effective way of making a list of all URLs indexed by Google. Anyone? (I basically want to build a sitemap with all the indexed spider trap URLs, then set up 301 on those, then ping Google with the "defective" sitemap so they can see what the site really looks like and remove those URLs, shrinking the site back to around 3500 pages)
Intermediate & Advanced SEO | | Bryggselv.no0 -
Better to 301 or de-index 403 pages
Google WMT recently found and called out a large number of old unpublished pages as access denied errors. The pages are tagged "noindex, follow." These old pages are in Google's index. At this point, would it better to 301 all these pages or submit an index removal request or what? Thanks... Darcy
Intermediate & Advanced SEO | | 945010 -
Do internal links from non-indexed pages matter?
Hi everybody! Here's my question. After a site migration, a client has seen a big drop in rankings. We're trying to narrow down the issue. It seems that they have lost around 15,000 links following the switch, but these came from pages that were blocked in the robots.txt file. I was wondering if there was any research that has been done on the impact of internal links from no-indexed pages. Would be great to hear your thoughts! Sam
Intermediate & Advanced SEO | | Blink-SEO0 -
Google Indexing Feedburner Links???
I just noticed that for lots of the articles on my website, there are two results in Google's index. For instance: http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html and http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+thewebhostinghero+(TheWebHostingHero.com) Now my Feedburner feed is set to "noindex" and it's always been that way. The canonical tag on the webpage is set to: rel='canonical' href='http://www.thewebhostinghero.com/articles/tools-for-creating-wordpress-plugins.html' /> The robots tag is set to: name="robots" content="index,follow,noodp" /> I found out that there are scrapper sites that are linking to my content using the Feedburner link. So should the robots tag be set to "noindex" when the requested URL is different from the canonical URL? If so, is there an easy way to do this in Wordpress?
Intermediate & Advanced SEO | | sbrault740