De-indexing product "quick view" pages
-
Hi there,
The e-commerce website I am working on seems to have all of its "quick view" pages (which normally appear as iframes on the category page) indexed as their own unique pages, creating thousands of duplicate pages / overly dynamic URLs. Each indexed "quick view" page has the following URL structure:
www.mydomain.com/catalog/includes/inc_productquickview.jsp?prodId=89514&catgId=cat140142&KeepThis=true&TB_iframe=true&height=475&width=700
where the only thing that changes is the product ID and category number.
Would using "Disallow" in robots.txt be the best way to de-index all of these URLs? If so, could someone help me work out how best to structure the Disallow statement? Would it be:
Disallow: /catalog/includes/inc_productquickview.jsp?prodId=*
Thanks for your help.
-
Just to add: if you block URLs in robots.txt, they won't actually get de-indexed. For all intents and purposes they will be blocked (they won't cause duplicate content issues, etc.), but they will drop into the omitted results, behind this notice:
_In order to show you the most relevant results, we have omitted some entries very similar to the 13 already displayed. If you like, you can repeat the search with the omitted results included._
They will look like this in the SERPs (see attachment). If you want them removed from the SERPs, you will need to use the robots NOINDEX meta tag or use GWMT, as William advised. Keep in mind that Google has to be able to crawl a page to see its NOINDEX tag, so the tag has no effect on URLs that robots.txt blocks.
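For reference, the NOINDEX meta tag is `<meta name="robots" content="noindex">` in the page's `<head>`. Google also honours the same directive sent as an `X-Robots-Tag` HTTP response header, which can be easier to apply to one template. The following is only a minimal sketch of the header approach, written as Python WSGI middleware for illustration; the site in question runs JSP, so the function names and the `demo_app` stand-in here are hypothetical and the real change would live in the JSP or server configuration.

```python
# Path of the quick view template, taken from the URL in the question.
QUICK_VIEW_PATH = "/catalog/includes/inc_productquickview.jsp"


def noindex_quick_view(app):
    """WSGI middleware (hypothetical): add 'X-Robots-Tag: noindex' to quick view responses."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            # Only the quick view template gets the header; other pages are untouched.
            if environ.get("PATH_INFO", "") == QUICK_VIEW_PATH:
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped


def demo_app(environ, start_response):
    # Stand-in for the real application (an assumption for this sketch).
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html>quick view</html>"]


app = noindex_quick_view(demo_app)
```

For either the tag or the header to be seen, the quick view URLs must stay crawlable, so this approach and a robots.txt Disallow for the same URLs work against each other.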
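The Disallow entry you posted will block these pages, as long as they all start that way. You don't actually need the trailing wildcard, though: it gets ignored, so you can just leave the rule open-ended. Note that matching is case-sensitive, so the parameter name has to be written exactly as it appears in the URLs (prodId). See Google's robots.txt specs.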
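To make the matching behaviour concrete, here is a rough sketch of Google-style rule matching: a rule is a case-sensitive prefix of the path plus query string, `*` matches any run of characters, and a trailing `$` anchors the end. The `rule_matches` helper is hypothetical and simplified (it is not Google's implementation and ignores Allow/Disallow precedence); it only illustrates why the trailing wildcard is redundant and why a rule written with `prodID` would miss URLs that use `prodId`.

```python
import re


def rule_matches(rule: str, url_path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path + query (simplified)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape every literal piece, then let each '*' match any run of characters.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    if anchored:
        pattern += r"\Z"
    # re.match anchors at the start, which gives prefix matching.
    return re.match(pattern, url_path) is not None


quick_view = ("/catalog/includes/inc_productquickview.jsp"
              "?prodId=89514&catgId=cat140142&KeepThis=true"
              "&TB_iframe=true&height=475&width=700")

with_wildcard = "/catalog/includes/inc_productquickview.jsp?prodId=*"
without_wildcard = "/catalog/includes/inc_productquickview.jsp?prodId="
wrong_case = "/catalog/includes/inc_productquickview.jsp?prodID="

print(rule_matches(with_wildcard, quick_view))     # True
print(rule_matches(without_wildcard, quick_view))  # True
print(rule_matches(wrong_case, quick_view))        # False
```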
-
Thanks, William. I think I will stick with the robots.txt file in this case. I am nervous about using the parameter feature in case ?prodId is used in any other URL that should be indexed.
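One way to check that concern before deciding is to scan a list of the site's URLs (from a crawl export or the sitemap, for example) for other paths that carry the same parameter. This is only a sketch: the `other_paths_using_param` helper is hypothetical, and the second and third sample URLs are made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit

# Path of the quick view template, taken from the URL in the question.
QUICK_VIEW_PATH = "/catalog/includes/inc_productquickview.jsp"


def other_paths_using_param(urls, param="prodId"):
    """Return the paths, other than the quick view, whose query string contains `param`."""
    paths = set()
    for url in urls:
        parts = urlsplit(url)
        query = parse_qs(parts.query, keep_blank_values=True)
        if parts.path != QUICK_VIEW_PATH and param in query:
            paths.add(parts.path)
    return paths


sample_urls = [
    "http://www.mydomain.com/catalog/includes/inc_productquickview.jsp?prodId=89514&catgId=cat140142",
    "http://www.mydomain.com/catalog/product.jsp?prodId=89514",       # made-up URL
    "http://www.mydomain.com/catalog/category.jsp?catgId=cat140142",  # made-up URL
]

print(other_paths_using_param(sample_urls))  # {'/catalog/product.jsp'}
```

An empty result would suggest the parameter is only used by the quick view template in the URLs that were scanned.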
-
You can use that in your robots.txt, which should stop those URLs from being crawled.
Alternatively, you can go into Webmaster Tools (WMT) and set up your URL parameters; in this case, that would be ?prodId.