Prevent indexing of dynamic content
-
Hi folks!
I discovered a bit of an issue with a client's site. The site consists primarily of static HTML pages; however, within one page (a car photo gallery), a line of PHP code dynamically generates 100 or so pages comprising the photo gallery, all with the same page title and meta description. The photo gallery script resides in the /gallery folder, which I attempted to block via robots.txt, to no avail.
My next step will be to include a robots meta tag within the head section of the HTML page, but I am wondering whether this will stop the bots dead in their tracks, or whether they will still be able to pick up on the pages generated by the call to the PHP script a bit further down the page.
Dino
-
Hello Steven,
Thank you for providing another perspective. However, all factors considered, I agree with Shane's approach on this one. The pages add very little merit to the site and exist primarily to provide the site users with eye-candy (e.g. photos of classic cars).
-
Personally, I would still deindex or canonicalize them. They are just pages with a few images, so they are not of much value, and unless the titles and descriptions all target different keywords and content is added, they will cannibalize each other and possibly even drag down the site because of hundreds of pages of thin content.
So from an SEO perspective it probably IS better to deindex or canonicalize. Three to five or so years ago the advice might have been to keep them and target keywords, but not in the age of content
(unless the images were optimized for image searches for saleable products, but I do not think they are).
-
Hi Dino,
I know this won't solve the immediate problem you asked about, but wouldn't it be better for your client's site (and for SEO) to alter the PHP so that the title and meta description are filled from variables that can also be dynamic, depending on which of the 100 or so pages gets created?
That way, rather than worrying about a robot seeing 100 pages as duplicate content, it could see 100 pages as 100 pages.
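As a rough illustration of the idea above, here is a minimal sketch of per-page titles and descriptions. It is written in Python for brevity, whereas the actual gallery script is PHP; the function name `gallery_head`, the "Classic Car Gallery" label, and the wording of the title and description are all assumptions made up for this example.

```python
from html import escape


def gallery_head(car_name: str, photo_number: int, total_photos: int) -> str:
    """Build a <head> fragment whose title and description vary per gallery page.

    Hypothetical helper: the real site would do the equivalent inside its PHP template.
    """
    title = f"{car_name} - Photo {photo_number} of {total_photos} | Classic Car Gallery"
    description = (
        f"Photo {photo_number} of {total_photos} of the {car_name} "
        "in our classic car photo gallery."
    )
    # escape() guards against quotes or angle brackets in the car name.
    return (
        f"<title>{escape(title)}</title>\n"
        f'<meta name="description" content="{escape(description)}">'
    )


if __name__ == "__main__":
    print(gallery_head("1957 Chevrolet Bel Air", 3, 100))
```

Each generated page then carries its own title and description instead of sharing one pair across the whole gallery.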
-
It depends on how the pages are being created (I would assume from a template page).
So within the template of this dynamically created page you would place a meta noindex tag.
But if this is the global template, you cannot do this, as it would noindex every page, which of course is bad.
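One way around the global-template problem is to emit the tag conditionally. The sketch below is only an illustration in Python (the real template is PHP); the helper name `robots_meta` and the `/gallery/` prefix check are assumptions about how the site could tell gallery pages apart from the rest.

```python
def robots_meta(path: str, noindex_prefix: str = "/gallery/") -> str:
    """Return a robots meta tag for pages under noindex_prefix, else an empty string.

    Hypothetical helper: a shared template would call this once per request
    and print the result inside <head>.
    """
    if path.startswith(noindex_prefix):
        return '<meta name="robots" content="noindex">'
    return ""


if __name__ == "__main__":
    print(repr(robots_meta("/gallery/photo.php?id=3")))  # gallery page gets the tag
    print(repr(robots_meta("/index.html")))              # other pages get nothing
```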
If you want to PM me the URL of the page, I can take a look at your code and see what is going on and how to rectify it, as right now I think we are talking about the same principles but using different words.
What I am saying really is pretty straightforward: the pages that you do not want indexed DO NOT need a nofollow, they need a meta noindex.
But there are many variables: if you have already disallowed the directory in robots.txt, then no bot will go there to pick up the new noindex directive.
If there is no way to add a meta noindex, then you need to nofollow the links and put in a manual removal request.
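To check whether a robots.txt rule would keep a crawler from ever seeing the noindex tag, something like the following sketch can help. It uses Python's standard `urllib.robotparser`; the robots.txt contents and the example.com URLs are assumptions standing in for the client's real files.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt, mirroring the /gallery block described in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /gallery/
"""


def can_see_noindex(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """True if robots.txt lets the agent fetch the URL, so it can read a meta noindex there."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/gallery/photo.php?id=3",
                "http://example.com/index.html"):
        print(url, can_see_noindex(ROBOTS_TXT, url))
```

If this returns False for the gallery URLs, a well-behaved bot will not fetch them, so a noindex tag on those pages goes unread until the Disallow rule is lifted.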
-
I completely understand and agree with all the points you have made. However, I am not certain of the best approach to "noindex" the URLs that are being created dynamically from within the static HTML page. Maybe I am making this more complex than it needs to be...
-
So it is the dynamically created pages themselves that you want out of the index, not the page that contains the links?
If this is so:
Noindex the pages that are created dynamically.
"Therein lies the problem. I did have the nofollow directive in place specifying the /gallery/ folder, but apparently, the bots still crawled it."
Nofollow does not remove a page from the index; it only tells the bot not to pass authority. It is still feasible that the bot will crawl the link, so without a noindex, nofollow is not the correct directive, because the page (even though nofollowed) is still being reached and indexed.
PS. If you have nofollow on the links, you may want to remove it so the bots go straight through to the pages and pick up the noindex directive. If you want to try not to let any authority "evaporate", you can keep the nofollow, but you may then need to request removal of the dynamically generated pages (URLs) using webmaster tools.
-
The goal is to have the page remain in the index, but not follow any dynamically generated links on the page. The nofollow directive (in place for months) has not done the job.
-
If a link is coming into the page and you have noindex, nofollow, this would remove the page from the index and prevent any links on it from being followed.
This is NOT instant and can take months, depending on the depth of the page, crawl schedule, etc. (you can try to speed it up by using webmaster tools to remove the URL).
What is the goal you are attempting to achieve?
To get the page out of the index, but still followed?
Or to remain in the index, but not have the links on the page followed?
-
Therein lies the problem. I did have the nofollow directive in place specifying the /gallery/ folder, but apparently, the bots still crawled it. I agree that the noindex removes the page, but I wasn't certain if it prevented crawling of the page, as I have read mixed opinions on this.
I just thought of something else... perhaps an external url is linking to this page - allowing it to be crawled. I am off to examine this perspective.
Thanks for your response!
-
Noindex will only remove the page from the index and disallow indexing of the specific page (or pages created from the template) you place the tag in, and it takes effect at the next crawl of that page.
Bots will still crawl the page and follow any readable links, as long as there is no nofollow directive.
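To make the difference between the two directives concrete, here is a small sketch that reads a page's robots meta tag and reports what it asks of a crawler. It uses Python's standard `html.parser`; the class and function names are made up for this example, and it only looks at `<meta name="robots">`, not at HTTP headers or per-link `rel="nofollow"` attributes.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html: str) -> dict:
    """Summarize whether the page may be indexed and whether its links may be followed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return {
        "indexable": "noindex" not in parser.directives,
        "links_followed": "nofollow" not in parser.directives,
    }


if __name__ == "__main__":
    print(robots_directives('<head><meta name="robots" content="noindex"></head>'))
    print(robots_directives('<head><meta name="robots" content="nofollow"></head>'))
```

A page with only noindex is kept out of the index while its links are still followed; a page with only nofollow stays indexable, which matches the behavior described in this thread.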
I am not sure I fully understand the situation, so I would not call this my "recommendation", but rather an answer to the specific question:
"but I am wondering if this will stop the bots dead in their tracks or will they still be able to pick up on the pages generated"
Hope this helps!