Prevent indexing of dynamic content
-
Hi folks!
I discovered a bit of an issue with a client's site. The site consists primarily of static HTML pages; however, within one page (a car photo gallery), a line of PHP code dynamically generates 100 or so pages comprising the photo gallery, all with the same page title and meta description. The photo gallery script resides in the /gallery folder, which I attempted to block via robots.txt, to no avail.
My next step will be to include a robots meta tag within the head section of the HTML page, but I am wondering if this will stop the bots dead in their tracks, or will they still be able to pick up on the pages generated by the call to the PHP script residing a bit further down on the page?
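In other words, something along these lines in the head of the static gallery page (the exact directive is still to be decided):

  <meta name="robots" content="nofollow">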
Dino
-
Hello Steven,
Thank you for providing another perspective. However, all factors considered, I agree with Shane's approach on this one. The pages add very little value to the site and exist primarily to provide site users with eye candy (i.e., photos of classic cars).
-
Just personally, I would still deindex or canonicalize them. They are just pages with a few images, so not of much value, and unless all the titles and descriptions target varying keywords and content is added, they will cannibalize each other and possibly even drag the site down with 100 or so pages of thin content.
So from an SEO perspective it probably IS better to deindex or canonicalize. Three to five years ago the advice might have been to keep them and keyword-target them, but not in the age of content
(unless the images were optimized for image search for saleable products, but I do not think that is the case).
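If you do go the canonical route, it is just one line in the head of each generated gallery page pointing back at the main gallery listing (a sketch only; I'm assuming the listing lives at /gallery/ and using a placeholder domain):

  <link rel="canonical" href="http://www.example.com/gallery/">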
-
Hi Dino,
I know this won't solve the immediate problem you asked about, but wouldn't it be better for your client's site (and for SEO) to alter the PHP so that the title and meta description are filled in by variables that can also be dynamic, depending on which of the 100 or so pages gets created?
That way, rather than worrying about a robot seeing 100 pages as duplicate content, it could see 100 pages as 100 pages.
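As a rough sketch (the variable names here are invented; I'm assuming the gallery script already knows which car it is rendering), the head of each generated page could be built like this:

  <?php
  // Hypothetical: pull the car's name and a short, unique blurb from wherever
  // the gallery script already stores its photo data.
  $carName  = htmlspecialchars($currentCar['name']);
  $carBlurb = htmlspecialchars($currentCar['summary']);
  ?>
  <head>
    <title><?php echo $carName; ?> | Classic Car Photo Gallery</title>
    <meta name="description" content="Photo gallery of the <?php echo $carBlurb; ?>.">
  </head>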
-
It depends on how the pages are being created (I would assume it is off of a template page)
So within the template of this dynamically created page you would place a noindex meta tag.
But if this is the global template, you cannot do that, as it would noindex every page, which of course is bad (see the sketch at the end of this post).
If you want to PM me the URL of the page, I can take a look at your code and see what is going on and how to rectify it, as right now I think we are talking about the same principles but different words are being used.
It really is pretty straightforward (what I am saying): the pages that you want kept out of the index DO NOT need a nofollow; they need a meta noindex.
But there are many variables. If you have already disallowed the directory in robots.txt, then no bot will crawl those pages to pick up the updated noindex directive.
If there is no way to add a meta noindex, then you need to nofollow the links and put in a request for manual removal.
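As a sketch of what I mean (the /gallery/ check is an assumption based on what you've described; adjust it to however your template can tell these pages apart):

  <?php
  // Noindex only the dynamically generated gallery pages, so a shared or
  // global template stays safe for the rest of the site.
  // Note: /gallery/ must NOT be disallowed in robots.txt, or the bots will
  // never crawl these pages and see this tag.
  $isGalleryPage = (strpos($_SERVER['REQUEST_URI'], '/gallery/') === 0);
  ?>
  <head>
    <?php if ($isGalleryPage): ?>
      <meta name="robots" content="noindex">
    <?php endif; ?>
    <!-- rest of the head as normal -->
  </head>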
-
I completely understand and agree with all the points you have conveyed. However, I am not certain of the best approach to "noindex" the URLs that are being created dynamically from within the static HTML page. Maybe I am making this more complex than it needs to be...
-
So it is the pages themselves that are dynamically created that you want out of the index, not the page that contains the links?
If this is so ---
noindex the pages that are created dynamically
"Therein lies the problem. I did have the nofollow directive in place specifying the /gallery/ folder, but apparently, the bots still crawled it."
Nofollow does not remove a page from the index; it only tells the bot not to pass authority through the link. The bot can still reach the page, so without a noindex, nofollow is not the correct directive: the page, even though the links to it are nofollowed, is still being reached and indexed.
PS: If you do have nofollow on the links, you may want to remove it so the bots go straight through to the page and pick up the noindex directive. If you would rather not let any authority "evaporate", you can keep the nofollow, but you may then need to request removal of the dynamically generated URLs through Webmaster Tools.
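To make the distinction concrete, here is a sketch (the link URL is made up):

  <!-- On the static page that links to the gallery: rel="nofollow" only
       withholds authority from the link; it does not keep the target page
       out of the index. -->
  <a href="/gallery/photos.php?car=42" rel="nofollow">1967 Mustang photos</a>

  <!-- In the head of each dynamically generated gallery page: this is
       what actually gets the page removed from, and kept out of, the index. -->
  <meta name="robots" content="noindex">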
-
The goal is to have the page remain in the index, but not follow any dynamically generated links on the page. The nofollow directive (in place for months) has not done the job.
-
?
If a link is coming into the page and you have noindex, nofollow, this would remove the page from the index and prevent any links on it from being followed.
This is NOT instant and can take months to occur, depending on the depth of the page, crawl schedule, etc. (you can try to speed it up by using Webmaster Tools to remove the URL).
What is the goal you are attempting to achieve?
To get the page out of the index, but still have its links followed?
Or to keep it in the index, but just not follow the links on the page?
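In tag form, those two options look like this (whichever one matches your goal goes in the head of the page in question):

  <!-- Out of the index, but links on the page still followed -->
  <meta name="robots" content="noindex, follow">

  <!-- Stays in the index, but links on the page not followed -->
  <meta name="robots" content="index, nofollow">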
-
Therein lies the problem. I did have the nofollow directive in place specifying the /gallery/ folder, but apparently the bots still crawled it. I agree that noindex removes the page, but I wasn't certain whether it also prevents crawling of the page, as I have read mixed opinions on this.
I just thought of something else... perhaps an external URL is linking to this page, allowing it to be crawled. I am off to examine that possibility.
Thanks for your response!
-
Noindex will only remove the specific page (or the pages created off the template) from the index and disallow it from being indexed again, as of the next crawl of that page.
Bots will still crawl the page and follow any readable links on it, as long as there is not a nofollow directive.
I am not sure I fully understand the situation, so I would not call this my "recommendation", but rather an answer to the specific question:
"but I am wondering if this will stop the bots dead in their tracks or will they still be able to pick-up on the pages generated"
Hope this helps!