Onsite calendar throwing out thousands of pages
-
Hi guys - I've just stumbled across an onsite calendar that's throwing out hundreds of indexable pages (some are already indexed). Most of the pages are basically blank - just a date and the calendar design on the page. How would you deal with this issue? I was thinking noindex, but would prefer a solution where the calendar isn't throwing out so many pages to begin with!
Look forward to reading your thoughts, Luke
-
Hi Luke
Matt has the right idea. If the pages are going to "exist", you should block search engines from crawling them with the robots.txt file.
I would get your dev to help, but basically you'd find the folder or path where you want the crawler to stop. Maybe it's /month/ or something, and you'd block that in robots.txt.
Ian covers this in his recent article about "Spider Traps". And you can also read about robots.txt on Moz or on Google.
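To make the robots.txt idea concrete, here is a minimal sketch in Python that checks a rule before it goes live. It assumes the calendar lives under /month/ (only the guess from above, not the real path) and uses www.example.com as a placeholder domain. The standard-library parser does simple prefix matching, so it is a rough sanity check rather than a model of how any particular search engine reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "/month/" is only a guess at the calendar path,
# and the real path on the site may well be different.
ROBOTS_TXT = """\
User-agent: *
Disallow: /month/
"""


def is_crawlable(url: str, robots_txt: str = ROBOTS_TXT, agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Placeholder URLs: one calendar page, one ordinary page.
    for url in (
        "https://www.example.com/month/2017-03",
        "https://www.example.com/about",
    ):
        print(url, "crawlable" if is_crawlable(url) else "blocked")
```

Running a handful of real calendar and non-calendar URLs through a check like this is a cheap way to confirm the rule blocks only what you intend.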
-
Personally, I'd think noindex/nofollow would be a decent solution, provided you don't mind those pages never ranking. You could also block the calendar in robots.txt.
-
Hi Matt - yes, trying not to upset the web dev by posting a link (though I can do so privately if needed)! The CMS is Drupal and the calendar is hand-coded in, it seems (and therein lies the problem). Every day, week and month you can think of creates a unique URL, which isn't very helpful. Most of the days, weeks and months into the future are blank - you just get a box on the page with, say, March 2017 - and nothing else. I was thinking noindex may be a quick solution (the best solution would be to remove the calendar), though I'm not sure whether that will protect me from all issues. Do I really want crawlers heading through hundreds/thousands of empty pages? Perhaps I should noindex, nofollow?
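
One way to picture the noindex, nofollow option is to apply it only to the empty pages. The sketch below is illustrative Python, not Drupal code: the function name and the idea of passing in an event count are assumptions, and the real template would need its own way of knowing whether a given day, week or month has anything on it.

```python
def robots_meta_tag(events_on_page: int) -> str:
    """Return a robots meta tag for one calendar page.

    Hypothetical helper: pages with no events get "noindex, nofollow",
    pages that actually list events stay indexable.
    """
    if events_on_page > 0:
        return '<meta name="robots" content="index, follow">'
    return '<meta name="robots" content="noindex, nofollow">'


if __name__ == "__main__":
    # An empty future month versus a month with three events.
    print(robots_meta_tag(0))
    print(robots_meta_tag(3))
```

Bear in mind that a crawler has to fetch a page to see this tag, so it won't be read on any URL that is also blocked in robots.txt.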
-
Hi Luke! It might help if you can let us know how the calendar is set up. Is it embedded from a third party? Is it some sort of plugin? And what CMS are you using?
The more information you can provide about the calendar and your site, the better. Bonus points if you can provide some URLs.