Onsite calendar throwing out thousands of pages
-
Hi guys - I've just stumbled across an onsite calendar that's throwing out hundreds of indexable pages (some are already indexed) - most of the pages are basically blank: just a day, a date, and the calendar design on the page. How would you deal with this issue? I was thinking noindex, but I'd prefer a solution where the calendar isn't throwing out so many pages to begin with!
Look forward to reading your thoughts, Luke
-
Hi Luke
Matt has the right idea. If the pages are going to "exist", you should block search engines from crawling them with the robots.txt file.
I would get your dev to help, but basically you'd find the folder or path you want crawlers to stay out of. Maybe it's /month/ or something, and you'd block that in robots.txt.
Ian covers this in his recent article about "Spider Traps". And you can also read about robots.txt on Moz or on Google.
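A minimal sketch of what that robots.txt block might look like - the /month/, /week/, and /day/ paths here are hypothetical, so match them to the URLs the calendar actually generates:

    # Keep crawlers out of the auto-generated calendar paths
    # (hypothetical paths -- check the real URL structure first)
    User-agent: *
    Disallow: /month/
    Disallow: /week/
    Disallow: /day/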
-
Personally, I'd think noindex/nofollow would be a decent solution, provided you don't mind those pages never ranking. You could also block the calendar in robots.txt.
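For reference, the noindex/nofollow option is a one-line robots meta tag in the head of each calendar page:

    <meta name="robots" content="noindex, nofollow">

One caveat: don't combine this with a robots.txt block on the same pages - if crawlers can't fetch a page, they never see the meta tag, so pick one approach or the other.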
-
Hi Matt - yes, I'm trying not to upset the web dev by posting the link (though I can share it privately if needed)! The CMS is Drupal and the calendar is hand-coded in, it seems (and therein lies the problem) - every day, month, and week you can think of creates a unique URL, which isn't very helpful - most of the days, months, and weeks into the future are blank. You just get a box on the page with, say, March 2017, and nothing else. I was thinking noindex may be a quick solution (the best solution would be to remove the calendar) - though I'm not sure that protects me from every issue. Do I really want crawlers heading through hundreds or thousands of empty pages? Perhaps I should noindex, nofollow?
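If editing the Drupal templates is off the table, one alternative (an assumption on my part, not something suggested in the thread) is a noindex header set at the server level. A sketch for Apache 2.4 with mod_headers, assuming the calendar pages share a /calendar/ path prefix:

    # Send a noindex, nofollow header for anything under /calendar/
    # (the path is hypothetical -- match it to the real calendar URLs)
    <If "%{REQUEST_URI} =~ m#^/calendar/#">
        Header set X-Robots-Tag "noindex, nofollow"
    </If>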
-
Hi Luke! It might help if you can let us know how the calendar is set up. Is it embedded from a third party? Is it some sort of plugin? And what CMS are you using?
The more information you can provide about the calendar and your site, the better. Bonus points if you can provide some URLs.
Related Questions
-
Duplicate pages and Canonicals
Hi all, our website has more than 30 pages that are duplicates, so canonicals have been deployed so that only 10 of these pages show up. Does having more of these duplicate pages impact rankings? Thanks
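For reference, that deployment is just a link element in the head of each duplicate page, pointing at the version that should show up (the URL here is hypothetical):

    <!-- in the <head> of each duplicate page (hypothetical URL) -->
    <link rel="canonical" href="https://example.com/preferred-version/">

Generally, duplicates consolidated this way shouldn't drag down rankings - the point of the canonical is to consolidate signals onto the chosen URL.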
Intermediate & Advanced SEO | vtmoz
-
Unlimited Product Pages
While browsing through my Moz campaign, I noticed that my site is pulling up unlimited numbers of product pages even though no products appear on them, e.g.:
http://www.interstellarstore.com/star-trek-memorabilia?page=16
http://www.interstellarstore.com/star-trek-memorabilia?page=100
http://www.interstellarstore.com/star-trek-memorabilia?page=200
I have no idea how to resolve this issue. I can't possibly 301 an unlimited number of pages, and I can see this being a big SEO problem. Any thoughts?
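Not something raised in the thread, but since the empty URLs follow a predictable ?page=N pattern, one possible fix is a single pattern-based 301 rather than one redirect per URL. A sketch for Apache, assuming real listings never go past page 15 (that cutoff is invented - set it from the actual page count):

    # Redirect ?page=N back to the category when N is out of range
    # (the cutoff of 15 is an assumption -- adjust to the real count)
    RewriteEngine On
    RewriteCond %{QUERY_STRING} ^page=(1[6-9]|[2-9][0-9]|[0-9]{3,})$
    RewriteRule ^star-trek-memorabilia$ /star-trek-memorabilia? [R=301,L]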
Intermediate & Advanced SEO | moon-boots
-
Links / Top Pages by Page Authority ==> pages shouldn't be there
I checked my site's links and top pages by Page Authority. I don't understand what I found, because the first 5-10 pages no longer exist! You should know that we launched a new site and rebuilt the static pages, so there are a lot of new pages, and of course we deleted some old ones. I refreshed the sitemap.xml (these pages are not in there) and uploaded it in GWT. Why do those old pages appear under the links menu in top pages by Page Authority? How can I get rid of them? Thanks, Endre
Intermediate & Advanced SEO | Neckermann
-
"No index" page still shows in search results and paginated pages shows page 2 in results
I have "no index, follow" on some pages, which I set 2 weeks ago. Today I see one of these pages showing in Google Search Results. I am using rel=next prev on pages, yet Page 2 of a string of pages showed up in results before Page 1. What could be the issue?
Intermediate & Advanced SEO | khi5
-
On-page optimization - Am I doing it well?
Hi Mozzers, I'm sitting here going through our site and optimizing all of our content. For the most part we've just written without correct keyword research, so the content lacks focus. Here is a page I would consider finished - http://www.consumerbase.com/international-mailing-lists.html. I have our KWs in the URL, the title tag, the meta description, bolded in the content, and in the image alt attribute. If I optimize my other pages like this, will I be good? It feels a tiny bit stuffed to me, but SEOmoz's on-page tool gives me glowing numbers. Thanks!
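As an illustration of those placements, a minimal hypothetical sketch - the keyword is taken from the linked page, but the markup itself is invented, not copied from the site:

    <!-- URL contains the keyword: /international-mailing-lists.html -->
    <head>
      <title>International Mailing Lists | Example Co.</title>
      <meta name="description"
            content="Targeted international mailing lists for direct marketers.">
    </head>
    <body>
      <h1>International Mailing Lists</h1>
      <p>Our <strong>international mailing lists</strong> cover ...</p>
      <img src="map.png" alt="Coverage map for international mailing lists">
    </body>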
Intermediate & Advanced SEO | Travis-W
-
Blocking Pages Via Robots, Can Images On Those Pages Be Included In Image Search
Hi! I have pages within my forum where visitors can upload photos. When they upload photos they provide a simple statement about the photo, but no real information about the image - definitely not enough for the page to be deemed worthy of being indexed. The industry, however, is one that really leans on images, and having the images in Google Image search is important to us. The URL structure is like this: domain.com/community/photos/~username~/picture111111.aspx I wish to block the whole folder from Googlebot to prevent these low-quality pages from being added to Google's main SERP results. That would be something like this:
    User-agent: googlebot
    Disallow: /community/photos/
Can I disallow Googlebot specifically, rather than using User-agent: *, so that googlebot-image can still pick up the photos? I plan on configuring a way to add meaningful alt attributes and image names to assist in visibility, but the actual act of blocking the pages and getting the images picked up... Is this possible? Thanks! Leona
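A sketch of how that could look. One caveat: per Google's documentation, Googlebot-Image falls back to the Googlebot group when it has no group of its own, so it needs an explicit group to be treated differently:

    # Keep the photo pages out of the main index
    User-agent: Googlebot
    Disallow: /community/photos/

    # Googlebot-Image obeys its own group instead of the one above,
    # so the image crawler can still reach the photos
    User-agent: Googlebot-Image
    Allow: /community/photos/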
Intermediate & Advanced SEO | HD_Leona
-
301 - should I redirect entire domain or page for page?
Hi, We recently enabled a 301 on our domain from our old website to our new website. On the advice of fellow Mozzers, we copied the old site exactly to the new domain, then did the 301 so that the sites are identical. The question is: should we be doing the 301 as a whole-domain redirect, i.e. www.oldsite.com is now > www.newsite.com, or individually setting each page, i.e. www.oldsite.com/page1 is now www.newsite.com/page1, and so on for each page on our site? Remember that both old and new sites (for now) are identical copies. Also, we set the 301 about 5 days ago and have verified it's working, but we haven't seen a single change in rank for either the old site or the new one - is this because Google likely hasn't re-indexed yet? Thanks, Anthony
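For what it's worth, when the two sites are exact copies, a single host-level rule effectively is a page-for-page redirect - each URL maps to the same path on the new domain without listing pages individually. A sketch for Apache, using the placeholder domains from the question:

    # .htaccess on the old domain: send every URL to the same path
    # on the new domain (oldsite/newsite are the question's placeholders)
    RewriteEngine On
    RewriteCond %{HTTP_HOST} ^(www\.)?oldsite\.com$ [NC]
    RewriteRule ^(.*)$ https://www.newsite.com/$1 [R=301,L]

On the ranking question: five days is early - rank changes usually lag until Google recrawls the old URLs and processes the redirects.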
Intermediate & Advanced SEO | Grenadi
-
There's a website I'm working with that has a .php extension. All the pages do. What's the best practice to remove the .php extension across all pages?
Client wishes to drop the .php extension on all their pages (they've got around 2k pages). I assured them that wasn't necessary. However, in the event that I do end up doing this, what's the best-practice (and easiest) way to do it? This is also a WordPress site. Thanks.
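One common approach - a sketch, not tested against this site. Note that WordPress permalinks set under Settings > Permalinks normally avoid .php anyway, so this mainly matters for any non-WordPress .php files: 301 the old .php URLs to extensionless versions, then serve the file internally:

    RewriteEngine On
    # 301 direct requests for /page.php to /page
    RewriteCond %{THE_REQUEST} \s/([^\s?]+)\.php[\s?] [NC]
    RewriteRule ^ /%1 [R=301,L]
    # Internally map /page back to page.php when the file exists
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteCond %{REQUEST_FILENAME}\.php -f
    RewriteRule ^(.*)$ $1.php [L]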
Intermediate & Advanced SEO | digisavvy