Huge number of indexed pages with no content
-
Hi,
We have accidentally let Google index lots of our pages that have no useful content on them at all.
The site in question is a directory site, where we have tags and we have cities. Some cities have suppliers for almost all the tags, but there are lots of cities where we have suppliers for only a handful of tags.
The problem occurred when we created a page for each city, where we list the tags as links.
Unfortunately, our programmer listed all the tags: not only the ones where we have businesses offering their services, but all of them!
We have 3,142 cities and 542 tags. I guess you can imagine the problem this caused!
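To put a number on it, here is a minimal Python sketch of the worst case and of the fix on the city page (link only the tags that have suppliers). The city and tag counts come from the post; the function names and the example supplier data are hypothetical, made up purely for illustration.

```python
# Counts taken from the post above.
CITIES = 3_142
TAGS = 542


def worst_case_pages(cities: int, tags: int) -> int:
    """City/tag pages that exist if every tag is linked from every city page."""
    return cities * tags


def tags_to_link(supplier_counts: dict) -> list:
    """Return only the tags that have at least one supplier in this city."""
    return sorted(tag for tag, count in supplier_counts.items() if count > 0)


# Hypothetical supplier counts for one city (tag -> number of suppliers).
example_city = {"plumber": 3, "florist": 0, "caterer": 1, "dj": 0}

print(worst_case_pages(CITIES, TAGS))  # 1702964
print(tags_to_link(example_city))      # ['caterer', 'plumber']
```

With the filter in place, the city page stops producing new links to empty pages; the pages that are already indexed still need to be dealt with separately.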
Now, I know that Google might simply ignore these empty pages and not crawl them again, but when I check a city (city site:domain) with only 40 providers, I still have 1,050 pages indexed. (Yes, we have some issues between the 550 and the 1,050 as well, but first things first :))
These pages might not be crawled again, but they will be clicked, and the bounces and the overall user experience will be terrible.
My idea is to use meta noindex on all of these empty pages, and perhaps also a 301 redirect from each empty category page directly to the main page of the given city.
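A rough Python sketch of the two options, assuming a hypothetical `/<city>/<tag>` URL layout and a hypothetical `serve_city_tag_page` handler (neither is the site's real code). One thing it makes visible: a 301 response carries no page for a crawler to read, so a meta noindex tag would only be seen on pages that are actually served with a 200.

```python
from typing import NamedTuple


class Response(NamedTuple):
    status: int
    headers: dict
    body: str


def serve_city_tag_page(city: str, tag: str, supplier_count: int,
                        strategy: str = "redirect") -> Response:
    """Decide how to answer a request for a city/tag page (illustrative only)."""
    if supplier_count > 0:
        # Normal page: suppliers exist, serve it and let it be indexed.
        return Response(200, {"Content-Type": "text/html"},
                        f"<h1>{tag} in {city}</h1>")
    if strategy == "redirect":
        # Empty page: send visitors and crawlers to the city's main page.
        return Response(301, {"Location": f"/{city}/"}, "")
    # Alternative: keep the URL alive but ask search engines not to index it.
    return Response(200, {"Content-Type": "text/html"},
                    '<meta name="robots" content="noindex, follow">')


print(serve_city_tag_page("springfield", "florist", 0).status)             # 301
print(serve_city_tag_page("springfield", "florist", 0, "noindex").status)  # 200
```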
Can this work the way I imagine? Any better solution to cut this really bad nightmare short?
Thank you in advance.
Andras
-
Thank you again, John. I will fix this, based on our discussion.
-
I think noindex is slightly superfluous, as the 301 will take care of it, point people to a proper result, and give Google a redirected result.
However, SEOMoz's Robots information page suggests:
"In most cases, meta robots with parameters
"noindex, follow"
should be employed as a way to restrict crawling or indexation." So maybe consider that...
As for robots.txt, you can check out SEOMoz's Robots information page, which has information on wildcards that you could use and that I THINK would work (i.e. http://domain.com/*/tags)?
Not quite sure on that last bit though...
-
Thank you for your reply, Josh.
I will use the 301 then, but should I also use the noindex tag so that these pages are removed from the index?
Does it emphasize my intention, or does it add nothing to the process? Perhaps they should not be used together at all, as they are basically meant for different tasks.
(Unfortunately, robots.txt is not really a solution, as we have the following URL structure:
Since all the cities have at least a couple of valid tags, I can't specify a path to be excluded from indexing. I would also rather not add 2,000+ cities individually.
As for GWT, URL removal might not be an option for this number of pages either, as I have at least 100,000 no-value pages to remove (the limit is 500 per month).)
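To illustrate the robots.txt concern, here is a small Python sketch that mimics Google-style wildcard matching (`*` for any run of characters, a trailing `$` for end of path). The matcher is a simplified stand-in written for this example, and the `/<city>/<tag>` paths are hypothetical, since the real URL structure is not shown above.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern":
    """Translate a robots.txt path pattern with '*' and a trailing '$' to a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_patterns: list) -> bool:
    """True if any Disallow pattern matches the start of the path."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)


# Hypothetical case: florists exist in Shelbyville but not in Springfield.
wildcard_rule = ["/*/florist"]
print(is_blocked("/springfield/florist", wildcard_rule))  # True (empty page)
print(is_blocked("/shelbyville/florist", wildcard_rule))  # True (valid page too)
```

A wildcard on the city segment cannot tell an empty city/tag page from a populated one, so the only exact robots.txt fix would be one rule per empty combination.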
-
I would agree: just set up a 301 redirect so that users don't bounce and actually get directed to something remotely useful, even if it is just a listing of all the tags around the site, or the home page (even if you also do the below, this ensures that users who stumble on these pages are still happy).
You could also use a robots.txt file to specify which pages you don't want indexed, and finally you may also use Google's Webmaster Tools to manually remove particular pages!
A combo of all of those will work a treat!