Huge number of indexed pages with no content
-
Hi,
We have accidentally let Google index lots of our pages that have no useful content on them at all.
The site in question is a directory site, where we have tags and we have cities. Some cities have suppliers for almost all the tags, but there are lots of cities where we have suppliers for only a handful of tags.
The problem occurred when we created a page for each city, where we list the tags as links.
Unfortunately, our programmer listed all the tags: not only the ones where we have businesses offering their services, but all of them!
We have 3,142 cities and 542 tags. I guess you can imagine the problem this caused!
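To put a rough number on it, here is a quick back-of-the-envelope sketch in Python. It assumes that every city/tag combination ended up with its own URL, which is how the problem is described above; the variable names are only illustrative.

```python
# Figures quoted in the post above.
cities = 3_142
tags = 542

# Assumption: one URL exists for every (city, tag) combination,
# whether or not any supplier is listed there.
possible_pages = cities * tags

print(f"{possible_pages:,} possible city/tag pages")
```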
Now I know that Google might simply ignore these empty pages and not crawl them again, but when I check a city (city site:domain) with only 40 providers, I still have 1,050 pages indexed. (Yes, we have some issues between the 550 and the 1,050 as well, but first things first :))
These pages might not be crawled again, but they will be clicked, and the bounces and the whole user experience will be terrible.
My idea is to use meta noindex for all of these empty pages, and perhaps also a 301 redirect from each empty category page directly to the main page of the given city.
Can this work the way I imagine? Is there any better solution to cut this nightmare short?
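A minimal sketch of that idea in Python, to make the question concrete. Everything named here is hypothetical: the `SUPPLIER_TAGS` data, the `/city/tag` path layout, and both function names are assumptions for illustration, not how the site is actually built. It shows two things: the city page linking only to tags that have suppliers, and an empty city/tag URL answering with a 301 to the city's main page.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical data: which tags actually have suppliers in each city.
SUPPLIER_TAGS: Dict[str, Set[str]] = {
    "budapest": {"plumber", "florist"},
    "szeged": {"plumber"},
}


def tags_to_link(city: str) -> List[str]:
    """Tags to list as links on the city page: only those with suppliers."""
    return sorted(SUPPLIER_TAGS.get(city, set()))


def respond(city: str, tag: str) -> Tuple[int, str]:
    """Return (HTTP status, path) for a city/tag URL.

    Assumed path layout: /<city>/<tag>. Combinations with no suppliers
    get a 301 to the city's main page; the rest are served normally.
    """
    if tag in SUPPLIER_TAGS.get(city, set()):
        return 200, f"/{city}/{tag}"
    return 301, f"/{city}"


print(tags_to_link("szeged"))
print(respond("szeged", "florist"))
print(respond("budapest", "florist"))
```

Filtering the links on the city page addresses the cause (no new links to empty pages), while the redirect handles the empty URLs that are already indexed.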
Thank you in advance.
Andras
-
Thank you again, John. I will fix this, based on our discussion.
-
I think noindex is slightly superfluous, as the 301 will take care of it: it points people to a proper result and gives Google a redirected result.
However, SEOMoz's Robots information page suggests:
"In most cases, meta robots with parameters "noindex, follow" should be employed as a way to restrict crawling or indexation."
So maybe consider that...
As for robots.txt, you can check out SEOMoz's Robots information page, which has information on wildcards that you could use and which I THINK would work (i.e. http://domain.com/*/tags)?
Not quite sure on that last bit though...
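For what it's worth, here is a rough Python sketch of how a wildcard rule like that would match, so it can be tried against sample paths before touching the live robots.txt. It assumes the wildcard extension honoured by Google and the other major engines (`*` matches any run of characters, a trailing `$` anchors the end of the URL) and that rules are written as paths rather than full URLs. The `/*/tags` rule and the sample paths are made up, and both function names are just for illustration.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern behaves as a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_pattern: str) -> bool:
    """True if the path would be matched by the Disallow pattern."""
    return robots_pattern_to_regex(disallow_pattern).match(path) is not None


# Hypothetical rule and paths.
print(is_blocked("/budapest/tags/plumber", "/*/tags"))
print(is_blocked("/budapest", "/*/tags"))
```

Note that a rule like this matches every URL with that shape, so it cannot tell an empty tag page from one that has suppliers.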
-
Thank you for your reply, Josh.
I will then use the 301, but should I also use the noindex tag so that these pages are removed from the index?
Does it emphasize my intention, or does it add nothing extra to the process? Perhaps they should not be used together at all, as they are basically meant for different tasks.
(Unfortunately, robots.txt is not really a solution, as we have the following url structure:
Since all the cities have at least a couple of valid tags, I can't specify a path to be excluded from indexing. I would also rather not add 2,000+ cities individually.
As for GWT, URL removal might also not be an option for this number of pages, as I have at least 100,000 no-value pages to be removed (the limit is 500 per month).)
-
I would agree: just set up a 301 redirect so that users don't bounce and actually get directed to something remotely useful, even if it is just a listing of all the tags around the site or a home page. (Do this even if you also do the below, to ensure users who stumble on these pages are still happy.)
You could also use a robots.txt file to show which pages you don't want to be indexed, and finally you may also use Google's Webmaster Tools to manually remove particular pages!
A combo of all of those will work a treat!