Large Site - Advice on Subdomaining
-
I have a large news site - over 1 million pages (have already deleted 1.5 million)
Google buries many of our pages, and I'm ready to try subdomaining: http://bit.ly/dczF5y
There are two types of content - news from our contributors, and press releases.
We have had contracts with the big press release companies going back to 2004/5. They push releases to us by FTP or we pull from their server. These are then processed and published. It has taken me almost 18 months, but I have found and deleted or fixed all the duplicates I can find.
There are now two duplicate checking systems in place. One runs at the time the release comes in and handles most of them. The other one runs every night after midnight and finds a few, which are then handled manually. This helps fine-tune the real-time checker.
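For anyone wondering what a two-stage check like this might look like, here is a minimal Python sketch. It is not the actual system described above: the names `fingerprint`, `DuplicateChecker` and `nightly_sweep` are hypothetical, and it only catches releases whose text is identical after lowercasing and stripping punctuation. Near-duplicates with edited wording would need something fuzzier.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash the release body after lowercasing and collapsing punctuation/whitespace."""
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DuplicateChecker:
    """Real-time check (hypothetical): remembers the first release seen for each fingerprint."""

    def __init__(self):
        self._seen = {}  # fingerprint -> id of the first release with that text

    def check(self, release_id: str, body: str):
        """Return the id of an earlier identical release, or None after recording this one."""
        fp = fingerprint(body)
        original = self._seen.get(fp)
        if original is not None and original != release_id:
            return original
        self._seen[fp] = release_id
        return None


def nightly_sweep(releases: dict) -> list:
    """Batch pass over {release_id: body}; returns (duplicate_id, original_id) pairs to review by hand."""
    checker = DuplicateChecker()
    pairs = []
    for release_id, body in releases.items():
        original = checker.check(release_id, body)
        if original is not None:
            pairs.append((release_id, original))
    return pairs
```

The real-time path would call `check` as each release arrives, while the nightly job re-scans everything and hands the leftover pairs to a person.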
Businesses often link to their release on the site because they like us. Sometimes Google likes this, sometimes not.
The news we process is reviewed by 1, 2 or 3 editors before publishing. Some of the stories are 100% unique to us. Some are from contributors who also contribute to other news sites.
Our search traffic is down by 80%. This has almost destroyed us, but I don't give up easily. As I said, I've done a lot of projects to try to fix this. Not one of them has done any good, so there is something Google doesn't like and I haven't yet worked it out. A lot of people have looked and given me their ideas, and I've tried them - zero effect.
Here is an interesting and possibly important piece of information:
Most of our pages are "buried" by Google. If I search, even for a headline, even if it is unique to us, quite often the page containing it will not appear in the SERP. The front page may show up, an index page may show up, another strong page may show up if that headline is in the top 10 stories for the day, but the page itself may not show up at all - UNTIL I go to the end of the results and redo the search with the "duplicates" included. Then it will usually show up, on the front page, often in position #2 or #3.
According to Google, there are no manual actions against us. There are also no notices in WMT that say there is a problem that we haven't fixed.
You may tell me to just delete all of the PRs - but those are there for business readers, as they always have been. Google supposedly wants us to build websites for readers, which we have always done. What they really mean is - build it the way we want you to do it, because we know best.
What really peeves me is that there are other sites that consistently rank above us, that have all the same content as us, and seem to be 100% aggregators, with ads, with nothing really redeeming them as being different. This is (I think) inconsistent and confusing, and it doesn't help me work out what to do next.
Another thing we have is about 7,000+ US military stories, all the way back to 2005. We were one of the few news sites supporting the troops when it wasn't fashionable to do so. They were emailing the stories to us directly, most with photos. We published every one of them, and we still do. I'm not going to throw them under the bus, no matter what happens.
There were some duplicates, some due to screwups because we had multiple editors who didn't see that a story was already published. Also at one time, a system code race condition - entirely my fault, I am the programmer as well as the editor-in-chief. I believe I have fixed them all with redirects.
I haven't sent in a reconsideration for 14 months, since they said "No manual spam actions found" - I don't see any point, unless you know something I don't.
So, having exhausted all of the things I can think of, I'm down to my last few options.
1. Split all of the PRs off into subdomains (I'm ready to pull the trigger later this week)
2. Do what the other sites do, that I believe create little value, which is show only a headline and snippet and some related info and link back to the original page on the PR provider website. (I really don't want to do this)
3. Give up on the PRs and delete them all and lose another 50% of the income, which means releasing our remaining staff and upsetting all of the companies and people who linked to us. (Or find them all and rewrite them as stories - tens of thousands of them) and also throw all our alliances under the bus (I really don't want to do this)
There is no guarantee this is the problem, but Google won't tell me, the Google forums are crap, and nobody else has given me an idea that has helped.
My thought is that splitting them off into subdomains will have a number of effects.
1. Take most of the syndicated content onto subdomains, so it's not on the main domain.
2. Shake up the Domain Authority
3. Create a million 301 redirects.
4. Make it obvious to the crawlers what is our news and what is PRs
5. Make it easier for Google News to understand
Here is what I plan to do
1. Redirect all PRs to their own subdomain:
   pn.domain.com for PRNewswire releases
   bw.domain.com for Businesswire releases
   etc.
2. Fix all references so they use the new subdomain
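A rough sketch of the URL mapping behind those two steps, in Python. Only the subdomain names pn.domain.com and bw.domain.com come from the plan above. The main host `www.domain.com`, the path prefixes `/prnewswire/` and `/businesswire/`, and the function name `redirect_target` are assumptions made for illustration, since the real URL layout isn't given here.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

MAIN_HOST = "www.domain.com"  # placeholder for the main news domain (assumed)

# Hypothetical path prefixes; the site's real URL scheme is not known.
PROVIDER_SUBDOMAINS = {
    "/prnewswire/": "pn.domain.com",
    "/businesswire/": "bw.domain.com",
}


def redirect_target(url: str) -> Optional[str]:
    """Return the subdomain URL a press release should 301 to, or None to leave it alone."""
    parts = urlsplit(url)
    if parts.netloc != MAIN_HOST:
        return None  # already on a subdomain, or not our site
    for prefix, host in PROVIDER_SUBDOMAINS.items():
        if parts.path.startswith(prefix):
            # Keep scheme, path and query unchanged so each old URL maps to exactly one new URL.
            return urlunsplit((parts.scheme, host, parts.path, parts.query, ""))
    return None  # ordinary news pages stay on the main domain
```

The same function could serve both steps: the web server answers with a 301 to `redirect_target(url)` when it is not None, and a one-off script rewrites internal links using the same mapping so they point straight at the subdomain rather than through a redirect.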
Here are my questions - and I hope you may see something I haven't considered.
1. Do you have any experience of doing this?
2. What was the result?
3. Any tips?
4. Should I put PR index pages on the subdomains too? I was originally planning to keep them on the main domain, with the individual page links pointing to the actual release on the subdomain. Obviously, I want them only in one place, but there are two types of these index pages.
a) all of the releases for a particular PR company - these certainly could be on the subdomain and not on the main domain
b) Various category index pages - agriculture, supermarkets, mining etc. These would have to stay on the main domain because they are a mixture of different PR providers.
5. Is this a bad idea?
I'm almost out of ideas. Should I add a condensed list of everything I've done already?
If you are still reading, thanks for hanging in.
-
I am ready to shout "NO" anytime I see anyone talking about using a subdomain... but you have made me consider it.
I have a site (not nearly as large as yours) that has a lot of press releases given to us by government agencies and industry sites. (These are not SEO press releases, they are from people who have a message to get out.) For a few years these were about 2/3 of the content that we published. They ranked really well - often above the original source and we added a bit of unique commentary to each of them.
In October '11 we took a Panda hit. Fortunately we simply dropped a couple positions on tons of pages - not a huge loss like some people see.
To escape we did noindex / follow to the most popular releases and threw the rest overboard. That cut off search traffic (which hit our income) but kept the content on our site for visitors. Google rankings returned partially a couple months later and then a couple months after that everything was back to normal - but income was down a little.
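For readers who haven't used it, "noindex / follow" is a robots directive that asks search engines to keep a page out of the index while still following its links. Here is a small Python sketch of how a page template might emit it. The function name `robots_meta` and the `is_press_release` flag are made up for illustration, not how my site actually does it.

```python
# The standard robots meta tag for "keep out of the index, but follow the links".
NOINDEX_FOLLOW = '<meta name="robots" content="noindex, follow">'


def robots_meta(is_press_release: bool) -> str:
    """Return the tag to place in <head> for press releases, or an empty string for normal pages."""
    return NOINDEX_FOLLOW if is_press_release else ""
```

The same directive can also be sent as an `X-Robots-Tag: noindex, follow` HTTP response header instead of a meta tag.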
After reading your post, I am thinking about starting a subdomain for these press releases. Hopefully that will isolate the main site from any damage that the duplicate content might cause and allow them to pull a little traffic from search.
About "index pages".... If I do this I will keep them on my main site because most of my content is there (my own unique content). Since Panda I have become highly selective about which press releases I publish and they are now a minority of new content.
Thanks for the idea.
Good luck with your site.