Large Site - Advice on Subdomaining
-
I have a large news site with over 1 million pages (I have already deleted 1.5 million).
Google buries many of our pages, and I'm ready to try subdomaining: http://bit.ly/dczF5y
There are two types of content - news from our contributors, and press releases.
We have had contracts with the big press release companies going back to 2004/5. They push releases to us by FTP or we pull from their server. These are then processed and published. It has taken me almost 18 months, but I have found and deleted or fixed all the duplicates I can find.
There are now two duplicate checking systems in place. One runs at the time the release comes in and handles most of them. The other one runs every night after midnight and finds a few, which are then handled manually. This helps fine-tune the real-time checker.
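A two-stage check like the one described above could be sketched roughly as follows. This is only an illustration, not the site's actual code: the names `fingerprint`, `RealTimeChecker` and `nightly_candidates`, the normalization rule, and the 0.9 similarity threshold are all assumptions.

```python
import hashlib
import re
from difflib import SequenceMatcher
from itertools import combinations
from typing import Dict, List, Optional, Tuple


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing punctuation and whitespace."""
    normalized = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class RealTimeChecker:
    """Hypothetical real-time stage: catches releases whose normalized text was already seen."""

    def __init__(self) -> None:
        # fingerprint -> ID of the first release published with that text
        self.seen: Dict[str, str] = {}

    def check(self, release_id: str, body: str) -> Optional[str]:
        """Return the ID of the earlier duplicate, or None if the release is new."""
        fp = fingerprint(body)
        if fp in self.seen:
            return self.seen[fp]
        self.seen[fp] = release_id
        return None


def nightly_candidates(releases: Dict[str, str], threshold: float = 0.9) -> List[Tuple[str, str]]:
    """Hypothetical nightly stage: list near-duplicate pairs for manual review."""
    pairs = []
    for (id_a, body_a), (id_b, body_b) in combinations(releases.items(), 2):
        # ratio() is 1.0 for identical strings and falls toward 0.0 as they diverge
        if SequenceMatcher(None, body_a, body_b).ratio() >= threshold:
            pairs.append((id_a, id_b))
    return pairs
```

The pairwise comparison in `nightly_candidates` grows quadratically, so a sketch like this only suits one day's intake at a time, not the full archive.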
Businesses often link to their release on the site because they like us. Sometimes google likes this, sometimes not.
The news we process is reviewed by 1, 2 or 3 editors before publishing. Some of the stories are 100% unique to us. Some are from contributors who also contribute to other news sites.
Our search traffic is down by 80%. This has almost destroyed us, but I don't give up easily. As I said, I've done a lot of projects to try to fix this. Not one of them has done any good, so there is something google doesn't like and I haven't yet worked it out. A lot of people have looked and given me their ideas, and I've tried them - zero effect.
Here is an interesting and possibly important piece of information:
Most of our pages are "buried" by google. If I search, even for a headline, even if it is unique to us, quite often the page containing it will not appear in the SERP. The front page may show up, an index page may show up, another strong page may show up if that headline is in the top 10 stories for the day, but the page itself may not show up at all - UNTIL I go to the end of the results and redo the search with the "duplicates" included. Then it will usually show up, on the front page, often in position #2 or #3.
According to google, there are no manual actions against us. There are also no notices in WMT that say there is a problem that we haven't fixed.
You may tell me to just delete all of the PRs - but those are there for business readers, as they always have been. Google supposedly wants us to build websites for readers, which we have always done. What they really mean is - build it the way we want you to do it, because we know best.
What really peeves me is that there are other sites, that they consistently rank above us, that have all the same content as us, and seem to be 100% aggregators, with ads, with nothing really redeeming them as being different, so this is (I think) inconsistent, confusing and it doesn't help me work out what to do next.
Another thing we have is about 7,000+ US military stories, all the way back to 2005. We were one of the few news sites supporting the troops when it wasn't fashionable to do so. They were emailing the stories to us directly, most with photos. We published every one of them, and we still do. I'm not going to throw them under the bus, no matter what happens.
There were some duplicates, some due to screwups because we had multiple editors who didn't see that a story was already published. At one time there was also a race condition in the system code - entirely my fault, I am the programmer as well as the editor-in-chief. I believe I have fixed them all with redirects.
I haven't sent in a reconsideration for 14 months, since they said "No manual spam actions found" - I don't see any point, unless you know something I don't.
So, having exhausted all of the things I can think of, I'm down to my last two ideas.
1. Split all of the PRs off into subdomains (I'm ready to pull the trigger later this week)
2. Do what the other sites do, that I believe create little value, which is show only a headline and snippet and some related info and link back to the original page on the PR provider website. (I really don't want to do this)
3. Give up on the PRs and delete them all and lose another 50% of the income, which means releasing our remaining staff and upsetting all of the companies and people who linked to us. (Or find them all and rewrite them as stories - tens of thousands of them) and also throw all our alliances under the bus (I really don't want to do this)
There is no guarantee this is the problem, but google won't tell me, the google forums are crap, and nobody else has given me an idea that has helped.
My thought is that splitting them off into subdomains will have a number of effects.
1. Take most of the syndicated content onto subdomains, so it's not on the main domain.
2. Shake up the Domain Authority
3. Create a million 301 redirects.
4. Make it obvious to the crawlers what is our news and what is PRs
5. Make it easier for Google News to understand
Here is what I plan to do
1. Redirect all PRs to their own subdomain.
pn.domain.com for PRNewswire releases
bw.domain.com for Businesswire releases
etc
2. Fix all references so they use the new subdomain
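The mapping from an old PR URL to its new subdomain URL could be sketched like this. It is a rough illustration under assumptions: the provider keys, the function name `redirect_target` and the example path are hypothetical, and only the `pn` and `bw` prefixes and `domain.com` come from the plan above.

```python
from urllib.parse import urlsplit, urlunsplit

# Subdomain prefixes from the plan above; the provider keys are hypothetical labels.
PROVIDER_SUBDOMAINS = {
    "prnewswire": "pn",
    "businesswire": "bw",
}


def redirect_target(old_url: str, provider: str, base_domain: str = "domain.com") -> str:
    """Return the subdomain URL that old_url should 301-redirect to."""
    parts = urlsplit(old_url)
    # An unknown provider raises KeyError, so an unmapped release is never redirected silently.
    host = f"{PROVIDER_SUBDOMAINS[provider]}.{base_domain}"
    # Scheme, path and query stay unchanged, so each old URL maps to exactly one new URL.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, ""))


# Hypothetical old URL; the real path layout is not given in the post.
print(redirect_target("http://www.domain.com/releases/12345.html", "businesswire"))
```

Keeping the path identical on both sides means the same function can drive the 301 rules and the rewrite of internal references in step 2.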
Here are my questions - and I hope you may see something I haven't considered.
1. Do you have any experience of doing this?
2. What was the result?
3. Any tips?
4. Should I put PR index pages on the subdomains too? I was originally planning to keep them on the main domain, with the individual page links pointing to the actual release on the subdomain. Obviously, I want them only in one place, but there are two types of these index pages.
a) all of the releases for a particular PR company - these certainly could be on the subdomain and not on the main domain
b) Various category index pages - agriculture, supermarkets, mining, etc. These would have to stay on the main domain because they are a mixture of different PR providers.
5. Is this a bad idea?
I'm almost out of ideas. Should I add a condensed list of everything I've done already?
If you are still reading, thanks for hanging in.
-
I am ready to shout "NO" anytime I see anyone talking about using a subdomain... but you have made me consider it.
I have a site (not nearly as large as yours) that has a lot of press releases given to us by government agencies and industry sites. (These are not SEO press releases, they are from people who have a message to get out.) For a few years these were about 2/3 of the content that we published. They ranked really well - often above the original source and we added a bit of unique commentary to each of them.
In October '11 we took a Panda hit. Fortunately we simply dropped a couple positions on tons of pages - not a huge loss like some people see.
To escape we did noindex / follow to the most popular releases and threw the rest overboard. That cut off search traffic (which hit our income) but kept the content on our site for visitors. Google rankings returned partially a couple months later and then a couple months after that everything was back to normal - but income was down a little.
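For anyone unfamiliar with noindex / follow, it is usually expressed as a robots meta tag in the page head. The helper below is a minimal sketch of generating that tag. The function name `robots_meta` is hypothetical, and the post does not say whether the tag or an HTTP header was used on the site in question.

```python
def robots_meta(index: bool, follow: bool = True) -> str:
    """Build a robots meta tag, e.g. noindex/follow for releases kept only for visitors."""
    content = f"{'index' if index else 'noindex'}, {'follow' if follow else 'nofollow'}"
    return f'<meta name="robots" content="{content}">'


# A release kept on the site but withheld from the search index:
print(robots_meta(index=False))
```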
After reading your post, I am thinking about starting a subdomain for these press releases. Hopefully that will isolate the main site from any damage that the duplicate content might cause and allow them to pull a little traffic from search.
About "index pages".... If I do this I will keep them on my main site because most of my content is there (my own unique content). Since Panda I have become highly selective about which press releases I publish and they are now a minority of new content.
Thanks for the idea.
Good luck with your site.