I need help compiling solid documentation and data (if possible) that having tons of orphaned pages is bad for SEO - Can you help?
-
I spent an hour this afternoon trying to convince my CEO that having thousands of orphaned pages is bad for SEO. His argument was "If they aren't indexed, then I don't see how it can be a problem."
Despite my best efforts to convince him that thousands of them ARE indexed, he simply said "Unless you can prove it's bad and prove what benefit the site would get out of cleaning them up, I don't see it as a priority."
So, I am turning to all you brilliant folks here in Q & A and asking for help...and some words of encouragement would be nice today too
Dana
-
Agreed on all counts Jason, not to mention the improved customer experience because we won't have people landing on those God-awful ugly and useless pages!
From a server perspective, could deleting 8,000 files (pages, images, PDFs) result in our site speed improving too? Or would it likely have no impact?
-
So, per Screaming Frog, you have roughly 8,500 pages that are part of your customer experience: pages you want customers to be able to navigate to from your site and, presumably, find on Google.
But only 7,500 pages are in Google's index. So best case, roughly 1,000 of your good pages (almost 12% of all the pages on your site) don't exist in organic search. Worst case, some of those 7,500 pages in Google are deprecated pages that aren't part of your active site, making the percentage of live pages in Google even worse.
It's very possible that a portion of your Google crawl budget is being consumed by pages that don't help you. If you get those pages out of the index, you stand a better chance of getting your 1,000 good pages into the index.
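For anyone who wants to show the boss the arithmetic behind that "almost 12%", here is a quick sketch in Python. The two counts are the rounded numbers quoted above; the function name is only illustrative.

```python
def unindexed_share(site_pages: int, indexed_pages: int) -> float:
    """Best-case share of live pages that are missing from the index.

    "Best case" assumes every indexed URL is one of the live pages.
    """
    missing = max(site_pages - indexed_pages, 0)
    return missing / site_pages


# Rounded counts from this thread: ~8,500 crawlable pages, ~7,500 indexed.
share = unindexed_share(8500, 7500)
print(f"{share:.1%} of live pages are not indexed (best case)")
```

If some of the indexed URLs turn out to be orphans, the real share of live pages missing from the index is higher than this figure.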
-
Hi Jason,
Ok, here is what I saw in Screaming Frog:
27,616 total spidered URLs, of which:
- 8,494 are HTML pages
- 45 are CSS files
- 14,687 are images
- 4,287 are PDFs
Google says we have only 7,540 URLs indexed (of all types) - I know for a fact that at least 500 orphaned pages are indexed in Google. It seems to me, then, that Google is indexing content that isn't important to us, and perhaps not indexing other content that is important to us because it's having trouble telling what's important and what's not.
Any insights on that Jason? What do you make of it?
-
Hi Jason,
I'm just following up as I get my ducks in a row on this one. In your comment above you said "Google Count of Pages - Screaming Frog count of Pages = # of Orphaned Pages". To be perfectly accurate, this would only give me the number of orphaned pages that are indexed. There could be many additional orphaned pages that are not in Google's index.
My follow-up question is, should I be concerned about those too? Or are orphaned pages that aren't indexed not worth cleaning up? I think I already know the answer (Yes! Clean those up too because they can interfere with crawl rate and site speed...) but I want to know your take on it please. Thanks so much!
Dana
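One way to get at both groups (indexed orphans and non-indexed orphans) is to compare URL lists instead of page counts. A minimal sketch in Python, assuming three lists can be exported: everything that exists on the server or in the CMS, everything a link-following crawl reached, and everything known to be indexed. The sample URLs and the function name are made up for illustration.

```python
def find_orphans(all_urls, crawled_urls, indexed_urls):
    """Split orphaned URLs into indexed and non-indexed groups.

    An orphan here is any URL that exists on the site but was not
    reached by a link-following crawl.
    """
    orphans = set(all_urls) - set(crawled_urls)
    indexed_orphans = orphans & set(indexed_urls)
    return sorted(indexed_orphans), sorted(orphans - indexed_orphans)


# Hypothetical data for illustration only.
all_urls = ["/a.html", "/b.html", "/old-promo.html", "/old-spec.pdf"]
crawled_urls = ["/a.html", "/b.html"]
indexed_urls = ["/a.html", "/old-promo.html"]

indexed_orphans, hidden_orphans = find_orphans(all_urls, crawled_urls, indexed_urls)
print("Orphaned and indexed:", indexed_orphans)
print("Orphaned, not indexed:", hidden_orphans)
```

Working from lists also avoids the case where subtracting the two counts gives a negative number, which is what happens when the crawl finds more pages than Google reports as indexed.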
-
Tempting! Very tempting.:-)
-
I would not do this if I was an employee... but... I would ask him to bet me an amount that would be equivalent to about "one month's pay" on the results.
He is a chicken so he wouldn't accept that bet. And if he did accept I would want it in writing.
-
Thanks EGOL. You made me chuckle, because all of these things crossed my mind. I did go home mad yesterday, and I don't get mad very easily or very often. I usually welcome the idea of explaining SEO strategies and tactics to newbies and laypeople (as is evidenced by my many posts here in Q & A).
Let's just say - my feelers are out looking at other possibilities.
-
In my opinion, the links are still evaporating PageRank.
If some of these pages are still in the index they could be counting as thin/duplicate content.
-
What would your response be to that?
* thinks for a while *
I would be mad about this. This is why I prefer to be self-employed.
I don't know the temperament or personality of this person.
I might not be working there much longer.
It seems to me that the effort required to cut links into these pages is tiny and the potential for gain is pretty high.
Downside risk is zero. Upside opportunity is good. He is a chicken and a fool.
-
EGOL, I thought I would just follow up on these thin content "Reviews/Ratings" pages. They are blocked from Google crawling them via the robots.txt file. Is this enough? Or are they still diluting the product page's authority just by being there?
Thanks!
Dana
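As a side check, it is easy to confirm that the robots.txt rules really do block those URLs for a given crawler. A minimal sketch using Python's standard `urllib.robotparser`; the rules, paths, and domain below are hypothetical stand-ins, so substitute the site's real robots.txt and URLs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; replace with the site's real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /reviews/
Disallow: /email-a-friend/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths: one thin review page, one product page.
for path in ["/reviews/widget-123", "/products/widget-123"]:
    allowed = parser.can_fetch("Googlebot", "https://www.example.com" + path)
    print(path, "crawlable" if allowed else "blocked")
```

This only tells you whether crawling is permitted. It says nothing about whether the links pointing at those pages still dilute the product page's authority, which is the question above.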
-
Thanks EGOL,
And yes, they are.
The comment I received when trying to explain that those links were draining authority off the product pages was "No they aren't. Whatever PageRank the product page has, it has, regardless of whether the links are there or not."
What would your response be to that? I tried to explain it several different ways, but he just looked at me like I was full of malarkey...He is a visual person. Perhaps I should try a diagram?
It's difficult going into a situation like this when the opening premise in the other person's mind is that he knows more about SEO than I do, because all SEO is in his mind is a bunch of guesswork.
Sorry, morale's a bit low in my heart at the moment. I work too hard and study too hard at what I do to have someone who maybe reads a blog about SEO occasionally come in and treat me like I have no idea what I'm talking about.
Thanks very much for responding. I appreciate it mucho!
Dana
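In place of a diagram, a few numbers may help a visual person. This is a toy sketch based on the originally published PageRank formula, where each link passes on an equal share of the linking page's score. How Google weighs links today is not public, so treat this as an illustration and not as proof. The score of 1.0, the link counts, and the function name are all assumptions.

```python
def rank_passed_per_link(page_rank: float, outbound_links: int, damping: float = 0.85) -> float:
    """Rank passed through each link under the original PageRank formula.

    Each link passes damping * PR(page) / (number of outbound links).
    """
    return damping * page_rank / outbound_links


# Hypothetical product page: score 1.0, 10 useful links,
# plus 2 links to thin review/email pages.
without_thin = rank_passed_per_link(1.0, 10)
with_thin = rank_passed_per_link(1.0, 12)
wasted = 2 * with_thin

print(f"Per link without thin links:  {without_thin:.4f}")
print(f"Per link with thin links:     {with_thin:.4f}")
print(f"Passed to the two thin pages: {wasted:.4f}")
```

In this model the product page's own score is an input, so in a narrow sense the "it has what it has" comment holds. What changes is how much each useful link passes along: every link to a thin page shrinks the share going to the pages that matter.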
-
Thanks Jason,
These are great suggestions and are exactly the kinds of things that will give me the proof I need to convince him that removing these is a worthwhile endeavor. I'm off to do them now and will come back here and post my discoveries.
Dana
-
Are these those thin content, duplicate content, review and email pages?
There are links into those pages that are evaporating PageRank.
Two links on each of your product pages are being wasted.
If they are getting indexed then they are dead weight on your site and make your site look like a skimpy spammy publisher.
-
By "orphaned" do you mean pages that are no longer linked to your site navigation taxonomy?
If you know the subject matter and/or URLs, you can easily show your boss that they are indexed: Google "site:oursite.com orphaned topic" and show him all the pages in the Google index.
If you can't find the pages, then do a complete crawl of your site with Screaming Frog and see how many pages it finds. Now compare that number with how many pages Google has in its index for your site in Google Webmaster Tools (under Health -> Index Status). Google Count of Pages - Screaming Frog count of Pages = # of Orphaned Pages.
Now to see if those pages are hurting you, run them through Open Site Explorer to see if any of them have backlinks. If so, they are diluting your SEO efforts. Even if not, look at your crawl stats in Google Webmaster Tools under Health and see how many pages you're getting crawled per day. If it's a fraction of your total pages, then getting rid of the orphaned pages could get your important pages crawled more regularly.
I hope that helps.
Jason "Retailgeek" Goldberg
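The crawl-rate point can be turned into a rough back-of-the-envelope number. A sketch in Python under loud assumptions: Google crawls at a steady daily rate, spends it evenly across every URL it knows about, and all of the roughly 8,000 files mentioned for deletion elsewhere in this thread are orphans that currently get crawled. The 27,616 total is the Screaming Frog figure reported in this thread; the daily crawl rate is a made-up placeholder for the number shown in the crawl stats report.

```python
import math


def days_for_full_crawl(total_urls: int, crawled_per_day: int) -> int:
    """Days needed to crawl every URL once at a steady daily rate."""
    if crawled_per_day <= 0:
        raise ValueError("crawled_per_day must be positive")
    return math.ceil(total_urls / crawled_per_day)


linked_urls = 27616      # Screaming Frog total reported in this thread
orphaned_urls = 8000     # files proposed for deletion (assumed all orphans)
crawled_per_day = 1500   # hypothetical; read the real value from crawl stats

before = days_for_full_crawl(linked_urls + orphaned_urls, crawled_per_day)
after = days_for_full_crawl(linked_urls, crawled_per_day)
print(f"Full crawl: {before} days now, {after} days after cleanup")
```

Real crawl behavior is not this uniform, so the output is only a way to frame the potential benefit and not a prediction.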