I need help compiling solid documentation and data (if possible) that having tons of orphaned pages is bad for SEO - Can you help?
-
I spent an hour this afternoon trying to convince my CEO that having thousands of orphaned pages is bad for SEO. His argument was "If they aren't indexed, then I don't see how it can be a problem."
Despite my best efforts to convince him that thousands of them ARE indexed, he simply said "Unless you can prove it's bad and prove what benefit the site would get out of cleaning them up, I don't see it as a priority."
So, I am turning to all you brilliant folks here in Q & A and asking for help...and some words of encouragement would be nice today too
Dana
-
Agreed on all counts Jason, not to mention the improved customer experience because we won't have people landing on those God-awful ugly and useless pages!
From a server perspective, could deleting 8,000 files (pages, images, PDFs) results in our site speed improving too? Or would it likely have no impact?
-
So you have roughly 8,500 pages that are part of your customer experience and that you want customers to be able to navigate to from your site and presumably would like customers to find on Google. (from Screaming Frog).
But only 7,500 only pages are in Google's index. So best case, roughly 1,000 of your good pages (almost 12% of all the pages on your site) don't exist in organic search. Worst case, is that some of those 7,500 pages in google are depreciated pages that aren't part of your active site, making the percentage of live pages in google even worse.
It's very possible that a portion of your google crawl budget is being consumed by pages that don't help you. If you get those pages out of the index, you stand a better chance to get your 1000 good pages into the index.
-
Hi Jason,
Ok, here is what I saw in Screaming Frog:
27,616 total spidered URLs, of which:
- 8,494 are HTML pages
- 45 are CSS files
- 14,687 are images
- 4,287 are PDFs
Google says we have only 7,540 URLs indexed (of all types) - I know for a fact that at least 500 orphaned pages are indexed in Google. It seems to me, then, that Google is indexing content that isn't important to us, and perhaps not indexing other content that is important to us because it's having trouble telling what's important and what's not.
Any insights on that Jason? What do you make of it?
-
Hi Jason,
I'm just following up as I get my ducks in a row on this one. Above in your comment you said "Google Count of Pages - Screaming Frog count of Pages = # of Orphaned Pages" - to be perfectly accurate, this would only give me the number of orphaned pages that are indexed. There could be many additional orphaned pages that are not in Google's index.
My follow up question is, should I be concerned about those too? Or are orphaned pages that aren't indexed not worth cleaning up? I think I already know the answer (Yes! Clean those up too because they can interfere with crawl rate and site speed...)....but I want to know your take on it please. Thanks so much!
Dana
-
Tempting! Very tempting.:-)
-
I would not do this if I was an employee... but.... I would ask him to bet me an amount that would be equivalent to about "one month's pay" on the results.
He is a chicken so he wouldn't accept that bet. And if he did accept I would want it in writing.
-
Thanks EGOL. You made me chuckle, because all of these things crossed my mind. I did go home mad yesterday, and I don't get mad very easily or very often. I usually welcome the idea of explaining SEO strategies and tactics to newbies and laypeople (as is evidenced by my many posts here in Q & A).
Let's just say - my feelers are out looking at other possibilities.
-
In my opinion, the links are still evaporating pagerank.
If some of these pages are still in the index they could be counting as thin/duplicate content.
-
What would your response be to that?
- thinks for a while *
I would be mad about this. This is why I prefer to be self-employed.
I don't know the temperament or personality of this person.
I might not be working there much longer.
It seems to me that the effort required to cut links into these pages is tiny and the potential for gain is pretty high.
Downside risk is zero. Upside opportunity is good. He is a chicken and a fool.
-
EGOL, I thought I would just follow up on these thin content "Reviews/Ratings" pages. They are blocked from Google crawling them via the robots.txt file. Is this enough? Or are they still diluting the product page's authority just by being there?
Thanks!
Dana
-
Thanks EGOL,
And yes, they are.
The comment I received when trying to explain that those links were draining authority off the product pages was "No they aren't. Whatever PageRank the product page has, it has, regardless of whether the links are there or not."
What would your response be to that? I tried to explain it several different ways, but he just looked at me like I was full of malarkey...He is a visual person. Perhaps I should try a diagram?
It's difficult going into a situation like this when the opening premise in the other person's mind is that he knows more about SEO than I do, because all SEO is in his mind is a bunch of guesswork.
Sorry, moral's a bit low in my heart at the moment. I work too hard and study too hard at what I do to have someone who maybe read's a blog about SEO occasionally to come in and treat me like I have no idea what I'm talking about.
Thanks very much for responding. I appreciate it mucho!
Dana
-
Thanks Jason,
These are great suggestions and are exactly the kinds of things that will give me the proof I need to convince him that removing these is a worthwhile endeavor. I'm off to do them now and will come back here and post my discoveries.
Dana
-
Are these those thin content, duplicate content, review and email pages?
There are links into those pages that are evaporating pagerank.
Two links on each of your product pages are being wasted.
If they are getting indexed then they are dead weight on your site and make your site look like a skimpy spammy publisher.
-
By "orphaned" do you mean pages that are no longer linked to your site navigation taxonomy?
If you know the subject matter and/or URLs, you can easy show your boss that they are indexed: Google "site:oursite.com orphaned topic" and show him all the pages in the google index.
If you can't find the pages, then do a complete crawl of your site with Screaming Frog and see how many pages it finds. Now compare that number with how many pages Google has in your index in Google Webmaster Tools (under Health -> Index Status). Google Count of Pages - Screaming Frog count of Pages = # of Orphaned Pages.
Now to see if those pages are hurting you, run them through Open Site Explorer to see if any of them have backlinks. If so, they are diluting your SEO efforts. Even if not, look at your crawl stats in Google Webmaster tools under Health and see how many pages you're getting crawled per day. If it's a fraction of your total pages, then if you got rid of the orphaned pages, you could be getting your important pages crawled more regularly.
I hope that helps.
Jason "Retailgeek" Goldberg
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Bay Area E-Commerce SEO Firm Needed
So my e-commerce site recently got hit badly with the latest Penguin update. Traffic is down by 60%. We were using a cheap Indian SEO firm who did get us great results but it seems they was a lot more spamming than I realized. I am now looking to clean up my backlinks and create a new relationship with a local business so I can be more hands on with my SEO. Does anyone have any recommendations for SEO firms that have experience in e-commerce? Ideally somewhere in the Bay Area or even Sacramento?
Technical SEO | | premierchampagne0 -
Will it make any difference to SEO on an ecommerce site if they use their SSL certificate (https) across every page
I know that e-commerce sites usually have SSL certificates on their payment pages. A site I have come across is using has the https: prefix to every page on their site. I'm just wondering if this will make any difference to the site in the eyes of Search Engines, and whether it could effect the rankings of the site?
Technical SEO | | Sayers1 -
Home Page Not Ranking... Need an EXPERT
Hi, for about a year now, our home page seems to have been "removed" from Google (except for our branded keywords). The home page is clearly indexed, but none of the keywords for the home page rank in the top 500 (again, except for our brand, trophycentral). Other pages in the site are fine and we rank in the top 10 positions for hundreds of keywords. We are also okay on Yahoo and Bing. So the issue is the home page from non-branded words. There have been no manual penalties, but since the words, such as trophies and trophies and awards are all but gone, something is going on. It has been almost exactly a year now and I have tried everything from removing backlinks (although I still can't tell which are "bad links", to changing titles, to improving speed and structure. I have had quite a few really nice people try to help, but their suggestions don't seem to work or are too vague. I really need an expert as the impact to the business of not having a home page performing is very damaging. Thank you!!!
Technical SEO | | trophycentraltrophiesandawards0 -
My homepage+key pages have dropped 40+ positions after implementing redirects and canonical changes. HELP!
Hi SEOMozers, I work for a web based nonprofit at www.tisbest.org. I had a professional contact recommend that we work on our redirects to our homepage because we were losing valuable rank benefit. This combined with getting sick of seeing our weekly SEOMoz crawl reports show 304 duplicate page and title errors for months. No one could seem to figure out what was happening (we think it had to do with session stuff; we were seeing several versions of each page showing the following: www.tisbest.org/default.aspx/(random character string) My developer and I read a bunch of articles and started making changes 10 days ago: He setup 301 redirects from http://tisbest.org to http://www.tisbest.org. (set the canonical domain). We did a redirect from http://www.tisbest.org/default.aspx to root with "/". I set the canonical setting to www.tisbest.org in our webmaster tools. In our web config (we're running in asp.net), we changed our session detection from auto-detect then saw some session funkiness so we changed it back. Though we do think the character strings we were seeing were session GUID. He forced lower case URL’s to reduce duplicate page content/titles. I got my weekly crawl report 9 days ago and we had dropped from 340 duplicate page title and page content errors went to one. We went nuts and felt like the kings of SEO. Then, yesterday (9/28), the SEO grim reaper came knocking when I received my weekly SEOMoz ranking report. It said we had dropped 40+ spots for all of 9 of our keywords. Sure enough, I searched our keywords and our website was gone. Then I searched our company name, tisbest, and only a few of our pages show but not the homepage. I searched for our URL www.tisbest.org, and I originally got the expanded view (with 8 links to various webpages - can't remember what this view is called) but now, today (Saturday), the expanded view is gone from this search result. Also, when I run the On Page Report card for our homepage, I get the following error message with no results: "We were unable to grade that page. The page did not load. Curl::Err::TooManyRedirectsError: Number of redirects hit maximum amount." When I run the Open Site explorer report, I get this message at the top: Oh Hey! It looks like that URL redirects to www.tisbest.org/?AspxAutoDetectCookieSupport=1. Would you like to see data for <a class="clickable redirects">that URL instead</a>?" If I go to the report for the that report's page, it says that "No information is available for that URL." Just tonight (night of 9/29), our developer added the rel="canonical" href="http://www.tisbest.org" /> to our homepage tonight to see if that would help. We did not do that originally. In our Google Webmaster tools, I am seeing the number of URL Error - Not Followed has sky rocked. I have attached a screen capture to this thread. There are also a large number of URL Errors - Not Found errors as well. I did some research tonight and downloaded and ran Screaming Frog SEO Crawler. I have attached a screen capture below with this report and a couple of questions I sent our developer that may be helpful to you. Also, not sure if this is relevant, we use a master page that all of our pages inherit from so all of our pages get the same meta-data: name="keywords" content="charitable gift card, charitable gift certificate, non profit gift card, charity donation, giftcard, charity gift card, donation gift card, donation gift, charity gift, animal gift card, animal gift, environmental gift card, environmental gift, humanitarian gift card, humanitarian gift, christian gift card, christian gift, catholic gift card, catholic gift, religious gift card, religious gift" />id="ctl00_metaDescription" name="description" content="Award winning Charity Gift Card, for over 250 premier charities. A customized donation gift that makes the world better. TisBest is BBB Accredited." />name="google-site-verification" content="EfJIhN3h2SVSXdSpUbfceBVw2q6zrGX8rRQhdNZ1xY8" /><title></span><span> </span></p> <p>Can anyone help me/us identify the issue that obliterated our rankings? I am happy to give an information needed. Thank you! Chad Edwards</p> <a download="Bqcu1.png" class="imported-anchor-tag" href="http://i.imgur.com/Bqcu1.png" target="_blank">Bqcu1.png</a> <a download="ZXQ8d.png" class="imported-anchor-tag" href="http://i.imgur.com/ZXQ8d.png" target="_blank">ZXQ8d.png</a></title>
Technical SEO | | TisBest0 -
Why isn't Google pushing my Schema data to the search results page
I believe we have it set up right. I'm noticing all my competitors schema data is showing up which is really giving them a leg up on us. We have a high ranking website so I'm just not sure why it's now showing up. Here is an example URL http://www.airgundepot.com/3576w.html I've used the Google webmaster tools tester and it all looks fine. Any ideas? Thanks in advance.
Technical SEO | | AirgunDepot0 -
SEO MOZ report showing duplicate content pages with without ending /
Hello the SEOMOZ report is showing me I have a lot of duplicate content and then proceeds listing almost every page on my site as showing with a URL with an ending "/" and without. I checked my sitemap and only one version is there, the one with "/". I have a Wordpress site. Any recommendations ? Thanks.
Technical SEO | | dpaq20110 -
Well, I need some help, advice, something.
Hey all, I'm new to the SEOmoz thing but I like it so far. I think I have my site listing so messed up that it's effecting my rank. I have 3 domains. 1.) rt112media.com 2.) route112media.com 3.) route112.net. Each domain was purchased through GoDaddy.com and still remain there. I have my own hosting account which I was registered as rt112media.com with route112media.com and route112.net listed as add on domains. Technically, I would like for my main site to be route112media.com for everything. However when I registered the site as rt112media.com I didn't know the issues I would have as far as different domains so I registered with rt112media.com as my main domain name. Anyways, as of now I have rt112media.com as my main domain through my cpanel hosting.I have both domains route112media.com and route112.net set for 301 wildcard redirects to rt112media.com on my hosting account and my GoDaddy account. When I started my WMT account I didn't really know which domain to use cause I figured I could link them all to one. So, I signed up as routet12media.com. After a little while I realized it was not recieving anything because everything was being redirected to rt112media.com Anyways both addresses have been crawled and indexed so they are showing as two. So, I requested to change the route112media.com address to rt112media.com in WMT. That was about 2 weeks ago and it is still pending request. I'm not having further problems with WMT because of the www.rt112media.com vs http://rt112media.com. I am the verified owner of both but I can not switch the www.rt112media account to show the non www. account as the main one because I have the other pending. My site is still being crawled as 2 versions rt112media.com and route112media.com. So what is my best option? And what would be the worst cause scenario if I wanted to start completely over using route112media.com as my main domain with hosting and all. Sorry this was so long I just wanted to explain my situation. I'm lost. Any advice would be appreciated! http:/rt112media.com
Technical SEO | | Route112Media0 -
If two links from one page link to another, how can I get the second link's anchor text to count?
I am working on an e-commerce site and on the category pages each of the product listings link to the product page twice. The first is an image link and then the second is the product name. I want to get the anchor text of the second link to count. If I no-follow the image link will that help at all? If not is there a way to do this?
Technical SEO | | JordanJudson0