Adding Orphaned Pages to the Google Index
-
Hey folks,
How do you think Google will treat adding 300K orphaned pages to a 4.5 million page site? The URLs would resolve, but there would be no on-site navigation to those pages; Google would only know about them through sitemap.xml files.
These pages are super low competition.
The plot thickens: what we are really after is getting 150k real pages back on the site. These pages do have crawlable paths on the site, but in order to do that (for technical reasons) we need to push these other 300k orphaned pages live (it's an all-or-nothing deal).
a) Do you think Google will have a problem with this, or will it just decide not to index some or most of these pages since they are orphaned?
b) If these pages will just fall out of the index or not get included, and have no chance of ever accumulating PR anyway since they are not linked to, would it make sense to just noindex them?
c) Should we not submit sitemap.xml files at all, take our 150k, ignore these 300k, and hope Google ignores them as well since they are orphaned?
d) If Google is OK with this, maybe we should submit the sitemap.xml files and keep an eye on the pages. Maybe they will rank and bring us a bit of traffic, but we don't want to do that if it could be an issue with Google.
Thanks for your opinions, and if you have any hard evidence either way, special thanks for that info.
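For anyone weighing option (b), here is a rough sketch of one way to serve noindex without touching page templates: send an `X-Robots-Tag: noindex` response header for the orphaned paths. The function name `robots_headers` and the `orphaned_paths` set are made up for illustration, and how you hook this into your own server is up to your stack. Keep in mind the pages must stay crawlable (not blocked in robots.txt), otherwise Google never sees the directive.

```python
def robots_headers(path, orphaned_paths):
    """Return extra HTTP response headers for a request path.

    `orphaned_paths` is a hypothetical set of URL paths that have no
    on-site links. Those get a noindex directive; everything else is
    served with no extra header.
    """
    if path in orphaned_paths:
        # Asks search engines not to index this URL.
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    orphaned = {"/orphan/1", "/orphan/2"}
    print(robots_headers("/orphan/1", orphaned))
    print(robots_headers("/real/page", orphaned))
```

A `<meta name="robots" content="noindex">` tag in the page head is the other common way to do the same thing.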
-
It's not a strategy, it's due to technical limitations on the dev side. I agree though, thanks.
So, I put this question to a very advanced SEO guru, and he said the pages could be seen as doorways and present some risk, and he advised against it. That, combined with the probability that they will get dropped from Google's index anyway, and the fact that Google says it wants pages to be part of the site's architecture, has me leaning towards noindexing all of them and maybe experimenting with allowing 1,000 to get indexed to see what happens with them.
Thanks for your input, folks.
-
I'd go back to the drawing board and rework your strategy.
Do you need additional sites? 150K orphaned pages you want indexed sounds like spam or poor site architecture to me.
-
Yikes, I didn't know the site was that big. Still, if you're afraid of how Google would "react" to those orphaned pages, I'd still test small, regardless of how large your overall site is.
-
Yeah, 1,000 is probably a big enough sample.
10,000 seems like a lot, I guess, but not when you've got a site with 4.5 million pages.
-
Yeah, submitting sitemap.xml files for 300k pages that are not part of the site seems a bit obnoxious.
-
We definitely want the 150k in the index since they are legitimate pages and linked to on the site. It's the 300k orphaned ones we have to take along as a package deal that I am worried about. That seems like too many orphaned pages for Google.
-
That's a good idea. 10,000 is still a lot. You could even test fewer than 10,000 pages. Why not try 1,000?
-
Hmmm. I am leaning towards the following solution since I would rather be on the cautious side. Maybe this makes sense?
a) we noindex these 300k orphaned pages and do not submit sitemap.xml files
b) we experiment with say 10,000 pages and we allow only those to get indexed and submit sitemap.xml files for them
c) we closely monitor their indexing and ranking performance so we can determine if these are even worth opening up to Google and taking any risk.
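Steps (a) and (b) could be sketched roughly like this: pick a random test sample from the orphaned URLs, write a sitemap.xml for the sample only, and keep everything else on the noindex list. The function names and the example URLs are hypothetical; the only outside facts assumed are the standard sitemap namespace and the 50,000-URL limit per sitemap file, so a 10,000-page sample fits in a single file.

```python
import random
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def split_for_test(orphaned_urls, sample_size, seed=0):
    """Split orphaned URLs into (test_urls, noindex_urls).

    test_urls is a random sample to open up to indexing; noindex_urls
    is everything else. A fixed seed keeps the sample reproducible.
    """
    urls = sorted(set(orphaned_urls))
    sample_size = min(sample_size, len(urls))
    chosen = set(random.Random(seed).sample(urls, sample_size))
    test_urls = [u for u in urls if u in chosen]
    noindex_urls = [u for u in urls if u not in chosen]
    return test_urls, noindex_urls


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    orphaned = [f"https://www.example.com/orphan/{i}" for i in range(300)]
    test_urls, noindex_urls = split_for_test(orphaned, sample_size=10)
    print(len(test_urls), "pages in the test sitemap")
    print(len(noindex_urls), "pages left noindexed")
    print(build_sitemap(test_urls)[:120])
```

Step (c), the monitoring, would then only need to watch the URLs in `test_urls`.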
-
In my opinion, add the 150k pages to the sitemap along with the 300k pages and let Google index all of them. Once they are all indexed, you can take a call on de-indexing the 150k pages based on their traction.
-
I have no hard evidence, but if it were my site, I would do option C but keep an eye on what happens, and if I noticed anything strange happening, I would implement option B. But if option C makes you nervous, I see no reason you couldn't or shouldn't noindex them right off the bat.
That's merely one person's opinion, however.