Will Google Recrawl an Indexed URL Which is No Longer Internally Linked?
-
We accidentally introduced Google to our incomplete site. The end result: thousands of pages indexed which return nothing but a "Sorry, no results" page. I know there are many ways to go about this, but the sheer number of pages makes it frustrating.
Ideally, in the interim, I'd love to 404 the offending pages and allow Google to recrawl them, realize they're dead, and begin removing them from the index. Unfortunately, we've already removed from our site the internal links that led to this premature indexation.
So my question is, will Google revisit these pages based on their own records (as in, this page is indexed, let's go check it out again!), or will they only revisit them by following along a current site structure?
We are signed up with WMT if that helps.
-
On larger sites we often run into two things: 1) there are still internal links to those pages from old blog posts and the like. You have to really scrub your site to find those and update them manually. I only mention this because unless you have crawled the site with a tool and gone over the results with a fine-toothed comb, you might be surprised by the links you missed. 2) There are still external links to those pages. That said, even if neither 1 nor 2 applies, Google will still recrawl the pages (although not as often). Google assumes that any initial 404, or even a 301, may be a temporary error, so it checks back. I have seen Google still ping URLs that we removed over a year ago. They really hang onto stuff. I have not gone as far as 301ing to a directory that I then deindex; generally I just watch for the URLs to show up in Webmaster Tools and then fall out, and then I move on.
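For the "scrub your site" step, here is a minimal sketch of how leftover internal links could be found once you have the HTML of your pages. It assumes you already have a crawl saved as a mapping of page URL to HTML source; the function name `find_links_to_removed` and the example URLs are made up for illustration, not part of any particular crawling tool.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_links_to_removed(pages, removed_paths):
    """Return {page_url: [removed paths that page still links to]}.

    pages: dict of page URL -> HTML source (hypothetical crawl output).
    removed_paths: set of URL paths that should no longer be linked.
    """
    offenders = {}
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        page_host = urlparse(page_url).netloc
        hits = []
        for href in collector.hrefs:
            # Resolve relative links against the page they appear on.
            parsed = urlparse(urljoin(page_url, href))
            is_internal = parsed.netloc == page_host
            if is_internal and parsed.path in removed_paths and parsed.path not in hits:
                hits.append(parsed.path)
        if hits:
            offenders[page_url] = hits
    return offenders
```

Running it over a saved crawl gives you a list of old posts to edit by hand. Links on other hosts are skipped here, so external links (point 2) would need a backlink report instead.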
-
Right, but having lots of 404s that are still indexed probably isn't good for your site in general. If you wanted them de-indexed, 301ing them to a new folder and filing a single removal request for that entire directory would probably work.
Thanks for the help. I've heard from a few people that they will recrawl these pages again even if nothing is linking to them. That's reassuring. Thanks all.
-
No reason other than finding all those 404 pages and doing individual URL removals for each isn't a very productive task. 404s generally have no impact on search rankings.
-
Interesting. Any reason why you haven't simply filed a removal request? I feel that if there are too many to do manually, you could 301 them to a specific directory and then remove that directory all at once.
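As a rough sketch of the redirect idea, the mapping could look something like this. The directory name `/removed` and the helper `redirect_target` are hypothetical, and how you wire this into your server depends entirely on your stack; this only shows the mapping from a dead URL to its new location.

```python
from urllib.parse import urlparse

REMOVAL_DIR = "/removed"  # hypothetical directory you would later ask to have removed


def redirect_target(url, dead_paths):
    """Return (301, new_path) for a dead URL, or None if the URL should be served normally.

    dead_paths: set of URL paths known to be empty "no results" pages (assumed input).
    """
    path = urlparse(url).path
    if path in dead_paths:
        # Keep the original path under the removal directory so each URL stays distinct.
        return 301, REMOVAL_DIR + path
    return None
```

Everything then lives under one prefix, which is what makes a single directory-level removal request possible.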
-
Hi Martijn,
Thanks for the response. I must apologize as I left out an important detail. While our pages are "No results" and basically useless to the user, they're not actually 404'd pages. They're live, valid pages that basically offer nothing.
As I stated earlier, 404'ing them would be ideal for us if we could be sure Google would recrawl them. I am hesitant because I'm not sure Googlebot will recrawl internal pages that are no longer linked. Our deeper pages like these have not been recrawled yet, so I'm a bit unsure how likely it is that they will be.
I guess I should just go ahead and 404 all of them now and see what happens, since it can't hurt. Just curious about Googlebot in general since it always helps to know more!
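For anyone in the same spot, the change amounts to having the "no results" template send a 404 status instead of a 200. A minimal sketch, assuming a handler that knows how many results it found; `render_listing` and its return shape are invented for illustration and would need adapting to whatever framework the site runs on.

```python
def status_for_listing(result_count):
    """Pick the HTTP status line for a listing page based on how many results it has."""
    return "200 OK" if result_count > 0 else "404 Not Found"


def render_listing(results):
    """Return (status, headers, body) for a listing page (hypothetical handler shape)."""
    status = status_for_listing(len(results))
    if results:
        body = "\n".join(results)
    else:
        # Same message the user sees today, but now served with a 404 status.
        body = "Sorry, no results"
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    return status, headers, body
```

The visitor still sees the friendly message; only the status code changes, which is the signal a crawler reads.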
-
Don't count on Google dropping those 404ing pages from the index any time soon. We have pages that have 404d for over a year and they're still in the index.
-
They'll eventually drop these pages as they already know where to find them and as they give the proper 404 header they know that's a sign to drop them. In most cases pages that 404 are already not linked from any other pages so that will also be a sign to search engines that the specific pages aren't important anymore.