Googlebot crawling partial URLs

panini

Hi guys,

I've checked my email this morning and I've got a number of 404 errors over the weekend where Google has tried to crawl some of my existing pages but not found the full URL.

Instead of hitting 'domain.com/folder/complete-pagename.php' it's hit 'domain.com/folder/comp'.

This is definitely Googlebot/2.1; http://www.google.com/bot.html (66.249.72.53), but I can't find where it would have picked up only the partial URL. It certainly wasn't on the domain it's crawling, and I can't find any links from external sites pointing to us with the incorrect URL. Googlebot is doing the same thing across a single domain but in different sub-folders.

Having checked Webmaster Tools, there aren't any hard 404s, and the soft ones aren't related and haven't occurred since August. I'm really confused as to how this is happening.

Thanks!

Improvements

This is why I love this forum. We recently started seeing these URLs in our GWT report. We have hundreds of truncated URLs that end in "..." and go nowhere. We can't figure out where they are coming from. We thought it could be G's relatively new privacy policy w/ not passing along the data, but we're not sure. Anyone have any thoughts on that?

Thanks!

panini

@vitalscom - it's at least good to know someone else has experienced this!

Due to the volume I don't consider doing 301s a permanent solution. Fortunately there is a noindex on our 404 page so Google et al shouldn't take these errors into consideration.

irvingw

I'm seeing it too. It looks like it's coming from Superpages, but the truncated URLs there are not actually hyperlinks, so why Google is following them is a good question.

http://swbd-out.superpages.com/webresults.htm?qkw=Find+A+Physician&qcat=web

I'm fixing this on my end with a mod_rewrite rule in .htaccess. All of my sites' truncated URLs end in either ".." or "...", so any URL that ends in one of those gets 301 redirected to the homepage.
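
Before adding a redirect like that, it may help to check which of the 404 URLs it would actually catch. Below is a minimal Python sketch of the same matching idea (a path ending in two or more dots), not the actual .htaccess rule. The names `is_truncated` and `redirect_target` are hypothetical, and the homepage target "/" is an assumption. Note that a URL truncated without trailing dots, like the 'domain.com/folder/comp' example at the top of the thread, would not be matched by this kind of rule.

```python
import re
from typing import Optional

# Hypothetical pattern: a path whose last segment ends in two or more
# literal dots, e.g. "/folder/comp.." or "/folder/comp...".
# The character before the dots must not be "/" or ".", so a bare
# dot-segment such as "/folder/.." is left alone.
TRUNCATED = re.compile(r"[^/.]\.{2,}$")


def is_truncated(path: str) -> bool:
    """Return True if the path looks like an ellipsis-truncated URL."""
    return bool(TRUNCATED.search(path))


def redirect_target(path: str, home: str = "/") -> Optional[str]:
    """Return the 301 target for a truncated path, or None to leave it alone.

    Sending everything to the homepage is an assumption taken from the
    description above; a closer matching page may be a better target.
    """
    return home if is_truncated(path) else None


if __name__ == "__main__":
    for p in ["/folder/comp...", "/folder/comp..", "/folder/comp",
              "/folder/complete-pagename.php"]:
        print(p, "->", redirect_target(p))
```

Running the paths from a 404 report through `is_truncated` shows how many errors the redirect would cover and how many would still need another fix.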
