Googlebot crawling partial URLs
-
Hi guys,
I checked my email this morning and found a number of 404 errors from over the weekend where Google tried to crawl some of my existing pages but requested only part of the URL.
Instead of hitting 'domain.com/folder/complete-pagename.php' it hit 'domain.com/folder/comp'.
This is definitely Googlebot/2.1; http://www.google.com/bot.html (66.249.72.53) but I can't find where it would have picked up only the partial URL. It certainly wasn't on the domain it's crawling, and I can't find any links from external sites pointing to us with the incorrect URL. Googlebot is doing the same thing across a single domain but in different sub-folders.
Having checked Webmaster Tools, there aren't any hard 404s, and the soft ones aren't related and haven't occurred since August. I'm really confused as to how this is happening.
Thanks!
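For anyone who wants to double-check that hits like these really come from Googlebot rather than something spoofing the user agent, the usual approach is a reverse DNS lookup on the IP followed by a forward lookup on the returned hostname. Below is a minimal Python sketch of that check. The function name and the injectable lookup parameters are made up for illustration, and it assumes the reverse hostname should end in googlebot.com or google.com.

```python
import socket
from typing import Callable, List


def is_verified_googlebot(
    ip: str,
    reverse_lookup: Callable[[str], str] = lambda addr: socket.gethostbyaddr(addr)[0],
    forward_lookup: Callable[[str], List[str]] = lambda host: socket.gethostbyname_ex(host)[2],
) -> bool:
    """Return True if `ip` reverse-resolves to a Google crawler hostname
    and that hostname resolves back to the same IP."""
    try:
        # Step 1: reverse DNS lookup on the IP from the access log.
        host = reverse_lookup(ip)
        # Step 2: the hostname must belong to one of Google's crawler domains.
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # Step 3: forward lookup must return the original IP.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror / socket.gaierror when a lookup fails.
        return False
```

Calling `is_verified_googlebot("66.249.72.53")` with the default lookups does live DNS queries, so the result depends on the network it runs from.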
-
This is why I love this forum. We recently started seeing these URLs in our GWT report. We have hundreds of truncated URLs that end in "..." and go nowhere, and we can't figure out where they are coming from. We thought it could be related to G's relatively new privacy policy of not passing along the data, but we're not sure. Anyone have any thoughts on that?
Thanks!
-
@vitalscom - it's at least good to know someone else has experienced this!
Due to the volume I don't consider doing 301s a permanent solution. Fortunately there is a noindex on our 404 page so Google et al shouldn't take these errors into consideration.
-
I'm seeing it too. It looks like it's coming from Superpages, but the truncated URLs there are not actually hyperlinks, so why Google is following them is a good question.
http://swbd-out.superpages.com/webresults.htm?qkw=Find+A+Physician&qcat=web
I'm fixing this on my end with a mod_rewrite rule in .htaccess. All of the truncated URL problems on my sites end in either ".." or "...", so any URL that ends in one of those two patterns gets 301 redirected to the homepage.
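To make the matching logic concrete, here is a rough sketch in Python rather than the actual .htaccess rule, which isn't shown here. The function name, the regex, and the "/" homepage target are assumptions for illustration. It treats a path as truncated when it ends in two or three dots directly after a normal character, so a bare "/.." parent-directory segment is left alone.

```python
import re
from typing import Optional

# Assumed pattern: a non-dot, non-slash character followed by exactly
# two or three dots at the very end of the path.
TRUNCATED_SUFFIX = re.compile(r"[^./]\.{2,3}$")


def redirect_target(path: str, home: str = "/") -> Optional[str]:
    """Return the 301 target for a truncated path, or None to serve it normally."""
    if TRUNCATED_SUFFIX.search(path):
        return home
    return None


if __name__ == "__main__":
    for p in ("/folder/comp...", "/folder/comp..", "/folder/complete-pagename.php"):
        print(p, "->", redirect_target(p))
```

Sending everything to the homepage is the simplest option; if the truncated prefix maps to exactly one real page, redirecting there instead would be a closer match.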