What might make Bingbot find a URL that looks like this on our site?
-
I have been doing something Richard Baxter recently suggested and reviewing our server logs.
I have found an oddity that hopefully some of you smart Mozzers can help me figure out.
Here is the line from the server log (there are many more like this):
157.55.32.166 - - [04/Mar/2013:08:00:59 -0800] "GET /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones HTTP/1.1" 200 94133 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)" "-"
See how www.ccisolutions.com appears after /StoreFront/category/? We used to see weird URLs like this reported in GWT, but ever since we changed our canonical tags from relative to absolute URLs, they have no longer appeared in our Webmaster Tools reports.
However, it seems there is still a problem. Where/how could Bingbot be seeing URLs configured this way? Could it be a server issue, or is it most likely a data problem?
Thanks in advance!
Dana
P.S. Could this be resulting from our massive use of relative URLs all over the site?
-
Hi Streamline,
I thought I would circle back and update everyone on what I found. You were correct that malformed URLs were the cause of this problem. We have many isolated instances of internal links where the relative URL is missing the "/" at the beginning, and the relative URLs are inconsistent all over the site. It's certainly an example of one of the many problems that can be caused by using relative rather than absolute URLs.
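For anyone wanting to hunt down links like these, here is a minimal sketch in Python that flags href values which are neither absolute nor root-relative. The sample HTML and the helper names (HrefCollector, suspicious_hrefs) are made up for illustration, and the list of "acceptable" prefixes is an assumption; document-relative links are sometimes intentional, so treat the output as candidates to review rather than confirmed errors.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def suspicious_hrefs(html):
    """Return hrefs that do not start with a scheme, '/', or another known-safe prefix."""
    parser = HrefCollector()
    parser.feed(html)
    # Assumed set of prefixes that cannot produce a doubled path.
    ok_prefixes = ("http://", "https://", "//", "/", "#", "?",
                   "mailto:", "tel:", "javascript:")
    return [h for h in parser.hrefs if not h.lower().startswith(ok_prefixes)]


# Hypothetical markup showing two good links and two problem links.
sample_html = """
<a href="/StoreFront/category/shure-se-earphones">root-relative</a>
<a href="http://www.ccisolutions.com/StoreFront/category/shure-se-earphones">absolute</a>
<a href="www.ccisolutions.com/StoreFront/category/shure-se-earphones">missing scheme</a>
<a href="StoreFront/category/shure-se-earphones">missing leading slash</a>
"""

flagged = suspicious_hrefs(sample_html)
for href in flagged:
    print(href)
```

Running this over saved copies of template pages should surface both kinds of problem link: the ones missing "http://" and the ones missing the leading "/".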
Since we are in the process of completely re-doing the site and moving to a new platform, it's something we can definitely work to get right during the transition.
Thanks again to you, Daniel and Keri for jumping in with answers.
Dana
-
Thanks to you both Daniel and Streamline.
I believe the problem may have to do with our .htaccess file. I am obtaining a copy of it now.
-
Thanks Keri. That's very helpful. I will do that.
-
Hi Dana,
I agree with Streamline: there will be a hidden issue in your site where something is attempting to connect to a malformed link (a URL missing 'http://'). Given there are a number of them in one day, I would guess this is happening in a templated page.
Have a look at;
It renders as a page.
The best course of action would be to resolve it at the source. If you can pinpoint when this issue is due to occur next, have your developer get each page to append its URL to the log at the beginning of the page. Then you should be able to determine where the issue is occurring. I am hoping you will see a discernible pattern.
Worst-case scenario, possibly a canonical will work, OR create a regex redirect to handle this URL pattern in .htaccess...
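To illustrate the kind of pattern such a redirect would need to match, here is a rough sketch in Python rather than an actual .htaccess rule. The regular expression and the canonical_path helper are assumptions made for illustration; any real rewrite rule would have to be written and tested against your own server configuration.

```python
import re

# Matches a path where the hostname has been embedded after zero or more
# leading directories, and captures everything after the hostname.
DOUBLED = re.compile(r"^/(?:.*/)?www\.ccisolutions\.com(?P<rest>/.*)$")


def canonical_path(path):
    """Return the intended path for a doubled URL, or None if the path looks normal."""
    match = DOUBLED.match(path)
    return match.group("rest") if match else None


bad = "/StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones"
good = "/StoreFront/category/shure-se-earphones"

print(canonical_path(bad))
print(canonical_path(good))
```

A 301 redirect from the matched path to the captured remainder would send both bots and users to the correct page, though fixing the links at the source is still the better fix.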
Hope this helps,
Dan
-
Dana, you might also want to contact Bing at https://support.discoverbing.com/eform.aspx?productKey=bingwebmaster&ct=eformts&scrx=1. I sent a quick note on Twitter to Duane Forrester and that's the URL he provided.
-
Can you tell from which page Bing is trying to access these URLs? And it only happened on the 4th and not on any other day? Could it be an issue with the sitemap on that day?
I'm looking at your site now and the page http://www.ccisolutions.com/StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones is returning a 200 response code to me, not a 404 code. The key is to figure out how Bing discovered the URL in the first place...
-
While this is certainly a possibility, I'm not sure it's the cause of the problem. If this were the case, wouldn't it most likely cause a 404 error, instead of rendering the proper page (albeit with a very funky URL) and a 200 status code?
The other thing making me think it's not just a poorly constructed link on the site is that there are over 100 of these in the server log, from just one day.
Thoughts?
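One way to get a firmer count, and to see which days and referrers are involved, is to scan the access log for requests whose path contains the hostname. Below is a minimal Python sketch; the regular expression assumes the field layout of the sample log line quoted in this thread, and the function name doubled_host_hits is made up for illustration.

```python
import re
from collections import Counter

# Pattern for a combined-style access log line; layout assumed from the sample line.
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<day>[^:]+):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

HOST = "www.ccisolutions.com"


def doubled_host_hits(lines, host=HOST):
    """Yield parsed fields for requests whose path contains the hostname."""
    for line in lines:
        match = LOG_RE.match(line)
        if match and host in match.group("path"):
            yield match.groupdict()


# In practice, pass an open log file instead of this one-line sample.
sample = [
    '157.55.32.166 - - [04/Mar/2013:08:00:59 -0800] "GET /StoreFront/category/'
    'www.ccisolutions.com/StoreFront/category/shure-se-earphones HTTP/1.1" '
    '200 94133 "-" "Mozilla/5.0 (compatible; bingbot/2.0; '
    '+http://www.bing.com/bingbot.htm)" "-"',
]

hits = list(doubled_host_hits(sample))
per_day = Counter(hit["day"] for hit in hits)
referers = Counter(hit["referer"] for hit in hits)
print(per_day)
print(referers)
```

If the referrer is "-" for every hit, as it is in the sample line, the log alone will not reveal which page carries the link, and the pages themselves would need to be checked.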
-
I'm willing to bet that on some page of your site, there is a link pointing to www.ccisolutions.com/StoreFront/category/shure-se-earphones which is missing the "http://" at the beginning. So if Bing or a user tried to click on that link, they would be directed to /StoreFront/category/www.ccisolutions.com/StoreFront/category/shure-se-earphones instead of the correct link.
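To show why a missing "http://" produces exactly this URL, here is a small Python sketch using the standard library's urljoin, which follows the same relative-reference resolution rules a browser or crawler applies. The base page URL is a hypothetical example; any page under /StoreFront/category/ would give the same result.

```python
from urllib.parse import urljoin

# Hypothetical page that contains the malformed link.
base = "http://www.ccisolutions.com/StoreFront/category/some-page"

# Without a scheme, the href is treated as a relative path, not a hostname.
bad_href = "www.ccisolutions.com/StoreFront/category/shure-se-earphones"
good_href = "http://www.ccisolutions.com/StoreFront/category/shure-se-earphones"

resolved_bad = urljoin(base, bad_href)
resolved_good = urljoin(base, good_href)

print(resolved_bad)
print(resolved_good)
```

The first result is the doubled URL from the server log; the second is the intended page.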