BingPreview/1.0b User Agent Adding a Trailing Slash to All URLs
-
The BingPreview crawler, which I think exists to take snapshots of mobile-friendly pages, crawled my pages last night for the first time. However, it is adding a trailing slash to the end of each of my dynamic URLs, so my program serves the wrong page, since it is not expecting a trailing slash at the end of the URL. It was only 160 pages last night, but I have thousands of pages it could do this to.
I could try a mod_rewrite rule, but that seems like it should be unnecessary: all the other crawlers are crawling the proper URLs, and none of my hyperlinks have a slash on the end. I have written to Bing to report the problem.
Is anyone else having this issue? Any other suggestions for what to do?
The user agent is: Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1 (KHTML, like Gecko) Version/7.0 Mobile/11A465 Safari/9537.53 BingPreview/1.0b
-
Will do. Forgot to mention that Bing is looking into it, but for the reasons you mentioned I am still going to do the 301s. Thanks again.
-
Sounds like a plan. I'd also make every redirect a 301, just in case. Cheers.
-
Thanks for your reply, Cyrus. Wow, so much to learn.
I will put in a mod_rewrite rule to strip the trailing slash and 301 to the resulting URL, because otherwise all the trailing-slash URLs resolve to a different page, basically a 'no-product' page and the like.
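Something along these lines is what I have in mind (a minimal .htaccess sketch, assuming Apache with mod_rewrite enabled; the conditions would need adjusting for my actual setup):

```
RewriteEngine On

# Leave real directories alone so their index URLs keep the slash
RewriteCond %{REQUEST_FILENAME} !-d

# Permanently redirect any path ending in a slash to the no-slash version
RewriteRule ^(.+)/$ /$1 [R=301,L]
```

Query strings should carry over untouched, since mod_rewrite passes them through by default.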
These are all dynamically generated pages, so I think as long as I redirect to the 'proper' no-slash version, I won't need to worry about anything else, like a rel=canonical tag, because there won't be any duplicate content.
Does that sound right to you?
-
On one hand, I'd agree with you that you shouldn't have to rewrite those URLs on your end. On the other hand, it's usually best practice to make sure both versions of a URL (with and without the trailing slash) resolve to the same page, for a few reasons:
- Search bots, including Google, will often "explore" variations of URLs for discoverability reasons - they want to make sure they are discovering all of your available content.
- People will link to you both with and without trailing slashes. If they link to you with a trailing slash and your page breaks, you could be wasting link equity, to say nothing of the bad user experience for people visiting your site from those referral links.
- For one reason or another, it's common for URLs to have various parameters appended (for tracking, campaigns, etc.), and these URLs are often generated by third-party services pointing at your site.
For all of these reasons, it's pretty common to either force-redirect trailing-slash URLs (via a 301) or make sure both versions resolve to the same content and use a rel=canonical tag to tell search engines that these are indeed meant to be the same page.
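For example (a sketch with a made-up URL), a product page served identically at example.com/widgets and example.com/widgets/ would include the same tag in the head of both versions:

```
<!-- identical on /widgets and /widgets/ -->
<link rel="canonical" href="https://www.example.com/widgets" />
```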
If that isn't feasible, and URLs ending in a slash really are different pages, you might want to carefully consider what those pages deliver to both humans and bots, because it seems inevitable that both will eventually crawl and stumble upon them.
Perhaps not the answer you were looking for, but I hope it helps.