Long URLs due to foreign characters
-
I have a site which provides forum sections for various languages. When foreign characters are used in the post title, each letter is replace by a three character replacement such as %93. This conversion makes the URLs long.
The site's software automatically uses the thread's title in the URL. It is never a problem except in these instances.
Any suggestions on how to handle this issue?
-
Thank you John.
The solution you offered works if a site is geared for one particular language. The site I am working with has language dedicated forums covering more then a dozen languages. The end solution will need to adjust for all of them.
I will speak to the forum software about your idea and hopefully we can build something off your suggestion. Thanks for taking the time to share your experience.
-
You should have a meta tag for the page language (adjust language code as needed):
As far as the URLs go... many sites are converting these to non-escaped variants on save. Magento, for example, treats e, é, and ê as e in the url. Check out Lemonde.fr, french news source. They are just stripping the accents as well.
To adjust for the accents, you would need to transliterate them. First, find the function that is generating the URL. Next, if your system allows has the iconv() function:
$new_url = iconv('utf-8', 'us-ascii//IGNORE//TRANSLIT', $old_url);
If not... then you could go this sort of route:
$table = array(
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z',
'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'Ae',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'Oe', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'Ue', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'ae',
'å'=>'a', 'æ'=>'ae', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'oe', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'ue', 'ý'=>'y', 'ý'=>'y',
'þ'=>'b', 'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r', 'Ā'=>'A',
'ā'=>'a', 'Ē'=>'E', 'ē'=>'e', 'Ī'=>'I', 'ī'=>'i',
'Ō'=>'O', 'ō'=>'o', 'Ū'=>'U', 'ū'=>'u', 'œ'=>'oe',
'ß'=>'ss', 'ij'=>'ij'
); $new_url = strtr($old_url, $table);
I'm not sure about Korean handling - perhaps someone else knows how these are being handled?
-John
-
XenForo is the forum software in use.
I was really wondering what type of replacement process would be used?
When Google crawls a russian or korean site, do they convert the characters? If not, is there a way of telling Google "hey, this title is from the Russian forums so please use the Russian alphabet?"
If they do still convert the characters, how do other countries handle this change? The title length would be reduced by two-thirds.
-
Hey Ryan-
What software are you using?
Depending on your coding experience, you may be able to set up replacements for the foreign characters and override the URL generating function.
Just let me know, I may be able to help you out.
-John
Got a burning SEO question?
Subscribe to Moz Pro to gain full access to Q&A, answer questions, and ask your own.
Browse Questions
Explore more categories
-
Moz Tools
Chat with the community about the Moz tools.
-
SEO Tactics
Discuss the SEO process with fellow marketers
-
Community
Discuss industry events, jobs, and news!
-
Digital Marketing
Chat about tactics outside of SEO
-
Research & Trends
Dive into research and trends in the search industry.
-
Support
Connect on product support and feature requests.
Related Questions
-
Googlebot Reports All URLs as Unreachable
Webmaster Tools is reporting that all of our site's URLs - located www.zuken.com - are "unreachable." The URLs display correctly in all browsers. We recently switched hosting providers, but they assure us there is no security setting that would be causing this issue. Any ideas? Is this a glitch with Webmaster Tools?
Web Design | | Zuken0 -
Help with error: Not Found The requested URL /java/backlinker.php was not found on this server.
Hi all, We got this error for almost a month now. Until now we were outsourcing the webdesign and optimization, and now we are doing it in house, and the previous company did not gave us all the information we should know. And we've been trying to find this error and fix it with no result. Have you encounter this issue before? Did anyone found or knows a solution? Also would this affect our website in terms of SEO and in general. Would be very grateful to hear from you. Many thanks. Here is what appears on the bottom of the site( www.manvanlondon.co.uk) Not Found The requested URL /java/backlinker.php was not found on this server. <address>Apache/2.4.7 (Ubuntu) Server at 01adserver.com Port 80</address> <address> </address> <address> </address>
Web Design | | monicapopa0 -
Is there a way to redirect URLs with a hash-bang (#!) format?
Hi Moz, I'm trying to redirect www.site.com/locations/#!city to www.site.com/locations/city. This seems difficult because anything after the hash character in the URL does not make it to the server thus cannot be parsed for rewriting. Is there an SEO friendly way to implement these redirects? Thanks for reading!
Web Design | | DA20130 -
Are URL suffixes ignored by Google? Or is this duplicate content?
Example URLs: www.example.com/great-article-on-dog-hygiene.html www.example.com/great-article-on-dog-hygiene.rt-article.html My IT dept. tells me the second instance of this article would be ignored by Google, but I've found a couple of instances in which Google did index the 'rt-article.html' version of the page. To be fair, I've only found a couple out of MANY. Is it an issue? Thanks, Trisha
Web Design | | lzhao0 -
Need help in website URL Structure
I have been working on a brand new website currently it is live but I have disallow Googlebots temporarily as I dint want any negative impact. The business of the site is to generate leads , they install and sell Stairlifts and used Stairlifts. There are two main categories New Stairlifts and Reconditioned Stairlifts Currently the URL for new Stairlifts is : http://willowstairlifts.co.uk/stairlifts/ and for Reconditioned Stairlifts is: http://willowstairlifts.co.uk/reconditioned-stairlifts/ My concerns are that the word Stairlifts is mentioned twice in the urls so is it going to have a negative impact or panda penalty? I am thinking of changing them to http://willowstairlifts.co.uk/new/ and the product pages to display as http://willowstairlifts.co.uk/new/brooks/ Currently its http://willowstairlifts.co.uk/stairlifts/brooks/ Same with reconditioned Stairlifts I like to change it to : http://willowstairlifts.co.uk/reconditioned Also its product pages to http://willowstairlifts.co.uk/reconditioned/brooks/ As currently its http://willowstairlifts.co.uk/reconditioned-stairlifts/brooks/ Thanks
Web Design | | conversiontactics0 -
URL structure for multiple cities?
Hi, i am in the process of setting up a business directory site that will be used in a number of cities, though i am initially launching with only one city. My question is, what is the best URL structure to use for the site and should i start using this URL structure from day one? At the moment i am using www.mysite.com.au as my primary website where it contains all listings for the the one initial launch city. Though to plan for the future i was considering this URL structure: www.mysite.com.au/cityname so for example, if i launch in the city Sydney initially then all website traffic that goes to www.mysite.com.au would simply be redirected (302 temp redirect?) to www.mysite.com.au/sydney. When i expand to other cities www.mysite.com.au would simply be a "select your city" screen that then redirects to the city of choice (similar to www.groupon.com page). How would doing a 302 redirect from www.mysite.com.au to www.mysite.com.au/city impact on SEO for the initial launch? Or should i just place this on the root domain since no other cities exist at the moment?
Web Design | | adamkirk0 -
Optimzing a new ecommerce site, Need help with URL
Hi We are putting up a new ecommerce website and for product description, our tech team indicates that they must have the skun numbers in the URL. Which one of the following URL structure do you find the most SEO freindly? 1. http://www.Site.com/SKUNumber/ProductDescription/ or 2. http://www.Site.com/ProductDescription/SKUNumber/ My personal opinion is that most relevant content should be on load page so I like option 1. Thanks
Web Design | | CookingCom0 -
Javascript changing URL - Thoughts?
So, our developer just created a player at the bottom of this site I work for. It's not really important what it is. The thing is, when you go to our home page now, the javascript changes the url from www.site.com to www.site.com/home It's not actually redirected or anything (no 301, it's just the javascript doing this), but I'm worried that if someone links back to our site they're going to surely pull that URL to point back to, which is wrong. Also, when you go to a category, the URL changes from www.site.com/category to www.site.com/home#category. Again, it's not a redirect but I'm still worried people will link back to this since it's on the entire site now... I'm suggesting that we turn off this new feature until we find a workaround. I just wanted to confirm with you guys that this is best. Thanks
Web Design | | poolguy0