Long URLs due to foreign characters
-
I have a site which provides forum sections for various languages. When foreign characters are used in the post title, each letter is replaced by a three-character escape sequence such as %93. This conversion makes the URLs long.
The site's software automatically uses the thread's title in the URL. It is never a problem except in these instances.
Any suggestions on how to handle this issue?
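To illustrate the length problem described above, here is a minimal sketch using PHP's rawurlencode(). The title is a made-up example, and the snippet assumes the source file is saved as UTF-8. For Cyrillic text, each letter occupies two bytes in UTF-8, and each byte becomes its own three-character sequence.

```php
<?php
// Hypothetical thread title; any non-ASCII text shows the same effect.
$title = 'привет';

// rawurlencode() percent-encodes every byte outside the unreserved ASCII set.
$encoded = rawurlencode($title);

echo $encoded, "\n";          // %D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82
echo strlen($title), "\n";    // 12 (bytes of UTF-8 for 6 letters)
echo strlen($encoded), "\n";  // 36 (characters in the URL)
```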
-
Thank you John.
The solution you offered works if a site is geared toward one particular language. The site I am working with has language-dedicated forums covering more than a dozen languages. The end solution will need to adjust for all of them.
I will speak to the forum software's developers about your idea and hopefully we can build something off your suggestion. Thanks for taking the time to share your experience.
-
You should have a meta tag for the page language (adjust language code as needed):
As far as the URLs go... many sites are converting these to non-escaped variants on save. Magento, for example, treats e, é, and ê as e in the URL. Check out Lemonde.fr, a French news source. They are just stripping the accents as well.
To adjust for the accents, you would need to transliterate them. First, find the function that is generating the URL. Next, if your system has the iconv() function:
$new_url = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $old_url);
If not... then you could go this sort of route:
$table = array(
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z',
'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'Ae',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'Oe', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'Ue', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'ae',
'å'=>'a', 'æ'=>'ae', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'oe', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'ue', 'ý'=>'y',
'þ'=>'b', 'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r', 'Ā'=>'A',
'ā'=>'a', 'Ē'=>'E', 'ē'=>'e', 'Ī'=>'I', 'ī'=>'i',
'Ō'=>'O', 'ō'=>'o', 'Ū'=>'U', 'ū'=>'u', 'œ'=>'oe',
'ĳ'=>'ij'
);
// Replace every mapped character in the URL with its ASCII stand-in.
$new_url = strtr($old_url, $table);
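As a rough sketch of how the table approach could feed a URL generator: the function name title_to_slug() is hypothetical, the table here is a shortened stand-in for the full one above, and the hyphen-separated, lowercase output format is an assumption rather than what any particular forum software produces.

```php
<?php
/**
 * Hypothetical helper: turn a thread title into an ASCII URL slug.
 * $table is a character map like the one above (shortened here).
 */
function title_to_slug(string $title, array $table): string
{
    // Replace each mapped character with its ASCII stand-in.
    $ascii = strtr($title, $table);
    // Lowercase the ASCII letters.
    $ascii = strtolower($ascii);
    // Collapse every run of other characters into a single hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $ascii);
    // Drop hyphens left at either end.
    return trim($slug, '-');
}

$table = array('é' => 'e', 'à' => 'a', 'ç' => 'c', 'ü' => 'ue', 'ß' => 'ss');

echo title_to_slug('Café français, déjà vu', $table), "\n"; // cafe-francais-deja-vu
```

Any character missing from the table is treated as a separator, so titles in scripts the table does not cover (Cyrillic, Korean) would come out empty with this sketch.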
I'm not sure about Korean handling - perhaps someone else knows how these are being handled?
-John
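For scripts with no accent-stripping equivalent, one possible route is PHP's intl extension, which exposes ICU transliteration through transliterator_transliterate(). This is only a sketch: the helper name romanize_title() is hypothetical, it assumes the intl extension may or may not be installed, and the exact romanized output depends on the ICU version, so none is shown here.

```php
<?php
/**
 * Hypothetical helper: romanize a title with the intl extension when it is
 * available, and return the input unchanged when it is not.
 */
function romanize_title(string $title): string
{
    if (!function_exists('transliterator_transliterate')) {
        return $title; // intl not loaded; the caller needs another fallback
    }
    // 'Any-Latin' converts other scripts to Latin letters;
    // 'Latin-ASCII' then reduces accented Latin letters to plain ASCII.
    $result = transliterator_transliterate('Any-Latin; Latin-ASCII', $title);
    // The function returns false on failure; keep the original in that case.
    return $result === false ? $title : $result;
}

echo romanize_title('Привет мир'), "\n";
echo romanize_title('안녕하세요'), "\n";
```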
-
XenForo is the forum software in use.
I was really wondering what type of replacement process would be used?
When Google crawls a Russian or Korean site, do they convert the characters? If not, is there a way of telling Google "hey, this title is from the Russian forums so please use the Russian alphabet?"
If they do still convert the characters, how do other countries handle this change? The title length would be reduced by two-thirds.
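One alternative to converting the characters is to keep the native letters in the slug and let percent-encoding serve only as the transport form. The sketch below assumes a hypothetical helper, unicode_slug(), and shows that the encoded URL decodes back to the original text; how a given search engine or browser displays such a URL is outside what this code demonstrates.

```php
<?php
/**
 * Hypothetical helper: build a slug that keeps letters and digits from any
 * script and replaces everything else with hyphens.
 */
function unicode_slug(string $title): string
{
    // \p{L} matches letters and \p{N} digits in any script; /u enables UTF-8 mode.
    $slug = preg_replace('/[^\p{L}\p{N}]+/u', '-', $title);
    if ($slug === null) {
        return ''; // preg_replace() returns null on invalid UTF-8 input
    }
    return trim($slug, '-');
}

$slug = unicode_slug('Привет, мир!');
echo $slug, "\n";               // Привет-мир
echo rawurlencode($slug), "\n"; // the percent-encoded form sent over the wire
var_dump(rawurldecode(rawurlencode($slug)) === $slug); // bool(true)
```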
-
Hey Ryan-
What software are you using?
Depending on your coding experience, you may be able to set up replacements for the foreign characters and override the URL generating function.
Just let me know, I may be able to help you out.
-John