Long URLs due to foreign characters
-
I have a site which provides forum sections for various languages. When foreign characters are used in the post title, each letter is replaced in the URL by one or more three-character escape sequences such as %93. This conversion makes the URLs long.
The site's software automatically uses the thread's title in the URL. It is never a problem except in these instances.
Any suggestions on how to handle this issue?
-
Thank you John.
The solution you offered works if a site is geared toward one particular language. The site I am working with has language-dedicated forums covering more than a dozen languages. The end solution will need to adjust for all of them.
I will speak to the forum software's developers about your idea, and hopefully we can build something off your suggestion. Thanks for taking the time to share your experience.
-
You should have a meta tag for the page language (adjust language code as needed):
As far as the URLs go... many sites convert these characters to unaccented ASCII variants on save. Magento, for example, treats e, é, and ê as e in the URL. Check out Lemonde.fr, a French news source. They are just stripping the accents as well.
To adjust for the accents, you would need to transliterate them. First, find the function that is generating the URL. Next, if your system has the iconv() function:
// //TRANSLIT goes before //IGNORE; the exact output depends on the iconv implementation and the current locale.
$new_url = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $old_url);
If not... then you could go this sort of route:
$table = array(
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z',
'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'Ae',
'Å'=>'A', 'Æ'=>'Ae', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'Oe', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'Ue', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'ae',
'å'=>'a', 'æ'=>'ae', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'oe', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'ue', 'ý'=>'y', 'þ'=>'b',
'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r', 'Ā'=>'A', 'ā'=>'a',
'Ē'=>'E', 'ē'=>'e', 'Ī'=>'I', 'ī'=>'i', 'Ō'=>'O',
'ō'=>'o', 'Ū'=>'U', 'ū'=>'u', 'œ'=>'oe'
);
// With an array argument, strtr() replaces every key it finds with the matching value.
$new_url = strtr($old_url, $table);
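A transliteration table on its own still leaves spaces and punctuation in the title. As a rough sketch only, the table approach above could be wrapped in a small helper that also lowercases the result and collapses everything else into hyphens. The function name make_slug() and the shortened table are assumptions made for this illustration; they are not part of XenForo or any other forum software.

```php
<?php
// Hypothetical helper for illustration; not part of any forum software.
function make_slug(string $title, array $table): string
{
    // Step 1: transliterate the characters listed in the table.
    $slug = strtr($title, $table);
    // Step 2: lowercase the remaining ASCII letters.
    $slug = strtolower($slug);
    // Step 3: collapse every run of bytes outside a-z and 0-9 into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    // Step 4: drop hyphens left at either end.
    return trim($slug, '-');
}

// Deliberately short table for the example; a real one would use the full table.
$table = array('è' => 'e', 'é' => 'e', 'û' => 'u', 'ü' => 'ue', 'ß' => 'ss');

echo make_slug('Crème Brûlée & Süßes', $table), "\n"; // prints "creme-brulee-suesses"
```

Note that any character missing from the table is treated as a separator, so a title written entirely in Cyrillic or Hangul would come back as an empty string. A site covering many languages would need either a much larger table or a fallback for those scripts.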
I'm not sure about Korean handling - perhaps someone else knows how these are being handled?
-John
-
XenForo is the forum software in use.
I was really wondering what type of replacement process would be used?
When Google crawls a Russian or Korean site, do they convert the characters? If not, is there a way of telling Google "hey, this title is from the Russian forums so please use the Russian alphabet?"
If they do still convert the characters, how do other countries handle this change? The title length would be reduced by two-thirds.
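To get a feel for the numbers, here is a small sketch that percent-encodes a few example titles with PHP's rawurlencode() and compares lengths. The titles are arbitrary examples, and the sketch assumes the strings are UTF-8, where a Cyrillic letter takes two bytes and a Korean syllable takes three, so each becomes six or nine characters once escaped.

```php
<?php
// Example titles chosen for illustration only; the file is assumed to be saved as UTF-8.
$titles = array('hello', 'café', 'Привет', '안녕하세요');

foreach ($titles as $title) {
    // rawurlencode() escapes every byte outside the unreserved ASCII set as %XX.
    $encoded = rawurlencode($title);
    printf(
        "%s: %d bytes before, %d characters after (%s)\n",
        $title,
        strlen($title),
        strlen($encoded),
        $encoded
    );
}
```

For these inputs the escaped form is three times the byte length of the non-ASCII part, which is where the long URLs come from.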
-
Hey Ryan-
What software are you using?
Depending on your coding experience, you may be able to set up replacements for the foreign characters and override the URL generating function.
Just let me know, I may be able to help you out.
-John