Long URLs due to foreign characters
-
I have a site that provides forum sections for various languages. When foreign characters are used in a post title, each letter is replaced by a three-character escape sequence such as %93. This conversion makes the URLs long.
The site's software automatically uses the thread's title in the URL. It is never a problem except in these instances.
Any suggestions on how to handle this issue?
-
Thank you John.
The solution you offered works if a site is geared toward one particular language. The site I am working with has language-dedicated forums covering more than a dozen languages. The end solution will need to adjust for all of them.
I will speak to the forum software about your idea and hopefully we can build something off your suggestion. Thanks for taking the time to share your experience.
-
You should have a meta tag for the page language (adjust language code as needed):
As far as the URLs go... many sites are converting these to non-escaped variants on save. Magento, for example, treats e, é, and ê as e in the URL. Check out Lemonde.fr, a French news source. They are just stripping the accents as well.
To adjust for the accents, you would need to transliterate them. First, find the function that is generating the URL. Next, if your system has the iconv() function:
$new_url = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $old_url);
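What iconv() produces for a given accent depends on the server's iconv implementation and locale, and on some setups the call returns false instead of a string. A minimal sketch of a guarded version, assuming a hypothetical helper name to_ascii() (it is not part of any forum software):

```php
<?php
// Hypothetical helper: transliterate a UTF-8 string to plain ASCII.
function to_ascii(string $text): string
{
    if (function_exists('iconv')) {
        // TRANSLIT asks for a look-alike character; IGNORE drops what cannot be converted.
        // The @ hides the notice some iconv builds raise when characters are dropped.
        $converted = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);
        if ($converted !== false) {
            return $converted;
        }
    }
    // Fallback when iconv() is missing or fails: strip every non-ASCII byte.
    return preg_replace('/[^\x00-\x7F]/', '', $text);
}

$new_url = to_ascii('Crème brûlée');
```

Whether the result reads "Creme brulee", "Cr?me br?l?e", or "Crme brle" varies by server, so it is worth checking the output on your own host before relying on it.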
If not... then you could go this sort of route:
$table = array(
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z',
'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'Ae',
'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'Oe', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'Ue', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'ae',
'å'=>'a', 'æ'=>'ae', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'oe', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'ue', 'ý'=>'y',
'þ'=>'b', 'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r', 'Ā'=>'A',
'ā'=>'a', 'Ē'=>'E', 'ē'=>'e', 'Ī'=>'I', 'ī'=>'i',
'Ō'=>'O', 'ō'=>'o', 'Ū'=>'U', 'ū'=>'u', 'œ'=>'oe',
'ĳ'=>'ij'
);
$new_url = strtr($old_url, $table);
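In practice the strtr() call is usually one step of a larger slug routine that also lowercases the title and turns spaces and punctuation into hyphens. A minimal sketch, assuming a hypothetical helper name make_slug() and a shortened table; how XenForo builds its URLs is not shown in this thread, so hooking this in is left open:

```php
<?php
// Hypothetical helper, not part of XenForo: builds an ASCII slug from a thread title.
function make_slug(string $title): string
{
    // Shortened version of the replacement table; extend it per language.
    $table = array(
        'à'=>'a', 'é'=>'e', 'è'=>'e', 'û'=>'u', 'ü'=>'ue',
        'Ü'=>'Ue', 'ö'=>'oe', 'ñ'=>'n', 'ç'=>'c', 'ß'=>'ss',
    );
    $slug = strtr($title, $table);                     // swap accented letters for ASCII
    $slug = strtolower($slug);                         // lowercase the ASCII letters
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);  // everything else becomes a hyphen
    return trim($slug, '-');                           // no leading or trailing hyphens
}

echo make_slug('Crème brûlée à Zürich'), "\n";  // prints "creme-brulee-a-zuerich"
```

A title written entirely in a script the table does not cover (Russian or Korean, for example) comes out as an empty string here, so a fallback such as the thread ID would still be needed.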
I'm not sure about Korean handling - perhaps someone else knows how these are being handled?
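For scripts like Korean or Russian there is no accented base letter to fall back on, so it helps to see what the escaping actually does. A minimal sketch using PHP's rawurlencode(); the helper name describe_encoding() and the sample titles are made up for illustration, and this says nothing about how XenForo or Google treat such URLs:

```php
<?php
// Hypothetical helper: reports how much a title grows when percent-encoded.
function describe_encoding(string $title): array
{
    $encoded = rawurlencode($title);  // each non-ASCII UTF-8 byte becomes %XX
    return array(
        'encoded'   => $encoded,
        'bytes'     => strlen($title),    // strlen() counts bytes, not letters
        'url_chars' => strlen($encoded),
    );
}

foreach (array('Привет', '안녕') as $title) {
    $info = describe_encoding($title);
    printf("%s: %d bytes -> %d URL characters (%s)\n",
        $title, $info['bytes'], $info['url_chars'], $info['encoded']);
}
```

A Cyrillic letter is two bytes in UTF-8 and a Korean syllable is three, so each one takes six or nine characters in the escaped URL. The encoding is lossless: rawurldecode() returns the original title.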
-John
-
XenForo is the forum software in use.
I was really wondering what type of replacement process would be used?
When Google crawls a Russian or Korean site, do they convert the characters? If not, is there a way of telling Google "hey, this title is from the Russian forums so please use the Russian alphabet"?
If they do still convert the characters, how do other countries handle this change? The title length would be reduced by two-thirds.
-
Hey Ryan-
What software are you using?
Depending on your coding experience, you may be able to set up replacements for the foreign characters and override the URL generating function.
Just let me know, I may be able to help you out.
-John