Long URLs due to foreign characters
-
I have a site which provides forum sections for various languages. When foreign characters are used in the post title, each letter is replaced by a three-character escape sequence such as %93. This conversion makes the URLs long.
The site's software automatically uses the thread's title in the URL. It is never a problem except in these instances.
Any suggestions on how to handle this issue?
-
Thank you John.
The solution you offered works if a site is geared toward one particular language. The site I am working with has language-dedicated forums covering more than a dozen languages. The end solution will need to handle all of them.
I will speak to the forum software developers about your idea and hopefully we can build something off your suggestion. Thanks for taking the time to share your experience.
-
You should declare the page language with a meta tag (adjust the language code as needed).
As far as the URLs go, many sites convert these characters to unaccented ASCII equivalents when the page is saved. Magento, for example, treats e, é, and ê as e in the URL. Check out Lemonde.fr, a French news source: they just strip the accents as well.
To adjust for the accents, you would need to transliterate them. First, find the function that generates the URL. Next, if your system has the iconv() function:
$new_url = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $old_url);
If not, you could use a lookup table along these lines:
$table = array(
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z',
'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'Ae',
'Å'=>'A', 'Æ'=>'Ae', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'Oe', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'Ue', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'ae',
'å'=>'a', 'æ'=>'ae', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'oe', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'ue', 'ý'=>'y',
'þ'=>'b', 'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r', 'Ā'=>'A',
'ā'=>'a', 'Ē'=>'E', 'ē'=>'e', 'Ī'=>'I', 'ī'=>'i',
'Ō'=>'O', 'ō'=>'o', 'Ū'=>'U', 'ū'=>'u', 'œ'=>'oe',
'ĳ'=>'ij'
);
$new_url = strtr($old_url, $table);
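As a rough sketch of how the table approach could be wrapped into a slug generator: the function name slugify_title() and the abbreviated table below are made up for illustration, and this is not how XenForo or any other package builds its URLs. It only handles Latin-based characters that appear in the table.

```php
<?php
// Hypothetical helper (not part of any forum software): turn a title into an ASCII slug.
function slugify_title(string $title, array $table): string
{
    // Swap accented characters for their ASCII stand-ins from the lookup table.
    $ascii = strtr($title, $table);
    // Lowercase, then collapse every run of bytes that are not a-z or 0-9 into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', strtolower($ascii));
    // Drop any leading or trailing hyphens.
    return trim($slug, '-');
}

// Abbreviated table for the example; in practice use the full table above.
$table = array('è' => 'e', 'é' => 'e', 'û' => 'u', 'ü' => 'ue', 'ä' => 'ae');

echo slugify_title('Crème brûlée für Gäste', $table), "\n"; // prints "creme-brulee-fuer-gaeste"
```

Note that a title written entirely in a non-Latin script (Russian, Korean) comes back as an empty string with this approach, so the calling code would need a fallback, such as a numeric thread ID if the software stores one.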
I'm not sure about Korean handling - perhaps someone else knows how these are being handled?
-John
-
XenForo is the forum software in use.
I was really wondering what type of replacement process would be used?
When Google crawls a Russian or Korean site, do they convert the characters? If not, is there a way of telling Google "hey, this title is from the Russian forums so please use the Russian alphabet?"
If they do still convert the characters, how do other countries handle this change? The title length would be reduced by two-thirds.
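To put numbers on the length difference, here is a small sketch using PHP's rawurlencode() and rawurldecode(). The sample titles are made up, and the byte counts assume the file is saved as UTF-8. Each non-ASCII byte becomes a three-character %XX sequence, so a Cyrillic letter (two bytes) takes six URL characters and a Korean syllable (three bytes) takes nine.

```php
<?php
// Compare each title's UTF-8 byte length with the length of its percent-encoded form.
$titles = array('Привет', '안녕하세요', 'café');

foreach ($titles as $title) {
    $encoded = rawurlencode($title);
    printf("%s: %d bytes -> %d URL characters\n", $title, strlen($title), strlen($encoded));
}
// Привет: 12 bytes -> 36 URL characters
// 안녕하세요: 15 bytes -> 45 URL characters
// café: 5 bytes -> 9 URL characters

// Percent-encoding is reversible: decoding gives back the original title.
var_dump(rawurldecode(rawurlencode('Привет')) === 'Привет'); // bool(true)
```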
-
Hey Ryan-
What software are you using?
Depending on your coding experience, you may be able to set up replacements for the foreign characters and override the URL generating function.
Just let me know, I may be able to help you out.
-John