Long URLs due to foreign characters
-
I have a site which provides forum sections for various languages. When foreign characters are used in the post title, each letter is replaced by a three-character sequence such as %93. This conversion makes the URLs long.
The site's software automatically uses the thread's title in the URL. It is never a problem except in these instances.
Any suggestions on how to handle this issue?
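To make the length problem concrete, here is a minimal PHP sketch of what percent-encoding does to a non-ASCII title. The title and the helper name encoded_segment() are made up for illustration, and the file is assumed to be saved as UTF-8. In UTF-8, each Cyrillic letter is two bytes, and each byte turns into three URL characters.

```php
<?php
// encoded_segment() is a hypothetical helper, not part of any forum software.
function encoded_segment(string $title): string
{
    // rawurlencode() percent-encodes every byte that is not an
    // unreserved URL character (letters, digits, "-", "_", ".", "~").
    return rawurlencode($title);
}

// Hypothetical thread title: Russian for "Hello world" (10 characters, 19 bytes).
$title = 'Привет мир';
$segment = encoded_segment($title);

echo $segment, "\n";
echo strlen($title), " bytes in the title\n";
echo strlen($segment), " characters in the URL segment\n"; // 57
```

Nine two-byte letters give 54 characters and the space gives three more (%20), so a ten-character title ends up as 57 characters in the URL.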
-
Thank you John.
The solution you offered works if a site is geared toward one particular language. The site I am working with has language-dedicated forums covering more than a dozen languages. The end solution will need to adjust for all of them.
I will speak to the forum software about your idea and hopefully we can build something off your suggestion. Thanks for taking the time to share your experience.
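One possible way to make a replacement-table approach work across several language forums is to keep one table per language and pick it by the forum's language code. This is only a rough sketch: the language codes, the tiny tables, and the function name transliterate_for() are all assumptions for illustration, not anything the forum software provides.

```php
<?php
// Hypothetical per-language tables, keyed by a forum language code.
// The mappings are illustrative and far from complete.
$tables_by_language = array(
    'de' => array('ä'=>'ae', 'ö'=>'oe', 'ü'=>'ue', 'ß'=>'ss'),
    'tr' => array('ö'=>'o', 'ü'=>'u', 'ş'=>'s', 'ç'=>'c', 'ğ'=>'g', 'ı'=>'i'),
    'ru' => array('п'=>'p', 'р'=>'r', 'и'=>'i', 'в'=>'v', 'е'=>'e', 'т'=>'t'),
);

function transliterate_for(string $title, string $language, array $tables): string
{
    // Languages without a table are returned unchanged.
    $table = $tables[$language] ?? array();
    return strtr($title, $table);
}

echo transliterate_for('über', 'de', $tables_by_language), "\n";   // ueber
echo transliterate_for('über', 'tr', $tables_by_language), "\n";   // uber
echo transliterate_for('привет', 'ru', $tables_by_language), "\n"; // privet
```

The same letter can need a different replacement depending on the language (ü becomes "ue" in German but "u" in Turkish), which is why a single shared table may not be enough.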
-
You should have a meta tag for the page language (adjust language code as needed):
As far as the URLs go, many sites convert these characters to unescaped ASCII variants on save. Magento, for example, treats e, é, and ê as e in the URL. Check out Lemonde.fr, a French news source; they simply strip the accents as well.
To adjust for the accents, you would need to transliterate them. First, find the function that is generating the URL. Next, if your system has the iconv() function:
$new_url = iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $old_url);
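A slightly more defensive version of that call might look like the sketch below. The helper name to_ascii() is hypothetical, and the suffix order //TRANSLIT//IGNORE is an assumption based on common usage. What iconv() produces for accented letters depends on the iconv implementation and the server locale, and some builds reject //TRANSLIT entirely, so the sketch checks for failure and falls back to simply dropping non-ASCII bytes.

```php
<?php
// to_ascii() is a hypothetical helper name, not part of any forum software.
function to_ascii(string $text): string
{
    if (function_exists('iconv')) {
        // Suppress warnings: some iconv builds do not support //TRANSLIT
        // and return false instead of a converted string.
        $converted = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);
        if ($converted !== false) {
            return $converted;
        }
    }
    // Fallback: remove every byte outside the ASCII range.
    return (string) preg_replace('/[^\x00-\x7F]/', '', $text);
}

// The exact output varies by platform and locale, so test on your own server.
echo to_ascii('Crème brûlée'), "\n";
```

Whatever the platform does with the accents, the result contains only ASCII characters, so it will not be percent-encoded into long sequences.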
If not... then you could go this sort of route:
$table = array(
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z',
'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'Ae',
'Å'=>'A', 'Æ'=>'Ae', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I',
'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'Oe', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U',
'Û'=>'U', 'Ü'=>'Ue', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'ae',
'å'=>'a', 'æ'=>'ae', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i',
'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'oe', 'ø'=>'o', 'ù'=>'u',
'ú'=>'u', 'û'=>'u', 'ü'=>'ue', 'ý'=>'y',
'þ'=>'b', 'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r', 'Ā'=>'A',
'ā'=>'a', 'Ē'=>'E', 'ē'=>'e', 'Ī'=>'I', 'ī'=>'i',
'Ō'=>'O', 'ō'=>'o', 'Ū'=>'U', 'ū'=>'u', 'œ'=>'oe',
'ĳ'=>'ij'
);
$new_url = strtr($old_url, $table);
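After the character replacement you would probably still want to clean up spaces and punctuation. Here is a minimal sketch of how that could be wrapped into one function. The name make_slug() and the shortened $demo_table are illustrative assumptions; in practice you would pass the full table and call this from wherever your software builds the URL.

```php
<?php
// make_slug() is a hypothetical helper, not part of any forum software.
function make_slug(string $title, array $table): string
{
    // 1. Replace accented characters using the lookup table.
    $slug = strtr($title, $table);
    // 2. Lowercase the ASCII letters.
    $slug = strtolower($slug);
    // 3. Collapse every run of other characters (spaces, punctuation,
    //    unmapped bytes) into a single hyphen.
    $slug = (string) preg_replace('/[^a-z0-9]+/', '-', $slug);
    // 4. Trim hyphens left at either end.
    return trim($slug, '-');
}

// Shortened table for the demo only.
$demo_table = array('è'=>'e', 'é'=>'e', 'û'=>'u', 'à'=>'a', 'ü'=>'ue');

echo make_slug('Crème brûlée à Zürich', $demo_table), "\n"; // creme-brulee-a-zuerich
```

Characters that are missing from the table are not transliterated; they are swallowed by the hyphen step, so the table needs an entry for every letter you want to keep.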
I'm not sure about Korean handling - perhaps someone else knows how these are being handled?
-John
-
XenForo is the forum software in use.
I was really wondering what type of replacement process would be used?
When Google crawls a Russian or Korean site, do they convert the characters? If not, is there a way of telling Google "hey, this title is from the Russian forums so please use the Russian alphabet"?
If they do still convert the characters, how do other countries handle this change? The title length would be reduced by two-thirds.
-
Hey Ryan-
What software are you using?
Depending on your coding experience, you may be able to set up replacements for the foreign characters and override the URL generating function.
Just let me know, I may be able to help you out.
-John