PHP Classes

Unicode

Subject: Unicode
Summary: Still no unicode support?
Messages: 5
Author: Andre Polykanine A.K.A. Menelion Elensúlë
Date: 2014-08-04 14:13:46
Update: 2014-08-06 20:37:34
 

  1. Unicode
Andre Polykanine A.K.A. Menelion Elensúlë - 2014-08-04 19:57:20
As far as I can see, there is still not a word about good native Unicode support. The performance gains are great, but the lack of Unicode support drags down the level of the language as a whole. Personally, I like PHP, which makes this doubly painful for me.

  2. Re: Unicode
Manuel Lemos - 2014-08-04 20:08:39 - In reply to message 1 from Andre Polykanine A.K.A. Menelion Elensúlë
Nothing has been said about Unicode, at least for PHP 7. Maybe somebody brave will tackle that problem again in PHP 8.

I remember Rasmus mentioning they may have a go at it in the future using a simpler library than ICU, but that is all I can remember.

  3. Re: Unicode
Joeri Sebrechts - 2014-08-06 07:30:13 - In reply to message 1 from Andre Polykanine A.K.A. Menelion Elensúlë
To be fair, you don't need it. PHP has basically the same level of Unicode support as C and C++. The built-in strings are byte arrays, and you can use ICU (the intl extension), iconv, or mbstring to treat them as Unicode wherever it matters that one character != one byte.
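To make the "one character != one byte" point concrete, here is a minimal sketch. It is written in Python rather than PHP purely for illustration, and the sample string and variable names are assumptions of this example. The byte count corresponds to what a byte-oriented function such as PHP's strlen() would report, and the character count to what a character-aware function such as mb_strlen() would report for UTF-8 input.

```python
# Illustration only: a UTF-8 string viewed as bytes versus as characters.
# The sample text is an assumption of this sketch, not taken from any API.
text = "Elens\u00fal\u00eb"  # "Elensúlë" with precomposed accented letters

utf8_bytes = text.encode("utf-8")  # the raw byte array, comparable to a PHP string
byte_count = len(utf8_bytes)       # what a byte-oriented length function sees
char_count = len(text)             # what a character-aware length function sees

print(byte_count)  # 10, because "ú" and "ë" take two bytes each in UTF-8
print(char_count)  # 8

# Cutting at a byte offset can split a multi-byte character in half.
truncated = utf8_bytes[:6]  # "Elens" plus only the first byte of "ú"
try:
    truncated.decode("utf-8")
    cut_is_valid = True
except UnicodeDecodeError:
    cut_is_valid = False

print(cut_is_valid)  # False
```

The last step is the practical risk: byte-based truncation or substring operations can leave invalid UTF-8 behind, which is where the character-aware extensions come in.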

Admittedly, it is annoying that sort() can't actually sort UTF-8 properly on Windows machines, but the Collator class in intl now gives you a cross-platform sorting solution, so the gaps have been filled.
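A rough sketch of why plain sorting goes wrong on accented text, again in Python for illustration only. The word list and the crude_key helper are hypothetical, and the accent-stripping key is a deliberately crude stand-in: it is not real locale-aware collation of the kind ICU provides.

```python
import unicodedata

# Hypothetical sample data for this sketch.
words = ["zebra", "\u00e9clair", "apple", "\u00c9mile"]  # éclair, Émile

# A plain sort compares code points, which gives the same order as
# comparing the UTF-8 bytes: accented letters land after "z".
naive = sorted(words)

def crude_key(word):
    """Approximate sort key: strip accents and ignore case (NOT real collation)."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

approx = sorted(words, key=crude_key)

print(naive)   # ['apple', 'zebra', 'Émile', 'éclair']
print(approx)  # ['apple', 'éclair', 'Émile', 'zebra']
```

Real collation rules differ per language, which is why a dedicated collator is the better tool in production code.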

So, yeah, it's a bit awkward to work with Unicode, and you need to know what you're doing, but nothing is missing that would stop you from handling Unicode perfectly. See this presentation I made, which explains how to work with strings in PHP: http://sebrechts.net/slides/strings/

  4. Re: Unicode
Andre Polykanine A.K.A. Menelion Elensúlë - 2014-08-06 20:12:37 - In reply to message 3 from Joeri Sebrechts
Of course I use mbstring. However, I believe it's slower than native Unicode support in the language core would be. Am I wrong?

  5. Re: Unicode
Manuel Lemos - 2014-08-06 20:37:34 - In reply to message 4 from Andre Polykanine A.K.A. Menelion Elensúlë
I think manipulating text in any multi-byte encoding is slower than manipulating text in a regular single-byte encoding.

If I am not mistaken, the original PHP 6 plans were using UTF-16 to manipulate all text strings.

Since UTF-16 uses at least two bytes per character, this means that single-byte text would take more memory and be slower to manipulate than what we have today. That could hurt PHP speed in general.

So I am afraid the transparent Unicode support that some developers desire comes at a price in speed, memory usage, or both.
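The memory side of that trade-off can be sketched with a quick size comparison. This is a Python illustration under stated assumptions: the sample strings are made up for the example, and it only measures encoded size. It says nothing about how the PHP 6 plans would actually have stored strings, nor about real-world speed.

```python
# Illustration only: storage size of the same text in UTF-8 and UTF-16.
# "utf-16-le" is used so no byte-order mark is added to the count.
ascii_text = "PHP 7 Features"           # pure ASCII sample (assumed)
accented_text = "Elens\u00fal\u00eb"    # "Elensúlë", two accented letters

ascii_utf8 = len(ascii_text.encode("utf-8"))
ascii_utf16 = len(ascii_text.encode("utf-16-le"))
accented_utf8 = len(accented_text.encode("utf-8"))
accented_utf16 = len(accented_text.encode("utf-16-le"))

print(ascii_utf8, ascii_utf16)        # 14 28: ASCII text doubles in size
print(accented_utf8, accented_utf16)  # 10 16: still larger, but less than double
```

For mostly-ASCII content, which covers a lot of what PHP handles (markup, identifiers, English text), a UTF-16 representation roughly doubles the bytes that have to be stored and moved around.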