Mar 29 2013

Multibyte PHP substring without the mbstring extension

In the rare case where the multibyte string (mbstring) extension is not enabled and cannot, for whatever reason, be enabled, it becomes difficult to take a substring of text containing international characters. Byte-based functions such as substr can cut a multibyte character in half and turn the end of the string into gibberish.

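There is a way around this using regular expressions and preg_match. If “(*UTF8)” is placed at the start of the pattern, or the “/u” modifier is added to it, then preg_match treats the string as UTF-8, so “.” matches a whole character rather than a single byte and the match returns the desired substring. Two examples follow; each leaves up to the first 20 characters in $result_array[0].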


preg_match('/(*UTF8)^.{1,20}/',$multibyte_string,$result_array); 

preg_match('/^.{1,20}/u',$multibyte_string,$result_array); 
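
The same idea can be wrapped in a small helper that also accepts a starting offset. This is only a sketch: the function name utf8_substr_fallback and its parameters are made up for illustration and are not part of PHP. It adds the “s” modifier so that “.” also matches newlines, which the two one-liners above do not do, and it assumes both numbers stay below PCRE's repeat limit of 65535.

```php
<?php
// Hypothetical helper (not a built-in): returns up to $length UTF-8
// characters of $string, starting at character offset $start.
function utf8_substr_fallback($string, $start, $length)
{
    $start = (int) $start;
    $length = (int) $length;
    if ($start < 0 || $length <= 0) {
        return '';
    }
    // "u" makes PCRE treat pattern and subject as UTF-8;
    // "s" lets "." match newline characters as well.
    $pattern = '/^.{' . $start . '}(.{1,' . $length . '})/us';
    if (preg_match($pattern, $string, $matches) === 1) {
        return $matches[1];
    }
    // No match: the string has no characters at $start,
    // or it is not valid UTF-8 (preg_match returns false).
    return '';
}

echo utf8_substr_fallback('Ünïcödé strïng', 0, 7), "\n";
```

An empty string comes back both when the offset is past the end and when the input is not valid UTF-8, so a caller that needs to tell those cases apart would have to check preg_last_error() as well.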

Ideally the mbstring functions should be used, but this serves when that is not possible.
