Friday, May 31, 2013

Re: Wrong UTF-8 string parsing in GWT JSON



On Friday, May 31, 2013 10:07:23 AM UTC+2, Tibor Szolnoki wrote:
Dear Philippe,

You are right...
If I change the escaped ("\uXXXX") codes to UTF-16 code units, as in my example:
String response="{ \"test\" : \"\\u00c1\\u00c9\\u0170\" }"; //"ÁÉŰ" in UTF-16
everything works correctly.


But I found a strange thing too:
If I disable the "\uXXXX" escaping in the JSON writer on the server side, everything works as expected. But this is not a good idea according to RFC 4627 :(

I can't find where it says it's "not a good idea". It says all over the place that JSON "SHALL be encoded in Unicode", with a default to UTF-8, so why not just use UTF-8?
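As a rough sketch of what "just use UTF-8" means for the example string, the Java snippet below encodes the three characters to UTF-8 and decodes them again. The class name `Utf8Bytes` and the helper `hex` are made up for illustration; only standard JDK calls are used. It also shows what a decoder would see if it (hypothetically) read the same octets as ISO-8859-1.

```java
import java.nio.charset.StandardCharsets;

class Utf8Bytes {
    /** Formats octets as space-separated lowercase hex (illustrative helper). */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Three characters: U+00C1, U+00C9, U+0170.
        String text = "\u00c1\u00c9\u0170";

        // Encoding to UTF-8 yields six octets, two per character.
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(hex(utf8)); // c3 81 c3 89 c5 b0

        // Decoding with the same charset restores the original string.
        String roundTrip = new String(utf8, StandardCharsets.UTF_8);
        System.out.println(roundTrip.equals(text)); // true

        // Assumption for illustration: a reader that wrongly treats the
        // octets as ISO-8859-1 gets six characters instead of three.
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.length()); // 6
    }
}
```

As long as both ends agree on UTF-8, the unescaped form carries the same three characters as the "\uXXXX" form.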
 
In this mode, the JSON string transports the non-printable characters (0xc3, 0x81, 0xc3, 0x89, 0xc5, 0xb0) ("ÁÉŰ" in UTF-8) without any encoding....

These are bytes, not characters.
The encoding is determined by the first four bytes of the response (see RFC 4627, section 3).
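The rule from RFC 4627, section 3 can be sketched roughly as follows: because a JSON text always starts with two ASCII characters, the pattern of zero octets in the first four bytes tells the encodings apart. The class name `JsonEncodingSniffer` and method `detect` are hypothetical names for this illustration, not part of GWT or any JSON library, and falling back to UTF-8 for inputs shorter than four octets is a simplifying assumption.

```java
import java.nio.charset.StandardCharsets;

class JsonEncodingSniffer {
    /**
     * Guesses the Unicode encoding of a JSON text from the pattern of
     * zero octets in its first four bytes (RFC 4627, section 3).
     */
    public static String detect(byte[] b) {
        if (b.length < 4) {
            return "UTF-8"; // assumption: too short to inspect, use the default
        }
        boolean z0 = b[0] == 0, z1 = b[1] == 0, z2 = b[2] == 0, z3 = b[3] == 0;
        if (z0 && z1 && z2 && !z3) return "UTF-32BE";  // 00 00 00 xx
        if (z0 && !z1 && z2 && !z3) return "UTF-16BE"; // 00 xx 00 xx
        if (!z0 && z1 && z2 && z3) return "UTF-32LE";  // xx 00 00 00
        if (!z0 && z1 && !z2 && z3) return "UTF-16LE"; // xx 00 xx 00
        return "UTF-8";                                // xx xx xx xx
    }

    public static void main(String[] args) {
        String json = "{ \"test\" : \"\u00c1\u00c9\u0170\" }";
        System.out.println(detect(json.getBytes(StandardCharsets.UTF_8)));    // UTF-8
        System.out.println(detect(json.getBytes(StandardCharsets.UTF_16LE))); // UTF-16LE
        System.out.println(detect(json.getBytes(StandardCharsets.UTF_16BE))); // UTF-16BE
    }
}
```

The octets 0xc3 0x81 and so on appear only after the opening `{ "`, so they never influence this check; they are interpreted as characters only once the encoding is known.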
