Friday, May 31, 2013

Re: Wrong UTF-8 string parsing in GWT JSON


In RFC4627, JSON "String" and "Text" are two different things.

Text: a sequence of JSON tokens, with brackets, strings, quotes, etc.
(RFC4627 section 2.)

String: a basic JSON data type (a single JSON value).
(RFC4627 section 2.5.)

According to RFC4627, the text encoding and the string encoding are different things.
As you write, Text is UTF-8 by default, and its encoding is determined by the first 4 bytes (section 3).
But not the String!

A String is always Unicode, and Unicode characters may be escaped as "\uXXXX". (section 2.5)

My problem was only with the JSON string, not the whole JSON text.
(My text encoding is UTF-8, the default.)


I found my solution: I have to use Unicode characters (code points, not UTF-8 bytes) in the "\uXXXX" escapes of the JSON string. That's all...

As Philippe writes, GWT works as intended.
The GWT JSON parser interprets "\uXXXX" as a UTF-16 character.
And this is independent of the JSON text encoding, which is UTF-8.
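
To make the difference concrete, here is a minimal sketch in plain Java (not GWT code) of the two ways a server-side writer could escape "ÁÉŰ". The class and method names are hypothetical and only for illustration; I have not checked how any particular JSON writer implements this.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of GWT or any JSON library.
class JsonEscapeSketch {

    // Escapes each non-ASCII UTF-16 code unit as one backslash-u sequence.
    // This is the form a JSON parser expects inside a string.
    static String escapeNonAscii(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    // The mistake: escaping each UTF-8 byte as if it were a character.
    static String escapeUtf8Bytes(String s) {
        StringBuilder out = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xff;
            if (v < 0x80) {
                out.append((char) v);
            } else {
                out.append(String.format("\\u%04x", v));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "\u00c1\u00c9\u0170"; // the three letters from my example
        System.out.println(escapeNonAscii(text));  // three escapes, one per character
        System.out.println(escapeUtf8Bytes(text)); // six escapes, one per UTF-8 byte
    }
}
```

The first form gives three escapes for three characters; the second gives six, and a parser will turn those six into six separate characters.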




On Friday, May 31, 2013 10:32:40 AM UTC+2, Thomas Broyer wrote:


On Friday, May 31, 2013 10:07:23 AM UTC+2, Tibor Szolnoki wrote:
Dear Philippe,

You are right...
If I change the escaped ("\uXXXX") codes to UTF-16, as in my example:
String response = "{ \"test\" : \"\\u00c1\\u00c9\\u0170\" }"; // "ÁÉŰ" in UTF-16
All works correctly.
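
A rough sketch of why this form works, in plain Java: a parser turns every "\uXXXX" escape into exactly one UTF-16 code unit. The decoder below is a hypothetical stand-in that handles only these escapes; it is not the GWT parser, and it assumes well-formed input (four hex digits after each escape).

```java
// Hypothetical stand-in for the escape handling of a JSON parser.
class UnicodeEscapeDecodeSketch {

    // Decodes only backslash-u escapes; each escape becomes one UTF-16 code unit.
    // Assumes the four characters after the escape prefix are hex digits.
    static String decode(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("\\u", i) && i + 6 <= s.length()) {
                out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                i += 6;
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // One escape per character: decodes to 3 characters.
        System.out.println(decode("\\u00c1\\u00c9\\u0170").length());
        // One escape per UTF-8 byte: decodes to 6 characters, not 3.
        System.out.println(decode("\\u00c3\\u0081\\u00c3\\u0089\\u00c5\\u00b0").length());
    }
}
```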


But I found a strange thing too:
If I disable the "\uXXXX" escaping in the JSON writer on the server side, everything works as expected. But this is not a good idea according to RFC4627 :((((.

I can't find where it says it's "not a good idea". It says all over the place that JSON "SHALL be encoded in Unicode", with a default to UTF-8, so why not just use UTF-8?
 
In this mode, the JSON string transports the non-printable characters (0xc3, 0x81, 0xc3, 0x89, 0xc5, 0xb0) ("ÁÉŰ" in UTF-8) without any encoding...

These are bytes, not characters.
The encoding is determined by the first 4 bytes of the response (see RFC4627)
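
For reference, the detection in RFC4627 section 3 looks at which of the first four bytes are zero, since a JSON text always starts with two ASCII characters. A minimal sketch in plain Java follows; the class and method names are hypothetical, and the fallback for inputs shorter than four bytes is an assumption of this sketch, not something the RFC specifies.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the null-byte patterns of RFC4627 section 3.
class JsonEncodingSniffSketch {

    static String detect(byte[] b) {
        if (b.length < 4) {
            return "UTF-8"; // assumption: too short to inspect, use the default
        }
        boolean z0 = b[0] == 0, z1 = b[1] == 0, z2 = b[2] == 0, z3 = b[3] == 0;
        if (z0 && z1 && z2 && !z3) return "UTF-32BE";  // 00 00 00 xx
        if (z0 && !z1 && z2 && !z3) return "UTF-16BE"; // 00 xx 00 xx
        if (!z0 && z1 && z2 && z3) return "UTF-32LE";  // xx 00 00 00
        if (!z0 && z1 && !z2 && z3) return "UTF-16LE"; // xx 00 xx 00
        return "UTF-8";                                // xx xx xx xx
    }

    public static void main(String[] args) {
        String json = "{ \"test\" : 1 }";
        System.out.println(detect(json.getBytes(StandardCharsets.UTF_8)));
        System.out.println(detect(json.getBytes(StandardCharsets.UTF_16LE)));
        System.out.println(detect(json.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

Note that the bytes 0xc3, 0x81, ... appear later in the response, inside the string value, so they play no part in this detection.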
