Ticket #1480 (closed defect: fixed)

Opened 7 years ago

Last modified 6 years ago

send charset with text/javascript JSON replies

Reported by: wmark
Owned by: anonymous
Priority: normal
Milestone: 1.5
Component: TurboGears
Version: 1.0.3
Severity: normal
Keywords: i18n, charset, json, expose, jsonify
Cc:

Description

Controllers returning JSON data do set the content-type to text/javascript but omit indicating a charset.

This breaks the display of a UTF-8 page that receives replies whose charset cannot be unambiguously determined by the browser reading the JSON reply.

Setting the content-type to, e.g., text/javascript; charset=utf-8 fixes this issue.

Change History

comment:1 Changed 7 years ago by aalbrecht

You can set the charset in your decorator:

@expose(format="json", content_type="text/javascript; charset=utf-8")

But I agree, it would be more consistent if the charset were set automatically.

comment:2 Changed 6 years ago by Chris Arndt

  • Keywords json, expose, jsonify added; JSON, expose removed

comment:3 Changed 6 years ago by Chris Arndt

  • Milestone set to 1.1

comment:4 Changed 6 years ago by chrisz

Can you give an example where the charset matters? I see TurboJson (i.e. simplejson) only creating pure ascii data with non-ascii chars escaped, like in 'k\u00e4se'.
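The escaping behaviour described here can be checked with a short sketch. It uses the standard library json module (which grew out of simplejson) instead of TurboJson itself, so it is an assumption that TurboJson's output matches; the "cheese" key is made up for illustration.

```python
import json

# With the default ensure_ascii=True, non-ASCII characters are written
# as \uXXXX escapes, so the serialized text is pure ASCII.
encoded = json.dumps({"cheese": "k\u00e4se"})
print(encoded)

# Encoding to ASCII succeeds because no raw non-ASCII characters remain.
ascii_bytes = encoded.encode("ascii")
```

Since the reply contains only ASCII bytes, any ASCII-compatible charset guessed by the browser decodes it to the same text.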

On another note, why do we use "text/javascript" as the default content type at all, and even "text/plain" for Opera browsers?

    def get_content_type(self, user_agent):
        if "Opera" in user_agent.browser:
            return "text/plain"
        else:
            return "text/javascript"

It seems "application/json" is the standard now and all modern browsers including Opera 9 should understand that. If we keep that distinction, we should at least check the version of the user_agent. But I don't think we should be considerate of old, broken Opera versions.

comment:5 Changed 6 years ago by Chris Arndt

Why do we use text/javascript? Maybe because one documented way to request a JSON response is to specify accept_format="text/javascript" in expose?

comment:6 Changed 6 years ago by chrisz

I think we should change expose() as well, so that "application/json" is recognized in addition to "text/javascript". Shall we make the switch from "text/javascript" to "application/json" in TG 1.0 or 1.1 then? I think we should fix it in 1.0 already.

Concerning the encoding problem: In fact TurboJson creates only ascii because it uses simplejson with default parameters, which means ensure_ascii=True. And currently, there is no way to use non-default simplejson parameters with TurboJson. Therefore, the charset is currently not a problem.

(As an aside, ensure_ascii=False is ignored for simple strings in simplejson 1.8.1, but this is a bug that should be fixed.)

Shall I enhance TurboJson so that it evaluates a turbojson.ensure_ascii config parameter? (Note: The JSONEncoder instance must then not be created immediately, because the config is only available later.) There are some other simplejson parameters (skipkeys, allow_nan etc.) that could be made configurable similarly.
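A minimal sketch of the deferred encoder creation suggested above, using the standard library json module in place of simplejson. The config dict, the get_encoder helper, and the turbojson.skipkeys and turbojson.allow_nan key names are hypothetical; only turbojson.ensure_ascii is the name proposed in this comment.

```python
import json

# Hypothetical stand-in for the framework configuration; in a real
# application it would be filled in when the config file is loaded.
config = {}

_encoder = None  # built lazily, not at import time


def get_encoder():
    """Return a shared JSONEncoder, building it on first use."""
    global _encoder
    if _encoder is None:
        _encoder = json.JSONEncoder(
            ensure_ascii=config.get("turbojson.ensure_ascii", True),
            skipkeys=config.get("turbojson.skipkeys", False),
            allow_nan=config.get("turbojson.allow_nan", True),
        )
    return _encoder


# The configuration only becomes available after import ...
config["turbojson.ensure_ascii"] = False
# ... and the encoder still picks it up, because it is built on first use.
result = get_encoder().encode("k\u00e4se")
```

Had the encoder been created at module level, it would have been built before the config value was set and would have kept the default ensure_ascii=True.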

If we return "application/json" as content type instead of "text/javascript", then I think we can or even must omit the charset="utf-8" since it is implicit. But we need to check that.

comment:7 Changed 6 years ago by Chris Arndt

1) Switching to "application/json" in TG 1.0: I guess you mean sending "application/json" as the content type, or do you only mean allowing accept_format="application/json" to be set in expose? The former has the potential to break existing client JavaScript code, so I'm not sure if we should change this in TG 1.0. The current content-type does not cause problems, so we should not make incompatible changes without necessity.

2) JSONEncoder: if we want to use ensure_ascii, we must also provide a way to set the input encoding for JSONEncoder, right? How would it convert strings to unicode otherwise?

comment:8 Changed 6 years ago by chrisz

1) Yes, I mean both. I don't think that it can break client code, because the header is interpreted by the browser and the client JavaScript code doesn't care. So it is more a matter of browsers, not of code, i.e. I consider it an adaptation of TG to modern browsers. I have already checked that all standard ajax widgets continue to work.

I think that the get_content_type mechanism with evaluation of the user agent should be improved to consider the Accept header instead of (or in addition to) the User-Agent header. Then we could simply return what the browser wants. If it accepts "application/json", we deliver that, otherwise (for old browsers) we deliver "text/javascript".
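The Accept-based selection could look roughly like the sketch below. The function name choose_json_content_type is hypothetical and this is not the TurboGears implementation; it also ignores q-values and wildcards, which a real negotiation would have to honour.

```python
def choose_json_content_type(accept_header):
    """Pick a content type for a JSON reply from the Accept header."""
    # Split "type/subtype;q=0.9, ..." into bare media types, dropping
    # parameters such as q-values to keep the sketch simple.
    accepted = [
        part.split(";")[0].strip().lower()
        for part in (accept_header or "").split(",")
    ]
    if "application/json" in accepted:
        return "application/json"
    # Fallback for clients that do not explicitly accept application/json.
    return "text/javascript"
```

For example, an Accept header of "application/json, text/javascript, */*; q=0.01" selects "application/json", while "text/html, */*" or a missing header falls back to "text/javascript".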

Furthermore, I just noticed that the get_content_type mechanism with evaluation of the user agent was broken anyway. There is a bug in this line - instead of getattr, get must be used here. So the case distinction for Opera never worked anyway.

2) The input encoding of JSONEncoder can be set with the encoding parameter (default is utf-8). We could make that configurable as well, but it has nothing to do with ensure_ascii. The output of JSONEncoder if ensure_ascii=False will be unicode, which will be delivered as utf-8 by default with cherrypy, which is fine for application/json (just tested this, works nicely).
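The ensure_ascii=False case can be illustrated with the standard library json module under Python 3. This is an assumption-laden sketch, not the TurboJson code path: in Python 3 input strings are already unicode, so the encoding parameter mentioned above does not apply, and the explicit encode call stands in for what cherrypy does when delivering the response.

```python
import json

# With ensure_ascii=False the non-ASCII character is kept as-is in the
# resulting (unicode) string instead of being escaped.
text = json.dumps({"cheese": "k\u00e4se"}, ensure_ascii=False)

# The response body is that string encoded as UTF-8; the a-umlaut
# becomes the two bytes 0xC3 0xA4.
body = text.encode("utf-8")
```

A client that decodes the body as UTF-8 gets the original data back, which is why the charset only starts to matter once ensure_ascii is turned off.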

comment:9 Changed 6 years ago by chrisz

  • Status changed from new to closed
  • Resolution set to fixed

This has been fixed in r4678, r4724 and r4726.