Ticket #2269 (closed defect: fixed)

Opened 10 years ago

Last modified 10 years ago

Unicode error when a TextField with UnicodeString validator gets a non-ascii value

Reported by: chrisz
Owned by:
Priority: high
Milestone: 2.0rc1
Component: TurboGears
Version: 2.0b7
Severity: major
Keywords: tw.forms formencode UnicodeDecodeError
Cc:

Description (last modified by chrisz)

I had originally posted this as a TG2.0b7 issue, but after some more analysis it now seems to be a pure tw.forms + formencode issue. The following code gives a UnicodeDecodeError:

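# -*- coding: utf-8 -*-
# Python 2 reproduction; the coding line is needed for the non-ASCII literal.
from formencode import validators
from tw import forms

field = forms.TextField('test', validator=validators.UnicodeString)

# Raises UnicodeDecodeError
print field.display(u'käse')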

Anyway, this makes using forms with TG2 a real problem for me.

Change History

comment:1 Changed 10 years ago by chrisz

  • Keywords formencode added
  • Priority changed from highest to high
  • Description modified
  • Severity changed from blocker to major
  • Summary changed from "Unicode error when a TableForm TextField gets a non-ascii value" to "Unicode error when a TextField with UnicodeString validator gets a non-ascii value"

comment:2 Changed 10 years ago by chrisz

  • Status changed from new to closed
  • Resolution set to fixed

When I brought this up on the TW mailing list, it turned out to be a known problem: you cannot use formencode.validators.UnicodeString with tw.forms. You have to use tw.forms.validators.UnicodeString instead, which is a slightly modified version of the same validator that avoids this error.

Unfortunately, this important detail was not mentioned anywhere in our TG2 documentation; I have now added it in r6499.

For the curious: The issue with the UnicodeString validator is that it assumes the "outside world" to use UTF-8 encoding, while the templating engines used by tw.forms expect unencoded Unicode objects. That's why tw.forms provides a modified UnicodeString validator that you must use to avoid this error.
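
To make the mechanism concrete, here is a minimal sketch in Python 3 using only the standard library. It is not tw.forms or FormEncode code. It assumes the validator hands the template UTF-8 bytes, and the explicit ASCII decode stands in for the implicit conversion that Python 2 performs when a byte string is combined with a unicode object.

```python
# Sketch only: mimics the encoding mismatch, not the real library code.
value = u'käse'

# What a validator targeting a UTF-8 "outside world" would emit.
encoded = value.encode('utf-8')

# Python 2 implicitly decodes byte strings as ASCII when they are mixed
# with unicode objects; the explicit decode below stands in for that step.
try:
    encoded.decode('ascii')
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

# Passing the unencoded value through avoids the conversion entirely.
rendered = u'<input value="%s" />' % value
```

The failing step is the ASCII decode of the UTF-8 bytes for "ä"; the unencoded value never goes through it.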

Btw, I have also improved the UnicodeString validator of FormEncode so that it interprets an outputEncoding of None as "do not encode". So tw.forms.validators.UnicodeString() is now actually the same as formencode.validators.UnicodeString(outputEncoding=None). (This feature will be available in FormEncode > 1.2.2.)
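
The "None means do not encode" behaviour can be sketched roughly as follows. This is a hypothetical stand-in written for Python 3, not FormEncode's implementation: the class name UnicodeStringSketch and the parameter output_encoding are made up for illustration and only mirror the outputEncoding option described above.

```python
class UnicodeStringSketch(object):
    """Hypothetical stand-in for a validator with an output encoding option."""

    def __init__(self, output_encoding='utf-8'):
        # None is taken to mean "hand back unencoded text".
        self.output_encoding = output_encoding

    def from_python(self, value):
        # Normalise the value to text first.
        text = value if isinstance(value, str) else str(value)
        if self.output_encoding is None:
            return text
        return text.encode(self.output_encoding)


as_bytes = UnicodeStringSketch().from_python(u'käse')
as_text = UnicodeStringSketch(output_encoding=None).from_python(u'käse')
```

With the default, the result is a UTF-8 byte string; with output_encoding=None the text comes back unchanged, which is what the templating engines expect.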
