Ticket #1695 (closed defect: fixed)

Opened 1 year ago

Last modified 10 months ago

i18n - translations for text with leading/trailing whitespace not found

Reported by: amit Assigned to: Chris Arndt
Priority: normal Milestone: 1.0.x bugfix
Component: TurboGears Version: 1.0.4b6
Severity: critical Keywords:
Cc: diegobz

Description

I have noticed that TG 1.0.4b6 collects strings with their leading or trailing whitespace intact, for example:

<span>Welcome to <a href="http://mysite.com">My Site</a>.</span>

We get:

msgid "Welcome to "

instead of:

msgid "Welcome to"

And TG fails to find the translation for "Welcome to ". If I manually remove the trailing whitespace from the msgid, it works fine...
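A minimal sketch of why the lookup misses, assuming the catalog is matched on the exact msgid and that the text is looked up at render time without its trailing space. The dictionary, the `translate` helper and the German strings are made up for illustration; this is not TurboGears code.

```python
# Hypothetical catalog entry as collected from the template:
# the msgid kept the trailing space.
catalog = {"Welcome to ": "Willkommen bei "}


def translate(text, catalog):
    """Return the translation for text, or text itself if no msgid matches exactly."""
    return catalog.get(text, text)


# Assumed runtime lookup without the trailing space: no match, so the
# original text comes back untranslated.
untranslated = translate("Welcome to", catalog)

# After manually removing the trailing whitespace from the msgid,
# the same lookup succeeds.
fixed_catalog = {"Welcome to": "Willkommen bei"}
translated = translate("Welcome to", fixed_catalog)
```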

Attachments

admi18n.patch (0.9 kB) - added by amit on 01/18/08 01:42:01.
I don't know whether this is the correct way to fix the problem, but it works for me…
i18n-kid-strip-msg.diff (1.3 kB) - added by Chris Arndt on 08/23/08 07:09:46.

Change History

01/18/08 00:55:25 changed by amit

Another strange behavior:

<td>
    Total:
</td>

After running tg-admin i18n collect, I get:

#: templates/test.kid:td
msgid ""
"\n"
"    Total:"
msgstr ""

instead of:

#: templates/test.kid:td
msgid "Total:"
msgstr ""

That is, all leading and trailing whitespace needs to be stripped out from the gathered strings.
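Roughly what I mean, as a sketch only: `clean_msgid` is a hypothetical helper name, not a function in admi18n, and it relies on nothing but Python's built-in `str.strip()`.

```python
def clean_msgid(text):
    """Strip leading and trailing whitespace, including newlines and indentation."""
    return text.strip()


# Text content of the <td> element from the template above.
td_text = "\n    Total:\n"
span_text = "Welcome to "

cleaned_td = clean_msgid(td_text)      # "Total:"
cleaned_span = clean_msgid(span_text)  # "Welcome to"
```

Whitespace inside the message is left alone; only the two ends are trimmed.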

01/18/08 00:56:15 changed by amit

  • summary changed from i18n - can't translate text with trailing whitespace (v1.0.4b6) to i18n - can't translate text with leading/trailing whitespace (v1.0.4b6).

01/18/08 01:42:01 changed by amit

  • attachment admi18n.patch added.

I don't know whether this is the correct way to fix the problem, but it works for me...

01/21/08 08:20:51 changed by Chris Arndt

  • version changed from 1.0.4b3 to 1.0.4b6.

01/21/08 08:42:58 changed by Chris Arndt

  • summary changed from i18n - can't translate text with leading/trailing whitespace (v1.0.4b6) to i18n - translations for text with leading/trailing whitespace not found.
  • milestone changed from 1.0.4 to 1.0.x bugfix.

04/02/08 06:16:02 changed by diegobz

  • cc set to diegobz.

04/02/08 08:00:05 changed by diegobz

Unfortunately, this patch didn't work here.

Any other ideas?

My current TG version is: TurboGears-1.0.4.3-2.fc8

04/02/08 09:58:03 changed by amit

It works; I have tested it with http://svn.turbogears.org/branches/1.0 (rev 4328).

04/02/08 13:51:12 changed by diegobz

I have added "s = s.strip()" to both files and compiled them with:

[root@localhost admi18n]# pwd
/usr/lib/python2.5/site-packages/turbogears/toolbox/admi18n
[root@localhost admi18n]# python -O -m compileall ./
Listing ./ ...
Compiling ./catalog.py ...
Compiling ./pygettext.py ...

After that, I tried to run "tg-admin i18n collect" in my project directory, but I got the following error:

[diego@localhost transifex]$ tg-admin i18n collect
Use po/ as a locale directory
Use transifex as a message domain
Scanning source directory transifex
Working on transifex/people.py
Working on transifex/model.py
Working on transifex/release.py
Working on transifex/admin.py
Working on transifex/repo.py
Working on transifex/util.py
Working on transifex/module.py
Working on transifex/__init__.py
Working on transifex/controllers.py
Working on transifex/config/__init__.py
Working on transifex/templates/__init__.py
Working on transifex/tests/test_controllers.py
Working on transifex/tests/test_model.py
Working on transifex/tests/__init__.py
Traceback (most recent call last):
  File "/usr/bin/tg-admin", line 8, in <module>
    load_entry_point('TurboGears==1.0.4.3', 'console_scripts', 'tg-admin')()
  File "/usr/lib/python2.5/site-packages/turbogears/command/base.py", line 371, in main
    command.run()
  File "/usr/lib/python2.5/site-packages/turbogears/command/i18n.py", line 142, in run
    self.scan_source_files()
  File "/usr/lib/python2.5/site-packages/turbogears/command/i18n.py", line 302, in scan_source_files
    pygettext.main()
  File "/usr/lib/python2.5/site-packages/turbogears/toolbox/admi18n/pygettext.py", line 751, in main
    eater.write(fp)
  File "/usr/lib/python2.5/site-packages/turbogears/toolbox/admi18n/pygettext.py", line 593, in write
    print >> fp, 'msgid', normalize(k, options.escape)
  File "/usr/lib/python2.5/site-packages/turbogears/toolbox/admi18n/pygettext.py", line 269, in normalize
    lines = s.split('\n')
AttributeError: 'list' object has no attribute 'split'

Any ideas?

04/04/08 15:39:36 changed by diegobz

Never mind. My mistake. It just worked.

However, after applying the patch, I run into a problem in a situation like this:

in info.kid

<li>Click <a>here</a>.</li>

in other_file.kid

<li>Hey, click <a>here</a>.</li>

With a dot (or any other single character, such as ')' or ':') between the last two tags, I get duplicate msgid entries like:

msgid "."

in my POT file.

08/22/08 09:38:11 changed by Chris Arndt

  • owner changed from anonymous to Chris Arndt.
  • status changed from new to assigned.

08/23/08 07:09:02 changed by Chris Arndt

Unfortunately, stripping the message string in the normalize function is too late. The unstripped message has already been used to determine the uniqueness of the message, so messages that differ only in leading or trailing whitespace end up duplicated in the .pot file, as reported by diegobz.

My attached patch fixes the problem for kid files. I'll look into Python source files later. Does the problem even exist there?
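The ordering issue can be sketched as follows. This is an illustration under assumptions, not the code from the patch: `unique`, `msgids_strip_late` and `msgids_strip_early` are hypothetical names, and the sample texts are invented variants of the "." case reported above.

```python
def unique(items):
    """Return items in first-seen order with exact duplicates removed."""
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result


def msgids_strip_late(texts):
    """Check uniqueness on the raw text and strip only when writing (too late)."""
    return [t.strip() for t in unique(texts)]


def msgids_strip_early(texts):
    """Strip first, then check uniqueness on the stripped text."""
    return unique(t.strip() for t in texts)


# Hypothetical texts collected from different templates: the same message
# with different surrounding whitespace.
collected = [".", ". ", ".\n"]

late = msgids_strip_late(collected)    # the same msgid written three times
early = msgids_strip_early(collected)  # a single msgid
```

Stripping before the uniqueness check is what keeps the .pot file free of repeated msgid entries.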

08/23/08 07:09:46 changed by Chris Arndt

  • attachment i18n-kid-strip-msg.diff added.

08/25/08 00:26:56 changed by amit

Thanks Chris,

It has fixed the problem for kid, but the problem still exists for Python code. You are right, the patch I have attached is not good (duplicate message ids in pot).

09/07/08 17:18:13 changed by Chris Arndt

In what way does this bug affect translatable strings in Python source code files? If you have a translatable string in Python source code, you wrap it in the _ function. There is no need to surround the string with whitespace, nor should any whitespace be stripped off, IMHO. So if you do this:

foo = _(u' *** Eat more spamm! *** ')

... the message key will (and should) be exactly " *** Eat more spamm! *** ", e.g.:

#: mypkg/controllers.py:nn
msgid " *** Eat more spamm! *** "
msgstr " *** Esst mehr Spamm! *** "

This will be translated normally by gettext.
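A sketch of that behaviour using the standard library `gettext` module on Python 3. The `build_mo` helper is hypothetical: it writes a bare-bones in-memory .mo catalog (ASCII only, no metadata header) purely so the example is self-contained. In a real project the catalog would be compiled from the .po file instead.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize {msgid: msgstr} into the binary GNU .mo layout (ASCII only)."""
    ids = sorted(messages)  # the .mo format stores msgids in sorted order
    keys = [k.encode("ascii") for k in ids]
    vals = [messages[k].encode("ascii") for k in ids]
    count = len(keys)
    key_table = 7 * 4                    # header is seven 32-bit integers
    val_table = key_table + 8 * count    # each table entry is (length, offset)
    data_start = val_table + 8 * count
    index = []
    blob = b""
    for chunk in keys + vals:
        index.append((len(chunk), data_start + len(blob)))
        blob += chunk + b"\0"            # strings are NUL-terminated
    out = struct.pack("<7I", 0x950412DE, 0, count, key_table, val_table, 0, 0)
    for length, offset in index:
        out += struct.pack("<2I", length, offset)
    return out + blob


mo_bytes = build_mo({" *** Eat more spamm! *** ": " *** Esst mehr Spamm! *** "})
_ = gettext.GNUTranslations(io.BytesIO(mo_bytes)).gettext

# The key includes the surrounding whitespace, so the exact literal matches.
exact = _(" *** Eat more spamm! *** ")

# A stripped variant is a different key and falls back to the input string.
stripped = _("*** Eat more spamm! ***")
```

Since the string passed to _ at runtime is the same literal that was extracted, the whitespace does no harm here.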

09/07/08 17:19:32 changed by Chris Arndt

  • status changed from assigned to closed.
  • resolution set to fixed.

Fixed in r5367 by applying my attached patch to the 1.0, 1.1 and 1.5 branches.