Opened 6 years ago

Closed 3 months ago

#104 closed change (duplicate)

Update locale validation

Reported by: trev Assignee:
Priority: Unknown Milestone:
Module: User-Interface Keywords:
Cc: sebastian, wspee, erick, tlucas, erikvold, greiner Blocked By: #7127
Blocking: Platform: Unknown
Ready: no Confidential: no
Tester: Unknown Verified working: no
Review URL(s):

https://codereview.adblockplus.org/29587910/
https://codereview.adblockplus.org/29690867/

Description (last modified by sebastian)

Background

Currently locale validation is being performed by test_locales.pl in the adblockplus repository in combination with LocaleTester.pm in buildtools repository. It only works for our legacy Firefox extension.

What to change

There is little point updating this script, we rather need something in Python which would be integrated with the build tools:

  • We need to expand the JSON format for translations to include optional translationOptional (boolean) and maxLength (integer) keys.
  • Our custom translation keys should be removed when building or uploading to Crowdin. The maxLength field should be converted to "Note: this cannot be longer than N characters" description when uploading to Crowdin.
  • The new build.py checktranslations command should load the locales and perform the following checks:
    • Make sure the JSON data is UTF-8-encoded and parses correctly
    • Validate the structure of the JSON data.
    • Make sure that all declared placeholders are present in all translations.
    • Warn on strings where the source and target translation are identical, except for strings marked with translationOptional key or en-* locales.
    • Warn for overlong translations if maxLength is set for a translation.

Attachments (1)

test_vars.py (730 bytes) - added by trev 6 years ago.
Chrome placeholder checking script

Download all attachments as: .zip

Change History (22)

comment:1 Changed 6 years ago by trev

  • Reporter changed from philll to trev

comment:2 Changed 6 years ago by trev

  • Component changed from Unknown to Build-and-Release-Tools
  • Priority changed from Unknown to P3

Changed 6 years ago by trev

Chrome placeholder checking script

comment:3 Changed 6 years ago by trev

  • Ready unset

I wrote a simple script meant to be run from the root of the adblockpluschrome repository to check placeholder usage, attached it here. This is merely a stopgap solution until we have a better solution in place.

comment:4 Changed 5 years ago by philll

  • Platform set to Firefox

comment:5 Changed 5 years ago by philll

  • Platform changed from Firefox to Unknown

comment:6 Changed 5 years ago by trev

  • Description modified (diff)

comment:7 Changed 5 years ago by sebastian

  • Cc sebastian added
  • Ready set

comment:8 Changed 2 years ago by sebastian

  • Cc wspee erick added
  • Ready unset
  • Tester set to Unknown

I'd like to revive this issue, since Erick is interested in picking it up. However, the issue description seems out-of-date. Specifically the Gecko-specific features/syntax (including access keys) aren't relevant anymore, as legacy Gecko extensions are phased out.

Also, since we only need to target the Chrome-style translation format, I'd rather parse any metadata for the validation from the description field in the source translation's locales/en_US/*.json file. That way we avoid redundancies (for example max length is usually already documented there) and have all translation data in one place.

Then again, max length is already enforced by CrowdIn, and flagging strings which may be equal to the source translation might be tricky since some (in particular short) translations might match the English string in some but not all translations. But what we'd still need at least, is validation of placeholders.

Last edited 2 years ago by sebastian (previous) (diff)

comment:9 Changed 2 years ago by trev

  • Description modified (diff)

comment:10 Changed 2 years ago by trev

  • Description modified (diff)

comment:11 Changed 2 years ago by tlucas

  • Cc tlucas added

I propose a separation of this issue, in order to fully embrace the scope of an apprentice's project (which is the reason why Erick wants to tackle this).

This issue should be stripped down to validating the current format, which could be defined as following (and be fixed as the project):

  • Implement a new "build.py checktranslations" functionality, which handles the following (as the apprentice's project):
    • Make sure the JSON data is UTF-8-encoded and parses correctly
    • Validate the structure of the JSON data.
    • Make sure that all declared placeholders are present and formatted correctly in all translations.
    • Warn on untranslated strings. While this check is prone to false positives, it is also a warning sign allowing to detect bad translations.

Additionally, a new issue could be filed, handling the rest of this issue:

  • Expand the JSON format for translations to include optional translationOptional (boolean) and maxLength (integer) keys.
  • Our custom translation keys should be removed when building or uploading to Crowdin. The maxLength field should be converted to "Note: this cannot be longer than N characters" description when uploading to Crowdin.
  • Additionally, adjust the in #104 introduced functionality to:
    • Warn for overlong translations if maxLength is set for a translation.
    • Ignore untranslated strings for strings marked with translationOptional key or en-* locales.

comment:12 Changed 2 years ago by sebastian

I don't see why we would want to warn about untranslated strings. Incomplete translations are quite normal, in particular for less common languages and rarely seen strings, we don't care much about missing translations. Also CrowdIn already gives a good overview of the translation status, anyway. Perhaps without this feature, it would also better fit the scope of an apprentice project.

comment:13 Changed 2 years ago by trev

By "untranslated" I mean translations that are identical to English, not missing translations.

comment:14 Changed 2 years ago by sebastian

  • Description modified (diff)
  • Ready set

comment:15 Changed 22 months ago by tlucas

  • Review URL(s) modified (diff)

comment:16 Changed 22 months ago by tlucas

Half of this was done by Erick in his project. We need need to figure out, what is left to be done.

comment:17 Changed 12 months ago by erikvold

  • Cc erikvold added

comment:18 Changed 12 months ago by erikvold

In Issue #7127 we are trying to replace the tools provided in the buildtools repo with npm scripts, so I think this ought to be blocked by that issue.

comment:19 Changed 12 months ago by greiner

  • Blocked By 7127 added
  • Cc greiner added
  • Component changed from Automation to User-Interface
  • Priority changed from P3 to Unknown
  • Ready unset

Agreed. Also changing its module to User-Interface as that's where the new script will be located for now.

comment:20 Changed 3 months ago by greiner

Closing this ticket in favor of ui#569.

comment:21 Changed 3 months ago by greiner

  • Resolution set to duplicate
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.