Friday, August 15, 2008

apertium-cy-en in the wild

We released the first version of our Welsh to English translator recently (announcement).

First thing... I get a little thrill every time I see something I've written translated into another language, so I just want to point out the part of the announcement I wrote:
This package constitutes a number of 'firsts' for the Apertium platform: it is the first package targeting a Celtic language (moreover, this is the first release of linguistic data for Apertium that does not include a Romance language); it is also the first 'community' developed package: i.e., developed in a 'bazaar'-like fashion using volunteer contributions - hopefully, the first of many!
And the Welsh translation:
Hwn yw'r pecyn cyntaf sy'n targedu iaith Celtaidd (hefyd hwn yw'r rhyddhad cyntaf o ddata ieithyddol sydd ddim yn cynnwys iaith rhamant); mae e hefyd y pecyn cyntaf wedi ei ddatblygu gan 'cymuned': h.y. wedi ei ddatblygu mewn ffordd 'basar'-aidd gan ddefnyddio cyfraniadau gan gwirfoddolwyr - gan obeithio mai hwn yw'r cyntaf o nifer!
Kevin, who contributed almost all of the data in the bilingual dictionary and the Welsh morphological analyser, has a dedicated Welsh to English site set up: www.kymraeg.org.uk

Francis, who did most of the implementation work on the rules, pointed out this article - specifically, this section:
"There aren't any mutations in Catalan or Spanish, so there needed to add software that dealt with that. A man from Ireland had already developed similar software for Irish and Scottish Gaelic."
I think the 'man from Ireland' is me, but that's not quite accurate: I did have an idea of my own, but the approach we ended up using was Francis's, and it's also what we now use in the ga-gd translator.

While I'm quoting things, Kevin's appeal for assistance is worth repeating:
You can also help by asking your Assembly Member to help ensure that resources that receive public funds are made available under a free licence. The sad thing is that the development of these translators could proceed much more quickly if we didn't have to create freely-distributable word lists for them from scratch. Public money has gone into compiling Welsh dictionaries and lists of terms, and yet apertium-cy and similar projects cannot use these because they are not available under terms which allow them to be freely redistributed. Every minute we spend adding words is a minute we can't spend writing software. It would be a tremendous help to our work if the Welsh Language Board could look again at this issue - after all, if developers in Spain and the USA are prepared to spend time on Welsh, we should surely give them all the help we can!
It is a great pity that more language boards aren't interested in providing even basic assistance to open source projects like ours; unfortunately, that lack of interest is the rule rather than the exception.

(Oh, and let me wave a flag and say 'developers in Spain, the USA and Ireland' :)
