Saturday, August 23, 2008

A warning...

I was warned by nicknamed:
“tęsknię po tobie” would sound pretty awkward to a girl… but I presume you know that? it’s “tęsknię za tobą”
Yeah... I already knew that - practical knowledge, so to speak.

Here's a list of some other things that occur in this edition:
  • 'w ów czas' is now one word: 'wówczas'
  • 'wtém' is 'w tym'
  • the letter 'é' no longer exists in Polish: 'daléj' is 'dalej', etc.
  • 'iak' and 'iuż' each appear ('jak' and 'już' in modern Polish). These were written this way in very old Polish, but in this edition they are normally spelled the modern way, so these are probably just printer's errors.
  • 'nie' is printed joined to most verb forms, where modern Polish writes it separately: 'niewiedział' is now 'nie wiedział'
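
For anyone who wants to compare this edition against a modern text, the rules above can be sketched as a small Python function. This is only a rough sketch built from the list above: the name `modernise` and the short verb list are made up for illustration, and a blanket 'é' to 'e' swap is a simplification (some endings, e.g. 'rozmaitém', are 'rozmaitym' in modern Polish, which this does not handle).

```python
import re

# Whole-word replacements taken from the list above. They are applied before
# the generic 'é' rule so that 'wtém' becomes 'w tym' rather than 'wtem'.
WORD_RULES = {
    "w ów czas": "wówczas",
    "wtém": "w tym",
    "iak": "jak",
    "iuż": "już",
}

# Hypothetical sample of verb forms that this edition prints joined to 'nie'.
# A real tool would need a proper verb lexicon; this set is illustration only.
JOINED_NIE_VERBS = {"wiedział", "widział", "może", "było"}


def _match_case(source: str, replacement: str) -> str:
    """Capitalise the replacement if the matched text started with a capital."""
    if source[0].isupper():
        return replacement[0].upper() + replacement[1:]
    return replacement


def _split_nie(match: re.Match) -> str:
    """Insert a space after 'nie' only when the remainder is a listed verb."""
    rest = match.group(1)
    if rest.lower() in JOINED_NIE_VERBS:
        return match.group(0)[:3] + " " + rest
    return match.group(0)


def modernise(text: str) -> str:
    """Apply the spelling rules listed above to a piece of text."""
    for old, new in WORD_RULES.items():
        pattern = re.compile(r"\b" + re.escape(old) + r"\b", re.IGNORECASE)
        text = pattern.sub(lambda m, new=new: _match_case(m.group(0), new), text)
    text = re.sub(r"\bnie(\w+)\b", _split_nie, text, flags=re.IGNORECASE)
    # Simplification: every 'é' becomes a plain 'e'.
    return text.replace("é", "e").replace("É", "E")


print(modernise("Podróżny zląkł się, spójrzał, lecz już jéj niebyło,"))
# prints: Podróżny zląkł się, spójrzał, lecz już jej nie było,
```

Words that merely start with 'nie', such as 'niewielkim', are left alone because they are not in the verb list.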

Pan Tadeusz, Book 1 #1

LITHUANIA, my country, thou art like health; how much
thou shouldst be prized only he can learn who has
lost thee. To-day thy beauty in all its splendour I see
and describe, for I yearn for thee.

Litwo! Ojczyzno moja! ty jesteś jak zdrowie;
Ile cię trzeba cenić, ten tylko się dowie
Kto cię stracił. Dziś piękność twą w całéj ozdobie
Widzę i opisuję, bo tęsknię po tobie.

Holy Virgin, who protectest bright Czenstochowa
and shinest above the Ostra Gate in Wilno! Thou
who dost shelter the castle of Nowogrodek with its
faithful folk! As by miracle thou didst restore me to
health in my childhood--when, offered by my weeping
mother to thy protection, I raised my dead eyelids, and
could straightway walk to the threshold of thy shrine
to thank God for the life returned me--so by miracle
thou wilt return us to the bosom of our country. Meanwhile
bear my grief-stricken soul to those wooded hills,
to those green meadows stretched far and wide along
the blue Niemen; to those fields painted with various
grain, gilded with wheat, silvered with rye; where
grows the amber mustard, the buckwheat white as
snow, where the clover glows with a maiden's blush,
where all is girdled as with a ribbon by a strip of green
turf on which here and there rest quiet pear-trees.

Panno Święta, co jasnej bronisz Częstochowy
I w Ostréj świecisz Bramie! Ty, co gród zamkowy
Nowogródzki ochraniasz z jego wiernym ludem!
Jak mnie dziecko do zdrowia powróciłaś cudem,
(Gdy od płaczącéj matki pod Twoją opiekę
Ofiarowany, martwą podniosłem powiekę;
I zaraz mogłem pieszo do, Twych świątyń progu
Iść za wrócone życie podziękować Bogu;)
Tak nas powrócisz cudem na Ojczyzny łono.
Tymczasem przenoś moję duszę utęsknioną
Do tych pagórków leśnych, do tych łąk zielonych,
Szeroko nad błękitnym Niemnem rozciągnionych;
Do tych pól malowanych zbożem rozmaitém,
Wyzłacanych pszenicą, posrebrzanych żytem;
Gdzie bursztynowy świerzop, gryka jak śnieg biała,
Gdzie panieńskim rumieńcem dzięcielina pała,
A wszystko przepasane, jakby wstęgą, miedzą
Zieloną, na niéj z rzadka ciche grusze siedzą.

Amid such fields years ago, by the border of a brook,
on a low hill, in a grove of birches, stood a gentleman's
mansion, of wood, but with a stone foundation; the
white walls shone from afar, the whiter since they were
relieved against the dark green of the poplars that
sheltered it against the winds of autumn. The dwelling-house
was not large, but it was spotlessly neat, and it
had a mighty barn, and near it were three stacks of hay
that could not be contained beneath the roof; one
could see that the neighbourhood was rich and fertile.
And one could see from the number of sheaves that up
and down the meadows shone thick as stars--one could
see from the number of ploughs turning up early
the immense tracts of black fallow land that evidently
belonged to the mansion, and were tilled well
like garden beds, that in that house dwelt plenty and
order. The gate wide-open proclaimed to passers-by
that it was hospitable, and invited all to enter as guests.

Śród takich pól przed laty, nad brzegiem ruczaju,
Na pagórku niewielkim, we brzozowym gaju,
Stał dwór szlachecki, z drzewa, lecz podmurowany;
Świeciły się zdaleka pobielane ściany,
Tym bielsze że odbite od ciemnéj zieleni
Topoli, co go bronią od wiatrów jesieni.
Dóm mieszkalny niewielki, lecz zewsząd chędogi,
I stodołę miał wielką i przy niéj trzy stogi
Użątku, co pod strzechą zmieścić się niemoże;
Widać że okolica obfita we zboże,
I widać z liczby kopic, co wzdłuż i wszerz smugów
Świecą gęsto jak gwiazdy; widać z liczby pługów
Orzących wcześnie łany ogromne ugoru
Czarnoziemne, zapewne należne do dworu,
Uprawne dobrze nakształt ogrodowych grządek:
Że w tym domu dostatek mieszka i porządek.
Brama na wciąż otwarta przechodniom ogłasza,
Że gościnna i wszystkich w gościnę zaprasza.

A young gentleman had just entered in a two-horse
carriage, and, after making a turn about the yard, he
stopped before the porch and descended; his horses,
left to themselves, slowly moved towards the gate,
nibbling the grass. The mansion was deserted, for the
porch doors were barred and the bar fastened with a pin.
The traveller did not run to make inquiries at the farmhouse
but opened the door and ran into the mansion,
for he was eager to greet it. It was long since he had
seen the house, for he had been studying in a distant
city and had at last finished his course. He ran in and
gazed with eager emotion upon the ancient walls, his
old friends. He sees the same furniture, the same
hangings with which he had loved to amuse himself
from babyhood, but they seemed less beautiful and not
so large as of old. And the same portraits hung upon
the walls. Here Kosciuszko, in his Cracow coat, with
his eyes raised to heaven, held his two-handed sword;
such was he when on the steps of the altar he swore
that with this sword he would drive the three powers
from Poland or himself would fall upon it. Farther on
sat Rejtan, in Polish costume, mourning the loss of
liberty; in his hands he held a knife with the point
turned against his breast, and before him lay Phaedo
and The Life of Cato. Still farther on Jasinski, a fair
and melancholy youth, and his faithful comrade Korsak
stand side by side on the entrenchments of Praga, on
heaps of Muscovites, hewing down the enemies of
their country--but around them Praga is already
burning.

Właśnie dwókonną bryką wjechał młody panek
I obiegłszy dziedziniec zawrócił przed ganek,
Wysiadł s powozu; konie porzucone same,
Szczypiąc trawę ciągnęły powoli pod bramę.
We dworze pusto: bo drzwi od ganku zamknięto
Zaszczepkami i kołkiem zaszczepki przetknięto.
Podróżny do folwarku nie biegł sług zapytać,
Odemknął, wbiegł do domu, pragnął go powitać,
Dawno domu niewidział; bo w dalekiém mieście
Kończył nauki, końca doczekał nareszcie.
Wbiega i okiem chciwie ściany starodawne
Ogląda czule, jako swe znajome dawne.
Też same widzi sprzęty, też same obicia,
S któremi się zabawiać lubił od powicia;
Lecz mniéj wielkie, mniéj piękne, niż się dawniéj zdały.
I też same portrety na ścianach wisiały.
Tu Kościuszko w czamarce krakowskiéj, z oczyma
Podniesionymi w niebo, miecz oburącz trzyma;
Takim był, gdy przysięgał na stopniach ołtarzów,
Że tym mieczem wypędzi s Polski trzech mocarzów,
Albo sam na nim padnie. Daléj w polskiéj szacie
Siedzi Rejtan żałośny po wolności stracie,
W ręku trzyma nóż, ostrzem zwrócony do łona,
A przed nim leży Fedon i żywot Katona.
Daléj Jasiński, młodzian piękny i posępny;
Obok Korsak towarzysz jego nieodstępny
Stoją na szańcach Pragi, na stosach moskali
Siekąc wrogów a Praga już się w koło pali.


Friday, August 22, 2008

Pan Tadeusz

Aside from manuals and grammars, I haven't really read anything so far this year. So, it's about time I changed that. My friends keep asking me what I'm doing to improve my Polish, so I've decided to tackle the big one: to read Pan Tadeusz in Polish.

My Polish isn't really up to that, so essentially, I have to read it first in English, then in Polish, and see what sticks.

Despite the fact that Pan Tadeusz is a poem, the English translation is quite readable, and relatively true to the original:
Przypadkiem oczy podniósł, i tuż na parkanie
Stała młoda dziewczyna--białe jéj ubranie
Wysmukłą postać tylko aż do piersi kryje,
Odsłaniając ramiona i łabędzią szyję.

By chance he raised his eyes, and there on the wall
stood a young girl--her white garment
hid her slender form only to the breast,
leaving bare her shoulders and her swan's neck.

The Polish text is not really representative of modern Polish: 'jéj' in the text above is now written 'jej', and the preposition 'z', although pronounced 's' where appropriate, is never written 's' in modern Polish. But these are small differences, and the English translation I'm working with contains similar archaisms, such as 'to-day' instead of 'today'.

Anyway, in natural language processing (NLP), complaints about the lack of parallel text are frequently heard. I think it's more productive to try to do something about it, so I'll be posting Book 1 of Pan Tadeusz in both English and Polish. It'll probably be more suitable for language learners than for NLP, but it's worth a try.
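
As a rough idea of what 'parallel text' could look like in practice, here is a minimal Python sketch that turns passages laid out like the selections in this post (a Polish block, a blank line, then its English block) into tab-separated pairs. The function name `pair_blocks` and the layout assumption are hypothetical; real alignment tools work at sentence level and are considerably more involved.

```python
import re


def pair_blocks(text: str) -> list[tuple[str, str]]:
    """Pair each Polish block with the English block that follows it.

    Assumes blocks are separated by blank lines and strictly alternate
    Polish, English, Polish, English, ...
    """
    # Collapse the line breaks inside each block so one block is one segment.
    blocks = [" ".join(block.split())
              for block in re.split(r"\n\s*\n", text.strip())]
    if len(blocks) % 2 != 0:
        raise ValueError("expected alternating Polish/English blocks")
    return list(zip(blocks[0::2], blocks[1::2]))


sample = """\
Lecz młodzież o piękności metrykę nie pyta,
Bo młodzieńcowi młodą jest każda kobiéta,

But youth never asks beauty for its baptismal certificate;
to a young man every woman is young
"""

# One pair per output line, Polish and English separated by a tab.
for polish, english in pair_blocks(sample):
    print(f"{polish}\t{english}")
```

Pairing at block level sidesteps the fact that the English prose breaks its lines at different points from the Polish verse.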

Until I'm finished, here are a few selections:
Rumienił się, serce mu biło nadzwyczajnie;
Więc rozwiązane widział swych domysłów tajnie!
Więc było przeznaczono, by przy jego boku
Usiadła owa piękność widziana w pomroku.

He blushed, and his heart beat faster than its wont.
So he now beheld the solution of the mystery upon which he had pondered.
So it had been ordained that by his side
should sit that beauty whom he had seen in the twilight

Lecz młodzież o piękności metrykę nie pyta,
Bo młodzieńcowi młodą jest każda kobiéta,

But youth never asks beauty for its baptismal certificate;
to a young man every woman is young

I w twarz spójrzała, z której wytryskał rumieniec,
Ilekroć z jej oczyma spotkał się młodzieniec:

and she looked into his face, on which a blush rose
as often as the young man met her eyes.

Podróżny zląkł się, spójrzał, lecz już jéj niebyło,
Wyszedł zmieszany i czuł że serce mu biło
Głośno, i sam niewiedział czy go miało śmieszyć
To dziwaczne spotkanie, czy wstydzić, czy cieszyć.


the traveller looked up in alarm, but she was there no longer;
he departed in confusion and felt the loud beating of his
heart; he knew not whether this strange meeting should
cause him amusement or shame or joy.

Friday, August 15, 2008

Apertium in 'The Guardian'

There was an article about our Welsh translator in yesterday's Guardian: http://www.guardian.co.uk/technology/2008/aug/14/freeourdata.opensource

"Flummoxed by a document in Welsh? Now you can get a free translation at cymraeg.org.uk. The Apertium-cy software, described as the first free automatic translator from Welsh to English, is the fruit of a multilingual effort involving developers in Spain, Wales and Ireland pushing forward the possibilities of open-source software and, they hope, free public-sector data."

The focus of the article is on how we weren't able to use public data compiled by the Welsh Language Board:

'When we contacted the Welsh Language Board, however, it said the Apertium team couldn't be more wrong. "We welcome re-use," it said. Although the small print forbids unauthorised reproduction, the board says it would be delighted to consider requests. Where feasible, it will make products available under what it says would be "a suitable free non-commercial agreement".'

Well, if they had ever returned any of my phone calls, maybe we could have used their data. Maybe they'll give me an answer now :)

apertium-cy-en in the wild

We released the first version of our Welsh to English translator recently (announcement).

First thing... I get a little thrill every time I see something I've written translated into another language, so I just want to point out the part of the announcement I wrote:
This package constitutes a number of 'firsts' for the Apertium platform: it is the first package targeting a Celtic language (moreover, this is the first release of linguistic data for Apertium that does not include a Romance language); it is also the first 'community' developed package: i.e., developed in a 'bazaar'-like fashion using volunteer contributions - hopefully, the first of many!
And the Welsh translation:
Hwn yw'r pecyn cyntaf sy'n targedu iaith Celtaidd (hefyd hwn yw'r rhyddhad cyntaf o ddata ieithyddol sydd ddim yn cynnwys iaith rhamant); mae e hefyd y pecyn cyntaf wedi ei ddatblygu gan 'cymuned': h.y. wedi ei ddatblygu mewn ffordd 'basar'-aidd gan ddefnyddio cyfraniadau gan gwirfoddolwyr - gan obeithio mai hwn yw'r cyntaf o nifer!
Kevin, who contributed almost all of the data in the bilingual dictionary and the Welsh morphological analyser, has a dedicated Welsh to English site set up: www.kymraeg.org.uk

Francis, who did most of the implementation work on the rules, pointed out this article - specifically, this section:
"There aren't any mutations in Catalan or Spanish, so there needed to add software that dealt with that. A man from Ireland had already developed similar software for Irish and Scottish Gaelic."
I think the 'man from Ireland' is me, but that's not quite true: I did have another idea, but the idea that we ended up using was Francis's, and that's what we're now using in the ga-gd translator.

While I'm quoting things, Kevin's appeal for assistance is worth repeating:
You can also help by asking your Assembly Member to help ensure that resources that receive public funds are made available under a free licence. The sad thing is that the development of these translators could proceed much more quickly if we didn't have to create freely-distributable word lists for them from scratch. Public money has gone into compiling Welsh dictionaries and lists of terms, and yet apertium-cy and similar projects cannot use these because they are not available under terms which allow them to be freely redistributed. Every minute we spend adding words is a minute we can't spend writing software. It would be a tremendous help to our work if the Welsh Language Board could look again at this issue - after all, if developers in Spain and the USA are prepared to spend time on Welsh, we should surely give them all the help we can!
It is a great pity that more language boards aren't interested in providing even basic assistance to open source projects like ours, but unfortunately that lack of interest is the rule rather than the exception.

(Oh, and let me wave a flag and say 'developers in Spain, the USA and Ireland' :)