Hebban olla vogala. . .

Hebban olla vogala nestas hagunnan hinase hic enda thu uuat abidan uue nu?

The Rochester Poem, mc (CE).

Let’s just see where this goes.

It used to be said, and maybe still is, that the only surviving fragment of Old Dutch was the sentence “Hebban olla vogala nestas hagunnan hinase hic enda thu uuat abidan uue nu?”, which is almost visible above. This means “All the birds have begun their nests except for me and you. What are we waiting for?”. I find it utterly delightful that this rather surreal sentence would constitute the first known text in a language. It has an appealing oddness to it. However, apparently things are not so clear cut. In fact there are much older examples of Dutch known today and it’s also been argued that this sentence isn’t even in the language.

The sentence is somewhat reminiscent of the oldest known sentence in a Romance language as opposed to Latin, which is “Τορνα, τορνα, φρατρε”, apparently uttered by a Dacian soldier in the sixth Christian century in the Byzantine Empire. “Torna, torna, fratre” in Latin script. This one is remarkable in that not only is it the earliest Romance text, but it’s in the ancestor of the last major Romance language to acquire a regular written form, Romanian. Before the Strassburg Oaths, generally considered to be the earliest continuous text in a Romance language, in this case French, dating from Valentine’s Day 842, there was this.

Actually, the Strassburg Oaths, which I now realise aren’t usually spelt the way I spell them, are bilingual and may be relevant to the question of what the oldest Dutch text is, because they were sworn by two kings, Charles The Bald and Louis The German, and the latter’s language was Frankish, the Germanic language of the future France, now surviving in various forms in Germany including the to me impenetrable Kölsch. Frankish was closer to Dutch than High German was or is, so it may be relevant:

In Godes minna ind in thes christianes folches ind unser bedhero gealtnissi, fon thesemo dage frammordes, so fram so mir Got geuuizci indi mahd furgibit, so hald ih tesan minan bruodher, soso man mit rehtu sinan bruodher scal, in thiu, thaz er mig sosoma duo ; indi mit Ludheren in nohheiniu thing ne gegango, zhe minan uuillon imo ce scadhen uuerhen.

Louis The German, dcccxlii

That’s more or less comprehensible, I imagine, to a reader of 21st century English. I don’t run into problems until the word “geuuizci”. There are clear signs of the High German sound shift in it, though not in the places a reader of standard High German today might expect them, for instance “folches” rather than “Volkes”. It clearly is not Dutch, partly for that reason.

“Hebban olla vogala” may in fact not be Dutch either. Since it was written in Rochester, apparently by a Flemish monk in a convent, it’s also been claimed to be Old Kentish. Now at this point I could breezily make the claim that I can understand it because it’s in Kentish and I come from, well, Canterbury, but for all intents and purposes Kent, but the fact is that by the time I was born practically all the distinctive features of the Kentish accent and dialect had vanished in the face of the railways, probably more than a century before I was born. Dickens’s 1860 novel ‘Great Expectations’ famously has Abel Magwitch using the pronunciation “wittles” for “victuals”, which starts with a bilabial fricative /β/. There is no trace of this in my accent and nor have I ever heard it in any other person hailing from that county or from Canterbury (whose accent I’m assuming is the same as East Kent’s). Nonetheless I am faintly aware of what count as Kentish features, and even more Southern ones, such as the tendency to voice initial fricatives like F, TH, S and SH. This does actually give Southern English as was a rather Dutch flavour, and that seems like more than coincidence to me since Kent, Sussex and Wessex are the closest parts of England to the Low Countries. This could well mean, for example, that in Mediæval times people would’ve been saying and writing something like “voules” instead of “foules”, and earlier, in the more conservative southern region, “vogeles”. However, as I understand it the plural ending for present tense verbs in the South was “-eþ” rather than the later dominant “-en” (Chaucer’s “smale foules maken melodie” for example), so “hebben” is not Kentish, and the word “hinase” means nothing to me. It’s also odd that there’s an apparently hypercorrected H before “ic”, suggesting that there was a tendency to drop aitches in whatever that language was at some point which was being over-compensated for here. 
I also find it slightly surprising that /θ/ is apparently written as “TH” rather than “Þ”, although I seem to have seen this before in Old High German. I’m not sure what sound it actually represents. Also, “W” is written as “uu”, which is common even in Anglo-Saxon texts even though there was a perfectly respectable letter Ƿ to represent /w/ at the time. Then again, modern Dutch uses a labiodental semivowel in that position, and Flemish /v/. At the time, Kentish would’ve used the word “wat” in that position too, and it might well have been “uuat”.

The “nu” at the end seems to indicate that the Dutch “ui” diphthong had yet to emerge, if this is indeed Dutch, as it seems to be. This is not surprising and doesn’t provide a means of deciding between the languages this may be in.

In Middle English of the time, perhaps a mixed Kentish-Midland version, this could be something like “Hebben alle vogeles nestes gunnan bote vor me ende thee uuat abiden uue nu?”. However, although that’s probably a valid Middle English sentence, it is also quite contrived because, for example, the dual “wit” for “we two” was still used in Kent at the time. Placing these two next to each other:

Hebben alle vogeles nestes gunnan bote vor me ende thee uuat abiden uue nu? (My garbled English example)

Hebban olla vogala nestas hagunnan hinase hic enda thu uuat abidan uue nu? (The real version)

The biggest difference is caused by the fact that “hinase” is apparently a conjunction in Old Dutch and is represented by a prepositional construction in Middle English, which obscures the similarity in that part of the sentence by causing the objective forms of the pronouns to be adopted.

Nowadays this sentence is no longer considered to be the earliest Dutch. The Wachtendonck Psalms are older and longer, although they’re mixed Dutch and German, and the Egmond Willeram is another eleventh century Dutch translation of a German commentary on the Song Of Songs. Although these two sources are, I’m sure, prized and considered useful by Dutch speakers, I still find it a shame that this peculiar sentence is not now officially thought to be the only example of Old Dutch. On the other hand, both those examples are translations from German rather than original Dutch texts. I expect there’s another earlier and longer Dutch text I don’t know about.

For me, Dutch is what I think of as a “pincer movement” language. Although it’s a little unfair of me to think of a language in its own right as an adjunct to others, because I know English and standard (High) German well, both of which are quite close to it, Dutch itself makes some sense to me too without my having formally learnt it at all, although I have kind of glanced at it and I do also know some Afrikaans, which is very close indeed. In other words, my knowledge and awareness of Dutch is based on knowledge and awareness of adjacent subject areas, which makes it easier to learn. This phenomenon of the “frilly border” is easier to picture when learning certain languages but also applies to other subject areas. Unfortunately, it probably leads to a non-linguistic equivalent to «faux amis», where we think we know something but we don’t.

Something I’ve never done is compare Scots and English separately to Dutch. I expect English shorn of Romance terms to be somewhat closer to Dutch than Scots is, because Scots strikes me as more influenced by Scandinavian. English and Scots are both Ingvæonic languages and Dutch isnae. This shows with the nasal spirant law: a nasal consonant followed by a fricative is lost, as in the English “us” versus the Dutch “ons”. Scandinavian languages do something similar, which brings up the issue of whether English is actually even a West Germanic language at all or a Nordic one.

Somewhere on here I’ve written about the Dark Ages on this island, and it’s that period which determines what we call Old English or Anglo-Saxon. They traditionally began with the “Last Groans Of The Britons” in around 450 and ended in 1066 when you-know-what happened. This is a 616-year period. Superimposed on the timeline ending in 2022, the earliest part of that period corresponds to 1406, so it’s rather longer than the span from the Battle Of Bosworth Field to today. By that time, English was clearly recognisable and legible to the twenty-first century eye, so it can be expected, I suppose, that at least written Anglo-Saxon, such as it was at the start, would’ve been comprehensible to a reader such as Edward The Confessor. The actual period during which Germanic language was spoken in Great Britain started earlier than that because of German soldiers in the Roman army, and there are therefore runic inscriptions from earlier. These, however, may not be directly ancestral to English as they were from all over the place. English has never been a unified entity and I imagine few languages are. At least four tribes came over from the Angeln Peninsula, also known as Engla Land, speaking somewhat different languages, and writing was in the form of brief runic inscriptions. One of the earliest of these is found on the Loveden Hill Urn in Lincolnshire, a cremation urn dating from the fifth century on which is engraved the text:

ᛋᛇᚦᚫᛒᚫᛞ || ᚦᛁᚢᚹ || ᚻᛚᚫ[ᚹ]

This says “Siþæbæd þiuw hlæw”, which is difficult to interpret. A closer representation of the runes looks like this:

This is a typical example of early runic inscriptions, which look to me almost illiterate, like what a child would write as they are just starting out. I don’t know why this is exactly but it doesn’t seem to me that anyone among the early English speakers, insofar as the language can even be said to have existed at all back then, would’ve been spending much time writing, and the refinement of their instruments might also not have been up to much. Then again, something like the Sutton Hoo Hoard, which was sixth to seventh century, clearly demonstrates that there was sophisticated metalworking and jewellery at that time, only a century or so later. The urn’s inscription says something like “Siþæbæd’s . . . barrow”, and presumably she was the woman whose remains were in the urn.

There’s also an Old Dutch runic inscription, or something like Old Dutch, dating from the same period: the Bergakker Inscription, found on a scabbard. These look similarly, er, rustic?:

These are unusual because they aren’t from the coast around the Frisian area but further inland in Gelderland. Cleaned up, they read as follows:

ᚺᚨ?[V]ᚦ[V]??ᛋ᛬ᚨᚾᚾ᛬ᚲ[V]ᛋᛃᚨᛗ᛬᛬ᛚᛟᚷ[V]ᚾᛋ᛬

Most people agree that the first bit is the genitive of “Hæþuþewaz”, but beyond that it becomes controversial. The last word looks to me like “loguns”, which has been taken to be a kenning (poetic synonym) for “flame”, meaning sword, which calls light sabres to mind incidentally. Given their location, this does in a way appear to be Dutch but at such a remote and early point in time, is it really at all significantly distinctive? Is the English inscription any more so? There could certainly be dialectal variations, but did those survive long enough to be recorded in written form by the time the languages had a distinct identity?

I can’t help thinking the Romans, Greeks and their ancient predecessors such as the Sumerians and Ancient Egyptians, were doing a better job than that with their writing. Then again, Linear B and others look pretty sloppy, so I don’t know. I don’t want to be judgemental about this. Maybe it’s just that the people in question were the likes of blacksmiths and farm labourers who were having to rely on their muscle power, making it hard to be neat and delicate. They did succeed in being so very soon afterwards though.

So that’s it really. The earliest fragments of Dutch and English, when they were the same language basically.

Modern Latin

Latin is in a sense a dead language, and in at least two other senses a living one. It’s a dead language in the sense that any children today growing up speaking Latin as their only first language are likely either to be subject to questionable parenting or have parents who have ended up speaking Latin to each other due to not being fluent in each other’s languages. There are a few people who speak Sanskrit, and a few more who speak Esperanto, as a first language, so it’s conceivable that there’s a teensy number who speak Latin that way too. I’m not one to judge such parenting decisions, but even so I’d hope that people who do opt to bring up their children speaking Latin at least make the additional decision to make them bilingual. Judging by my own experience in bringing children up speaking languages other than the dominant ones in their community, English is the language, French and Spanish (not so much Castilian) are also spoken by people around them, but German was just this funny noise I made at them which didn’t really catch on, except that it’s alleged that our daughter does speak German in her sleep.

But how is Latin alive? In at least two ways, as I said. Firstly, looking at the map above, most of the western half of the continental Eurasian portion of it still more or less speaks Latin. Every generation would have understood what the previous one was saying all the way back to the point where they would’ve been speaking Latin, or a dialect of it, at the time represented by this map, which is CCXVII ANNO DOMINI, or CMLXX ANNO URBIS CONDITÆ. The kind of prescriptive “correction” of pronunciation and grammar rife in probably most human communities would’ve been going on back then. In Italia and Dacia, parents would’ve been having a go at their kids for pronouncing C as “ch” before E and I or missing the S or M off certain words, and in Hispania perhaps complaining about this new trend of saying “you will be” instead of “you are”. Then eventually they would’ve given up and died, and only the church people and nobles would have noticed anything unusual about the language they were using, until eventually they were calling their languages Italiano, Português, Castellano, Français, Català, limba română and so on (not sure about capitalisation here), and couldn’t understand Latin very well at all.
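For anyone who wants to check the year conversion, here is a minimal sketch. It assumes the conventional founding date of 753 BCE, so that a CE year plus 753 gives the AUC year, and the function names `ce_to_auc` and `to_roman` are just my own labels for illustration.

```python
def ce_to_auc(year_ce: int) -> int:
    """Convert a CE year (1 or later) to AUC, assuming Rome was founded in 753 BCE."""
    if year_ce < 1:
        raise ValueError("this sketch only handles CE years")
    return year_ce + 753


def to_roman(n: int) -> str:
    """Write a positive integer in standard subtractive Roman numerals."""
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in numerals:
        count, n = divmod(n, value)  # how many times this symbol fits
        parts.append(symbol * count)
    return "".join(parts)


year = 217
print(to_roman(year), "AD =", to_roman(ce_to_auc(year)), "AUC")
# prints: CCXVII AD = CMLXX AUC
```

The same function works for any other CE date, so long as the 753 BCE convention is accepted.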

Latin, though, was and is still alive. It’s still spoken fluently, for example, in the Roman Catholic Church and much Latin terminology is used in technical discourse. I always write prescriptions in Latin and British herbalists communicate with each other in the likes of Italy and Czechia in Latin. I’ve done so myself, although my attempt at talking to a herbalist in Rome using Latin didn’t work at all. I can look into the back garden and see Euphorbia helioscopia and Fragaria vesca aplenty, and don’t even bother to think what they might be called in English most of the time. The first of these, of course, is Latinised Greek and therefore possibly a poor choice but the point is I do this, as do many others, and this is the legacy of the Roman Empire.

However, the question I want to ask here is this: what would Latin be like today if it had continued to be a vernacular language? Ecclesiastical Latin survived of course, but that has some peculiar features such as the palatalisation of C and G before E and I (to “ch” and “j” sounds) which even now one Romance language at least, Sardinian, doesn’t always have. This form of Latin probably doesn’t represent how it would be today as a widely spoken language as it is formally taught and frozen in some ways, although it adopts modern vocabulary such as helicopterus. It also ignores the difference between long and short vowels. Classical Latin had five long and five short vowels and some of the diphthongs, including Æ, AV and Œ, had already become single vowels by the Augustan period from which today’s academic pronunciation is derived. This was from 710 to 771 AUC, the period during which Jesus was born and ending maybe a decade before he was killed. The vowels of Latin, however, didn’t undergo this particular type of merging because the formerly long and short vowels, although they became levelled in length, also changed pronunciation while they were doing so and therefore remained distinct.

The question arises, though, of how these circumstances might arrive. It really amounts to the Empire not falling, and in order to imagine how it might persist one has to have some idea of why it fell in the first place. I personally think it was a combination of the adoption of Christianity and some kind of issue related to physical resources such as the need to continue to conquer land to retrieve food over increasing distances until it was no longer possible to transport them economically, but I’m no historian. The question also arises of what kind of world we’d be living in now if this had happened. For instance, would slavery still exist and would the Empire have continued to expand? For the sake of simplicity, I want to assume the following state of affairs:

  • The Empire didn’t split in half after the death of Theodosius I in 395 CE (1148 AUC). This could mean the eastern Empire was less dominated by Greek, and the Byzantine Empire survived until 1453 CE.
  • Christianity was not adopted. Perhaps it just didn’t exist.
  • Slavery was abolished in any event fairly early. This is because with slavery, technological progress would be much slower, since there would be no direct connection between the experience of people working in particular industries and “thinkers” who could pass on what they learnt from their work. Without it, there would also be more motivation to invent labour-saving devices. This would give the Empire technological and therefore military superiority over their neighbours and strengthen its prowess in the long term.
  • The Empire eventually became global and there is a single state, the world ruled from Rome.

I am aware that all of this might not result in a particularly marvellous world order but I also think this world, with no European Dark or Middle Ages and the continuing innovation of the Greek part of the Imperium, would be many centuries ahead of our own technologically. I’m going to conjecture that slavery was abolished in about the year 500 CE, followed by an industrial revolution in about 600, leading to a twentieth century level of technology by 800, meaning that there would be weapons of mass destruction and the conquest of the entire planet around then. By today, Rome would have dominated the world for twelve centuries. Note that this may very well not be a utopia, although it’s worth asking whether it’s even possible for a society like this to exist without being a utopia, because I am not confident that anything other than a utopia-like civilisation could exist for long and still be industrialised. So I imagine that of necessity such a world would perhaps have begun as an oppressive régime, but ceased to be so after a fairly short period of time, perhaps because the level of education required to maintain that level of technology is incompatible with oppression. All of this is interesting and worth exploring in itself, but for now I simply present you with the global official language of the 28th century AUC.

Today’s Romance languages are descended from Vulgar Latin. For example, the Castilian words calle, casa and caballo now mean “street”, “house” and “horse”, but originally meant “dirt track”, “hut” and “nag”, or rather they descend from such words. If the Empire had survived, it seems likely that some of the prestige register would have continued and the words via, domus and equus would have been the source of the modern words. In some cases these have survived anyway in some form. Italian still calls roads “via” and the Castilian word for “mare” is yegua. Domus, on the other hand, seems to have died out. Similarly, the French for “head”, tête, originally meant “pot”, but caput survives in the Castilian cabeza, and also in the French chef, where however it doesn’t mean “head” in the literal sense. Hence one major feature of 28th century Latin to my mind would probably be that the vocabulary was taken from classical rather than vulgar sources, although there would probably have been some infiltration from the lower classes, particularly if the society it was spoken in had become more egalitarian. On that matter, would there now be a communist society in which it had become routine for people to refer to each other by a word translatable as “comrade”, such as “amica”?

The purest Latin in the real Empire was said to be spoken in what is now the south of France. This may be surprising, but it was probably due to the fact that the other Italic languages, related to Latin but not descended from it, had been spoken in the rest of the Italian peninsula and influenced the way it was spoken there. Centuries later, it was agreed that the most accurate Latin was spoken in Britain, and this was because the first language spoken by the peasants here was not closely related to it and therefore didn’t influence it. In general, the further one went from Rome, the more divergent the Latin was, with the proviso that in Italy itself the language was somewhat different from the standard upper class register. One of the features of the Roman dialect of Latin was that it seems to have changed L to I in some places, as for example with “fiore” in modern Italian rather than “flos”, becoming “*flore”. And this underlines the fact that Italian is not modern Latin. The chief differences from a kind of “central” standard include its distinctive double consonants and the fact that most words end in a vowel.

Now would be an appropriate point to highlight many of the differences of today’s Romance languages from classical Latin in the sense of being actual divergences from a standard which are not present in all cases:

Italian shows the influence of other Italic languages, such as “I” replacing “L”, which I seem to recall is from Faliscan. There’s also the so-called “Tuscan throat”, which is the tendency to change /k/ to a guttural fricative, said to be inherited from Etruscan, this being however a dialectal feature. Apart from that is the doubling of consonants and the use of vowels at the ends of most words. The general rhythm of the language is similar to that of Modern Greek and may also not be original.

Spanish, by which I mean all of Castilian, the other dialects of the Spanish part of the Iberian peninsula except for Catalan/Valencian and Gallego, and the language of first-language Spanish speakers of the Americas and other former parts of the Spanish Empire, is divergent in two major ways. Firstly, it has adopted a fair bit of Arabic vocabulary due to the Moors dominating the region for much of what would be considered the Dark and Middle Ages in much of the rest of Europe. Secondly, during the second Christian millennium it became phonetically quite divergent, particularly in Castile, where J came to be pronounced “kh”, C before front vowels and Z “th”, F became H and was eventually completely dropped and so forth. In non-Castilian dialects, C in those circumstances stayed as “s” but “LL” became “y” (I’m using English spelling conventions here rather than IPA). Spanish also uses the verbs “ser” and “estar” to express what appear to be mainly necessary and contingent states, and has personal “a” for the accusative, which as far as I know is unique, and uses “haber” to express the perfect but never an existential verb.

As far as Portuguese is concerned, and here I’m including non-European varieties again, there’s again considerable divergence in pronunciation, but the spelling is conservative, making the written language look closer to Italian and Latin than Spanish does. The most distinctive features of that pronunciation are the nasalisation of vowels, the contrast in pronunciation according to whether the syllable is stressed or not, which incidentally is also present in English, and the tendency to palatalise, which gives it a superficially Slavic sound. Brazilian Portuguese, which is generally more conservative than European, also has a uvular R like the French one. N’s also get turned into M’s sometimes. As far as grammar is concerned, it uses “ter”, from the Latin “tenere”, meaning “to hold”, as an auxiliary verb for past tenses and has the unique feature of the personal infinitive, where personal endings are added to the infinitive in hypothetical situations. Portuguese and Spanish are the closest national languages to each other in the group, but Portuguese also shares features with Catalan which Spanish lacks. It and Romanian are the two least similar languages of the lot. I’m not sure about this, but I get the impression that Portuguese vocabulary is almost as pure as Italian’s.

This brings me to boring old French, and I know I’m being unfair here but its ubiquity in secondary schooling for an Englander of my generation lends it a patina of tedium. French is quite divergent because it descends from Latin spoken far from its native territory, which also suggests that the extinct British Romance language which vanished without trace after the Germanic invasions would have been “even more French”, since it was spoken on an island, but since it quickly died out it probably didn’t have much chance to change that much. French is probably the language whose pronunciation is furthest both from its spelling and from classical Latin pronunciation, and the grammar is extremely simplified. It seems to have been influenced in two major ways. One is that it was spoken in a formerly Celtic-speaking area and, although I’m not sure, I think that its tendency to run words into each other and have them influence its pronunciation may date from that stage. The other is that the Franks, who spoke a Dutch-like language, also had a major hand in its form. For instance, the French vowels “eu”, “u” and “œ” are found in no other national Romance language but are common in Germanic ones. The other notable feature of French pronunciation apart from elision and liaison is nasalisation, which is accompanied by a vowel shift. Although nasalisation wasn’t as important an element in Latin, it did exist in words which ended in M, and therefore is in a sense a conservative feature.

Romanian is generally the most divergent language of the group but is also the most conservative grammatically. It’s the outlier for two reasons. Firstly, it borrowed a lot of vocabulary from eastern European sources, and secondly it’s part of the Balkan Sprachbund, entailing that it has unusual features such as a tendency to avoid the infinitive and having suffixed definite articles. Compared to its relatives, it’s closest to French and shares with French what appears to be an original tendency to have a slower rhythm than Italian and Spanish. It still has the grammatically neuter gender and case endings, and its verb conjugation also tends to be quite conservative, but is extremely irregular.

Romanian and French, although not close to each other, share a few features. Romanian seems to have borrowed a lot of French vocabulary, and French case endings fell into the same two cases as now exist in Romanian. There’s also the tendency to use schwa (the murmured final vowel in “Sparta”) and the general prosody of the languages. Some of these features are clearly radical but the rhythm and case endings clearly are not. Romanian also preserves the way the vowels changed from Latin in the Eastern Roman Empire and Sardinia, which is different from the rest of the group.

French, Spanish and Portuguese also form a kind of block with similar features, although of the three French is closer to Italian in certain characteristics, though not the ones found in Portuguese. These include the universal noun plurals in “-s”, and incidentally in “-x” in French, which is a spelling convention.

Catalan has the distinction of being the “most central” Romance tongue. It has more in common with the other languages than any of the others have with each other. There are two fairly striking features. One is that words have a tendency to be quite short, and the other is that there is an unusually large number of personal pronouns. Incidentally, it’s the most widely spoken language in Europe which is not the official tongue of any recognised state and has more speakers than a number of other languages which are official elsewhere. Being the “central” one, it may have the best claim to being today’s version of Latin.

There are a number of other languages in the group and also some extinct ones, some of which have vanished without trace. The most conservative of these is Sardinian, which however is also influenced by Catalan. It forms the definite article in a distinctive way, from “ipso”, the reflexive pronoun “itself”, unlike all the others which got it from “ille”, meaning approximately “that”. It also still pronounces C as “k” in all positions. It’s generally considered closest to Latin of all the Romance languages but does have influences from extinct sources which have altered it somewhat. It also does things no other language in the group does, such as changing initial V to F.

I’ll just mention Ladino in passing as I already went into it in some depth here. It has more conservative pronunciation than Spanish, which is because it split off before Spanish took its foray into weirdness. Ladino is kind of more “normal” than Spanish in that respect because it’s conservative.

There are several isolated Romance languages spoken in the Alps, mainly in Switzerland, including Ladin, Romansh and Friulian. At a cursory listen to these, about which I don’t know much, they sound rather like Spanish to me.

Dalmatian is a bit like a missing link. It was spoken in the region between Italy and Romania until rather dramatically its last speaker died in a road-building explosion. I’ll cover the rest here. Provençal is the most successful historically, and has a number of peculiar features. It used case endings for longer than any other Western Romance language, it has oddly swapped gender endings – O for feminine and A for masculine – and has no nominative personal pronouns at all. Finally there are four extinct languages of which there is little or no written record: British Romance, African Romance, Moselle Romance and Pannonian. These can be detected through placenames, words borrowed into other languages and errors made in documents.

I would like to claim that Modern Latin would, unlike other surviving Italic languages, be descended from classical rather than Vulgar Latin, and would combine features currently found in all surviving Romance languages otherwise, but that it would be more conservative. It would also have borrowed terms directly from other languages which Latin as a living language never encountered such as Australian Aboriginal or North American First Nation languages.

This, then, is what I think it would be like and why:

  • A tendency to use vocabulary derived from classical Latin where surviving Romance languages have used vulgar, such as “caput” for “head”, “equus” for “horse”, “via” for “street” and “domus” for “house”.
  • No definite or indefinite articles. Latin itself had no word for “the” and whereas other Italic languages now use them, they are not entirely consistent in their etymology or placement. Sardinian uses “ipso” and Romanian suffixes them.
  • A future tense based on suffixing “habere” to the infinitive in most cases with the exception of “esse”. Romanian seems to be the only one which hasn’t done this and this is probably due to being in the Balkans.
  • Three genders. Romanian retains the neuter gender and others have a kind of neuter pronoun. However, since the form of the masculine and neuter nouns is often similar, the distribution of those genders would seem arbitrary to someone ignorant of the history of the language. Feminine would be more definitely separate.
  • Two cases for the nouns. This crops up in Old French, Provençal and modern Romanian. They’re likely to be absolute and oblique.
  • C and G would be palatalised before E and I, and V would be pronounced “v”, unlike in Latin.
  • A nasal vowel would occur at the end of certain words but there would be no other nasalisation.
  • R would be trilled.
  • There would be a distinct future tense for “esse” but for no other verbs. “Stare” would not be used as an existential verb.
  • The general rhythm of the language would be slower than Italian and more like French.
  • The personal pronouns would include dative forms.
  • Vowels would have collapsed in the Western Romance manner rather than the Eastern.
  • Word order would be SVO except for pronouns.
  • Past tenses would be realised using “habere” as an auxiliary verb before the past participle.

There would be a number of other deducible features, but probably the best way to approach this is to produce a passage in the actual language. Although this is a world without Christianity, the Pater Noster, or “Our Father” prayer, more commonly known in English as the Lord’s Prayer, is a reliable source of the form of most written languages, and it’s therefore worth trying to reconstruct it here. I think it would look something like this:

Padre nostro, qui es in cielo, sanctificato sia nome tuo.

Venga tuo regno, sia facta tua voluntatem, come in cielo e come in terra.

Da nobis odie nostram panem quotidianom, e pardone nos de nostras debitas, come nos pardonemos nostros debitores.

E non duca nos in tentationem, ma libera nos de mal. Amen.

I’m not sure how closely I followed the recommendations here. However, I have attempted to include a number of grammatical points. The absolute case is distinct from the oblique, the subjunctive third person singular of “estre” is “sia”, and possessive adjectives can occur either side of the noun. “Nobis” is the dative of “nos”. The neuter and masculine singular oblique ends in an M, but this serves to nasalise the vowel and is not pronounced as a consonant. Likewise, “GN” is a palatalised N like in French and Italian. H’s are silent, if they ever occur – in fact there may simply be no letter H, except in foreign loanwords. “And” is “e”, and consequently there’s no ampersand. The penultimate line uses the same form as the English version which has “debts” and “debtors” rather than “sins” or “trespasses” and the awkward “those who trespass against us”. Finally, I’ve missed off the doxology, which would be something like “car regno, potentia e gloria son tue, a seculas de secula”, though I’ve just thrown that together at the last moment.

A short note on loanwords. They would have tomatlas, patatas, minuas (kangaroos), dovaques (boomerangs) and so forth. These would mainly be neuter, as the languages they were borrowed from would usually lack grammatical gender entirely.

In conclusion, I think this is an entirely feasible if somewhat arbitrary conjecture as to what Latin would look like today if it had been in continuous use since Roman times as a vernacular. The written numerals would be different as they would probably have been adopted from a different source, possibly India, but Roman numerals would doubtless have been abandoned early, and there would be no Christian, Jewish, Islamic, Baha’i or Sikh religions. Other than that, I don’t know what the world would look like today, except that it seems likely that the human race would be found on other planets and in space habitats by now. But whatever, this is how they’d be talking, or somewhere near.

Transylvanian English

Castelul Bran – Dobre Cezar

Nowadays there are two divisions of Romance languages. In the West, there’s French, Italian, Castilian (Spanish), Portuguese and a number of less widely-spoken ones, some of which are considered dialects, including Occitan, Catalan, Sardinian and the various minority languages of Switzerland. In the East, there’s Romanian and a few other similar languages such as Aromanian, Istro-Romanian and Moldovan, the last of which seems to be Romanian written in the Cyrillic alphabet, as Russian is. All of these languages are descended from Vulgar Latin. There are also some extinct Romance languages, including a likely British one about which practically nothing is known and African Romance, which was spoken along the southern coast of the Mediterranean and seems to have been somewhat similar to Spanish. Crucially for the purposes of the division between East and West, there was also Dalmatian, a language spoken along the coast between Italy and Albania, which was important enough to be the national language of the Republic of Ragusa. On 10th June 1898, Tuone Udaina, its last speaker, was killed in a roadworks explosion. It was similar in some ways to Romanian but it was not a Balkan language. I will explain.

As well as there being language families, distantly related or practically unrelated languages acquire common characteristics when they have contact with each other. This is known as a Sprachbund. I suspect, for example, that the languages of these isles form something of a Sprachbund in, for example, the circumlocutory way they use the present tense, but maybe I’m wrong. Regardless of the truth of this, the Balkan Sprachbund is one of the best known and most clearly defined. The core Balkan languages are Romanian, Bulgarian and Albanian, and outside those there are also Macedonian, Turkish, Greek and Balkan Romani. Of these, Bulgarian is the most Balkan of them all and Balkan Romani the least. Dalmatian is distinctive in being an Eastern Romance language like Romanian but not a Balkan one, so it does share features with Romanian but not those which make it like the latter’s neighbours.

Balkan languages most typically have the following features:

  • A definite article expressed by a suffix.
  • No infinitive of the verb or a tendency to avoid it.
  • A two-case system comprising nominative/accusative and genitive/dative.
  • Evidentiality.
  • A future tense expressed by a verb of volition (English also does this – “will do”).
  • Shared vocabulary.
  • A perfect tense expressed by the verb for “have” (English does this as well).
  • Subjunctive used to express a polite command.
  • Numerals between eleven and nineteen inclusive are expressed as “one on ten”, “two on ten” and so forth.
  • Shared calques (literal translations of compound words).
  • A central vowel, sometimes schwa (English has this too).
  • Raising of O to U in unstressed syllables.
  • Changing L to R.
  • Absence of /w/.
  • Loss of L before I.

Even Greek has acquired some of these. Turkish, which is not an Indo-European language, has influenced the others more than the other way round, notably with evidentiality. Languages related to these but spoken elsewhere usually don’t have these features, so for example Polish and Italian haven’t got them on the whole.

I want to focus on Romanian because it’s a Romance language. Romanian is the outlier in the Romance family. It’s closest to French and furthest from Portuguese. It’s also been culturally influenced by French. Ironically, Romanian is also the most conservative national Romance language of all. Sardinian may be more conservative but isn’t official for the nation of Italy. The spoken rhythm and sound of Romanian is also somewhat like French and English, and another Balkan language, Albanian, also has a remarkably English-sounding rhythm to my ear but I may be hearing what I want to hear with this. The same definitely does not apply to Greek or Turkish. Greek is more like Italian in that respect, although it’s been a separate language from Latin and Italic languages generally for at least as long as the ancestor of English has.

Speaking of close relatives to English, the Gothic language has also been spoken in the Balkans and nearby, and there are also some loanwords from Upper German into Romanian such as “pom”, which means “tree”, more easily recognisable as “Baum” in German, cognate with “beam” in English. It’s also a faux ami for the French for “apple” of course. Nonetheless, other than Latin itself, the biggest sources of vocabulary in Romanian are the Slavic languages. It tends to use Slavic words even where every other member of the family uses a Latin one, and even when it’s a common word.

The reason I’m thinking about Romanian a lot at the moment is that a week or so ago I wrote Caveat Procrastinator and found I got a lot of hits from Romania. It doesn’t seem to mean anything in Romanian but it is of course a Latin phrase, so I’m not sure why this was. In any case, it reminded me of something I’ve long wondered about. What would English be like if it were part of the Balkan Sprachbund?

English already has several features in common with Balkan languages, notably the wide distribution of schwa, a future tense expressed by a verb of volition and a perfect tense expressed by “have”. It has also borrowed a large number of words from Norman French. I seem to recall that something like half the vocabulary of the language is French or Italic in some way, often Latin. It also sounds like a core Balkan language. Looking beyond English, some of its closest relatives, the Scandinavian languages, have a suffixed definite article. Swedish, Faroese, Icelandic, Danish and the two Norwegian standards all use postpositive definite articles, which presumably means Norn did as well although I can’t recall that as a fact. If so, a language spoken in Great Britain in the Middle Ages, namely in Caithness and Sutherland, probably had a postpositive definite article, and of course Danish had one when it was spoken in the Danelaw.

I’m going to sketch a conlang (constructed language) based on the idea that English is a Balkan language. The most obvious thing to do is to replace all words of French origin with Romanian words, but in doing so it should be noted that the main period of borrowing was from the eleventh to the fourteenth century, and at that time Romanian was not a written language, and that the loanwords were from Norman French rather than the direct ancestors of Metropolitan French, which is the reason we say “camp”, “war” and “warrant” rather than “champ”, “gwer” and “guarant”. There may not be a simple way to replicate this with Romanian. Another source of vocabulary is the common Balkan lexis found in the region. Hence there are such words as “cuty” – “box”; “crommon” – “onion”; “mess” – “table” (that might be a real one). These are Anglicised of course.

Although we already have “will” for the future, we also have “shall”, depending on the person. This version of English would only have the former. It would also have “have” for the perfect tense, since it has it already.

The two-case system is something to contemplate. In English, our pronouns have three cases, nominative, objective and genitive. In Anglo-Saxon times, along with nouns, they had four or five: nominative, accusative, genitive, dative and instrumental. There are still traces of the instrumental today: “why” is the instrumental case of “what”, and the construction, “the more the merrier” also uses the instrumental case. The problem with English having two inflected cases is that it isn’t clear whether the genitive or the dative would win out, and since the pronoun “why” is instrumental, which otherwise merged with the dative, it’s possible we would’ve ended up saying “why” for “whose”, which is a bit too weird. As it is, we retain the dative form but don’t use it just as a dative, so we have “him”, “whom”, “seldom”, “whilom”, “me”, “thee” and “you”, but with the exception of “seldom” and “whilom” we can use any of these for either the recipient or the thing or person done to. Even so, we do have situations where the genitive and dative have merged, as with “her”, and the question then arises of which form would’ve won out. There’s probably a clue in the history of the real Balkan languages here.

Regarding the infinitive, there are signs of English speakers trying to avoid using it too. When I was a child, I didn’t realise that “do” in “try and do it” was infinitive and that therefore the wording was grammatically incorrect – it should be “try to do it”. There’s also a tendency for English speakers to use the present participle form where other European languages use the infinitive. Unfortunately, although I know English does this I have no idea when it happens. All I know is that German speakers sometimes use “to X” when native English speakers would say “X-ing”. Hence there are a couple of possibilities there. I’m aware that Romanian uses infinitive forms sometimes to create nouns from verbs where there is no commonly used infinitive form used as an actual verb. Therefore I can absolutely see English eschewing the infinitive. It’s a very English thing to do in a way.

All natural human languages are said to be able to express anything other human languages can. This is, incidentally, not true of formal languages. The axiom schema of replacement of Zermelo-Fraenkel set theory can be expressed clearly using formal logic but seems to be inexpressible in English, and in fact it’s too fiddly to type here so I’m just going to copy-paste an image of it:
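
In case the image doesn’t come through, here is one common textbook statement of the schema typed out after all. The symbols are assumptions on my part and may not match the image exactly: φ is any formula in which B does not occur free, A is the starting set, w₁ to wₙ are parameters, and B is the set of images that the axiom says exists.

```latex
\[
\forall w_1 \ldots \forall w_n \,\forall A \,
\Bigl(
  \bigl[\forall x \in A \;\exists!\, y \;\varphi(x, y, w_1, \ldots, w_n, A)\bigr]
  \;\Longrightarrow\;
  \exists B \,\forall y \,
  \bigl[\, y \in B \iff \exists x \in A \;\varphi(x, y, w_1, \ldots, w_n, A) \bigr]
\Bigr)
\]
```

Roughly, if φ pairs each member of A with exactly one y, then all those y’s together make up a set B.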

This is fairly easy to make sense of provided you know what the variables mean, but to express it in English is a bit of an undertaking. Likewise, if the Pirahã language isn’t a hoax, rather a lot of what can be expressed easily in other natural human languages can’t be expressed in it, which is one reason I’m dubious. Nevertheless there are enrichments and deficiencies in various languages such as the fact that we have separate words for “do” and “make” in English but only one word for “know”, and one enrichment only rarely if ever found in Indo-European languages outside the Balkans is evidentiality. We are able to say “so I hear”, “so I’m told”, “apparently” and “allegèdly” in English, but it’s rather clumsy and vague and involves words which are kind of ad hoc. It’s like saying “In the past I live in Canterbury” rather than having the form “lived”.

Evidentiality is considered so useful by some that Suzette Haden Elgin built it into the grammar of her women’s conlang Láadan. Turkish is thought to be the source of Balkan evidentiality as Turkic languages generally have it and Indo-European ones generally lack it. The Balkan languages have a “renarrative mood”. I should point out that grammatical moods are things like the imperative – telling someone what to do, the optative – “would that it were”, the indicative – “the cat sat on the mat” and the subjunctive – “it was suggested that she be dropped from the team”. The renarrative mood expresses the reporting of an event not witnessed by the speaker, so it might be expected that news items in the Balkans use the renarrative mood, but in fact they use the indicative, which might be seen as propaganda. I could certainly do with the renarrative because I often feel I’m making statements which are only apparently true. I mean, they might be true but I’m not certain. I’m just told they are. In fact, even if there’s no other point to this exercise, English having a renarrative mood would be incredibly useful.

What form it could take is another issue. I can’t really see people peppering their speech with “allegèdly” all over the place. It needs to be either a short auxiliary or a direct inflection of the verb itself.

Another notable use of a non-indicative mood in the Balkans is the way they use the subjunctive to express a mild command. You see this kind of thing in Punjabi, where the polite version of a command is in the indicative future tense, which actually sounds like it might be rude – “you WILL do this!” – but on reflection merely means to imply that the speaker has such faith in the willingness of the addressee to do something that it’s practically a prediction: “I’m sure that this will happen (because you’re so great that you’d never think not to do it).” The subjunctive for the imperative is somewhat similar. It’s like “(it would be really nice) were you to do this” I think, and that kind of polite circumlocution is again very English, and I’m sure we can get on board with that too.

There is already something rather like the Balkan numeral system in two of the English forms: “eleven” and “twelve”. “Eleven” is connected to the words “one left” and “twelve” to “two left”, that is, they are the extra numbers left over after people who count one per digit have used up both hands. However, Romanian and the other Balkan languages differ from English here, and Romanian differs from the other Romance languages. The Romanian word for “eleven” is “unsprezece”, that is, “unum super decem” or “one above ten”, run into a single word. This continues all the way through the teens to nineteen inclusive. For English this would be something like “one on ten”, “two on ten” and so on, perhaps run into a single word again and preserving archaic forms as often happens when this takes place, so maybe “anonten”, “twennenten”, “thirenten” and so forth.
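
The “unit on ten” pattern is mechanical enough to sketch in a few lines of Python. This is only an illustration: the function name `balkan_teen` and the `fused` switch are made up for the purpose, and the only fused forms included are the three suggested above, since I haven’t worked out the rest.

```python
# English unit words for 1..9, used to build the "unit on ten" teens.
UNITS = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]

# Fused single-word forms. Only these three are proposed in the text;
# the others are deliberately left out rather than invented.
FUSED = {11: "anonten", 12: "twennenten", 13: "thirenten"}


def balkan_teen(n: int, fused: bool = False) -> str:
    """Return a hypothetical Balkan-English word or phrase for 11..19."""
    if not 11 <= n <= 19:
        raise ValueError("only 11..19 follow the 'unit on ten' pattern")
    if fused and n in FUSED:
        return FUSED[n]
    # Fall back to the transparent phrase, e.g. 14 -> "four on ten".
    return f"{UNITS[n - 11]} on ten"


if __name__ == "__main__":
    for number in range(11, 20):
        print(number, balkan_teen(number, fused=True))
```

Numbers without a suggested fused form simply fall back to the spaced-out phrase.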

Then there are the calques – literal translations of common expressions sometimes preserving the structure of the words. For instance, fruit doesn’t “ripen” in the Balkans but becomes “baked”, or alternatively bread “ripens” depending on which way round you want to look at it. “Whether I want to or not” and similar forms are expressed as “want – not want”. I’m afraid I’m getting these straight out of Wikipedia, sorry.

English, Scots, Yola and Dutch are unusual among the Germanic languages for not changing W and WH to /v/ or “v” and “hv”. In the Balkan languages, /w/ has become V. Thus what we think of as a distinctive feature of German-accented English could happen in Balkan English too. It would wipe out the W/WH distinction but nowadays that seems to be gone anyway. There’s also a tendency to change L’s to R’s, which of course occurs in the stereotypical Japanese accent, but in Balkan languages has not extended so far as to eliminate L completely. Finally, there’s the transformation of O to U in unstressed syllables. I’m not sure how much this would happen because we already tend to use schwa so much here.
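
The W-to-V change can be roughly mimicked on ordinary English spelling. The sketch below is an assumption-laden toy, not a real sound-change engine: it works on letters rather than sounds, treats “w” or “wh” directly before a vowel letter as /w/, and the name `w_to_v` is made up. It will therefore get words like “who” and “answer” wrong, and it leaves the L-to-R and O-to-U changes alone.

```python
import re

# "w" or "wh" directly before a vowel letter: a crude, spelling-based
# stand-in for spoken /w/ (it misfires on words such as "who").
_W_BEFORE_VOWEL = re.compile(r"wh?(?=[aeiouy])", re.IGNORECASE)


def w_to_v(text: str) -> str:
    """Replace spelled w/wh before a vowel with v, keeping capitalisation."""
    def repl(match: re.Match) -> str:
        return "V" if match.group(0)[0].isupper() else "v"

    return _W_BEFORE_VOWEL.sub(repl, text)


if __name__ == "__main__":
    print(w_to_v("What will we want now?"))
```

The final “w” of “now” is untouched because no vowel letter follows it, which happens to match the fact that it isn’t a /w/ consonant there.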

There is in fact already a Germanic language currently spoken in the Balkans which has been there for a long time: Transylvanian Saxon. From the twelfth Christian century on, German colonists settled in Hungarian Transylvania to defend against the Tatars. Although they’re called Saxons, they are actually Franconian, i.e. their language was originally closer to Dutch, that is, even closer to English than High German is. Most of them have now returned to Germany but several thousand stayed. The language is quite similar to Letzeburgesch, a language spoken in Luxembourg generally understood to be German. I find Transylvanian Saxon to be pretty straightforward to understand and perceive it to be German with some slightly weird vowels and Low German tendencies which make it slightly closer to English than High German on occasion. I can’t speak it of course. It has an alveolar rolled R, and that immediately makes it more appealing. However, so far as I can tell, it isn’t influenced by Balkan Sprachbund characteristics at all, which is disappointing.

It’s quite a major task to make a conlang, and most people would consider it a waste of time. Nonetheless, this is at least the groundwork for such a task, and it isn’t entirely pointless. I can easily imagine a peculiar English spoken with a Transylvanian accent which enables one to tell whether the speaker has just been told something or witnessed it themselves, sticks a syllable meaning “the” on the ends of its nouns and so on, and I find that quite appealing. I want this to be done and I doubt anyone else will do it, but there are better ways of spending one’s time.