semitic – A Box Of Chocolates

Some time ago, in the 1980s I think, I made one of my many attempts to learn Gàidhlig and noticed something rather strange. I already had some knowledge of Hebrew and Arabic from when I was younger, and it suddenly struck me that the Celtic language shared some remarkably unusual features with the other two. From what I can recall, these included verb-subject-object word order, two genders – feminine and masculine – and something I can only vaguely remember about how prepositions and pronouns work. At the time, I didn’t know what to make of it. It seemed to be more than a coincidence, because three things in common always counts to my mind as more than chance allows, but it was difficult to think of how it could’ve happened. I eventually settled on a rather vague conclusion that maybe Semitic language speakers had travelled north from the Maghreb into Iberia, where Q-Celtic languages are sometimes claimed to originate, and that they then influenced the ancestor of the Irish language in some way. However, this doesn’t work particularly well as it fails to explain how Welsh and Cornish also have these features. After a while, I just put it down to coincidence and my tendency to see patterns where none exist other than the ones my mind has imposed upon them.

At this point I’m going to veer off into probability to illustrate why three things in common is my threshold for statistical significance. It’s common to plump for one in twenty as the point at which something is considered significant, and scientific experiments often use this. In recent years I’ve seen rather too many dubious-looking scientific papers which seem to go for a much lower limit and I now wonder if there has been a new development in statistical theory which justifies this, or whether it’s more to do with “publish or perish”. Anyway, probabilities of independent events multiply, so if you flip a fair coin three times and it comes up heads every time, the probability of that outcome is one in two times one in two times one in two. 2³ is eight, and one in eight is still more likely than the one in twenty at which one decides something is significant, but the probability of something happening is not always one in two. For fair dice, you’d only need to throw a six twice for it to become significant: six squared is thirty-six, so the chance is one in thirty-six. Taking this the other way, for three equally likely events to multiply up to one in twenty, each needs a probability of one in the cube root of twenty, which is just over one in 2.7. However, this reasoning is faulty because we see patterns as opposed to the absence of patterns, so given the large number of other grammatical features one could pluck out of Celtic and Semitic languages, the ones that don’t fit might be ignored. The calculation then becomes extremely complicated, because one has to consider how to delineate specific grammatical features and how to count them, then work out what the chances are that two sets of languages share three grammatical features based on this and the number of possible options. For instance, with syntax the options, assuming a largely fixed word order, which doesn’t always happen, are SVO, SOV, OVS, VSO, VOS and OSV, so the chance of any particular one is one in six if they are all equally likely. However, other features are quite arbitrary.
There are languages out there with more than two dozen grammatical genders, for example. It’s possible to imagine a language whose every noun has a different gender.
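For anyone who wants to check my sums, here is a minimal sketch in Python of the arithmetic above. The names are my own invention, and it assumes independent events and six equally likely word orders, which is exactly the simplification I’ve just admitted is too crude for real languages.

```python
from fractions import Fraction
from itertools import permutations

# The conventional one-in-twenty significance threshold mentioned above.
THRESHOLD = Fraction(1, 20)


def chance_of_run(p_single, times):
    """Probability of an independent event with chance p_single happening `times` times in a row."""
    return p_single ** times


three_heads = chance_of_run(Fraction(1, 2), 3)  # three heads from a fair coin
two_sixes = chance_of_run(Fraction(1, 6), 2)    # two sixes from a fair die

print("Three heads:", three_heads, "- significant?", three_heads <= THRESHOLD)
print("Two sixes:", two_sixes, "- significant?", two_sixes <= THRESHOLD)

# If three equally likely events are to multiply up to one in twenty,
# each must be "one in the cube root of twenty".
one_in = 20 ** (1 / 3)
print("Each event must be about one in", round(one_in, 3))

# The six possible orderings of subject, verb and object.
word_orders = ["".join(order) for order in permutations("SVO")]
print(len(word_orders), "word orders:", ", ".join(word_orders))
```

The coin comes out at one in eight, which misses the threshold, and the dice at one in thirty-six, which passes it. None of this touches the real problem, which is that I chose which features to count after I’d already noticed them.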

Another pattern in today’s Celtic languages, one which definitely is meaningful, is that they and the Romance languages, or more specifically the Italic languages (Romance plus Latin and its closest contemporary relatives), are closer to one another than they are to other branches of the Indo-European language family. Some of these features are the result of parallel evolution. For instance, all six surviving Celtic languages have two grammatical genders consisting of feminine and masculine, and this is also true of all Western Romance languages (though not of Romanian, which still has neuter). Besides this, other Indo-European languages tend to use an ending like “-est” to express the superlative of adjectives, but Italic and Celtic tend to use something like “-issimum” – “best” versus “bellissimo” for example. There are a number of other similarities which may be preserved ancient features lost from the other languages, features acquired because they were neighbours or features acquired in their common ancestral language. These are, though, easy to account for because Italic and Celtic just are obviously related, were spoken near each other and so on. The idea of a parallel between Celtic and Semitic is much harder to explain, which is why it might not exist at all.

Recently, I discovered that my personal will o’ the wisp is not in fact just mine. Professional linguists have noticed this too, and there are even theories about how it might have happened and a number of other features in common. VSO and inflected prepositions are just two of several parallels. I should explain that in Gàidhlig and its relatives, prepositions vary according to who they refer to, so for example “agam” means “at me” and “agat” “at thee”. The origin of these is easy to account for, that the words have simply been run together over the millennia, but few other languages do this. Arabic and Hebrew, on the other hand, do. The languages also do things with these prepositions which other languages don’t. They express possession and obligation with them. “The hair on her” – “am falt oirre” is “her hair” and “I need/want/must have a knife” is “tha bhuam sgian” – “there is from me (a) knife”. That “(a)” indicates something else they have in common: they all have a word for “the” but none for “a”. It’s unusual for a language to have a way of expressing definiteness without indefiniteness. Interestingly, Anglo-Saxon and Old Norse, both spoken in these isles, also had a way to say “the” but not one to say “a(n)”, and this may be a clue as to how these apparent coincidences happened. Breton, however, does have an indefinite article. Likewise, all the languages repeat the pronoun at the end of a relative clause – “the chair which I sat on it” and not “the chair (which) I sat on”. There’s also the way the word for “and” is used, or rather, a word for “and”: “agus” in Gàidhlig (there’s another word, “is”) and “wa” in Arabic (“ve” in today’s Hebrew). In English, “and” is a simple coördinating conjunction like “or” and “but”, but in the other languages it can also be used as a subordinating one. It can also mean “when” or “as”. This is also unusual. 
“Agus”/“ve” can also be used to mean “but” or “although”, and in fact as I understand it, the Arabic “wa” is the only option to express “but”. Besides this, there’s what’s known as the construct state genitive in English descriptions of Hebrew grammar. Arabic doesn’t say “the man’s house” but “house the man”, and Gàidhlig does the same with “taigh an duine” – again “house the man”. This is in spite of the fact that the language in question has a genitive form for the noun in question. This makes approximately eight features found in Celtic and Semitic languages but only rarely in others.

And there’s more. The surviving Celtic languages are unusual among Indo-European languages in having these features, and are in general quite aberrant compared to the others. That said, there are other branches of the family with features which are unusual for it, such as Armenian, whose grammar is less like the rest of Indo-European and more like that of other families, in that it hangs successive suffixes off the ends of words, one per idea, as opposed to combining several ideas in each suffix (in English we have, for example, a final S for genitive (possessive) and plural and don’t need anything extra). Even so, were it not for the known history and the fact that so much Celtic vocabulary is clearly similar to that of other European languages, nobody would guess Celtic languages were Indo-European. In fact, the very features which they share with Semitic languages are the ones which make them unique in the Indo-European family.

Celtic and Semitic are also emphatically not related to each other, or at least are so distantly related that there are languages native to Kenya and Tanzania which are closer to Hebrew and Arabic, and a dead language spoken in present day China which is closer to Welsh (and in fact English), than the two groups are to each other. Semitic languages are part of a family now referred to as “Afro-Asiatic”, which also includes Tamazight, a Berber language, and Ancient Egyptian, spoken five thousand years ago and still nowhere near the speech of the Kurgans at the time, which is ancestral to Celtic, Germanic and the like. There are, however, a few theories about how the similarities came about.

One apparently anomalous circumstance which can be seen from the New Testament is that Paul wrote a letter to the Galatians. These lived in Anatolia, the Asian portion of present-day Turkey, and they spoke a Celtic language. This language was clearly in close proximity to the Semitic lingua franca of that region at the time, Aramaic, as well as various others such as Assyrian. It’s therefore been suggested that the whole of the Celtic branch was influenced by this local connection, all the way across to Ireland in the end. To me, this seems a little far-fetched, but it is true that there’s a concentration of a particular set of genes which marks the Irish, and incidentally myself, as possible wanderers from the Indo-European ancestral land who went as far as possible at the time. This may make the so-called Celts the ultimate invaders in a way and contradicts the common mystical, matriarchal and peaceful image some people seem to have of them. This migration also forms part of another theory, that farming, having been invented in the Fertile Crescent where Semitic languages were spoken, then spread culturally across Europe to these islands and took linguistic features with it. Either of these ideas being true could be expected to imply that all Celtic languages, not just the modern survivors here and in Brittany, had these features in common.

Significantly, the speakers of Celtic languages were probably the first Indo-European speakers to arrive in Great Britain and Ireland. Prior to that, clearly there were other people living here who had their own spoken but unwritten languages. It’s possible that traces of these may survive in place names. It used to be thought that the Picts spoke a non-IE language, possibly related to Basque, but this has now been refuted. The features Irish, Welsh and the rest have in common with Hebrew and Arabic are also apparently shared with Tamazight and other languages of the Maghreb, although to me that’s hearsay – I haven’t checked them out. Consequently, one rather outré theory is that before the Celts got here the folk of Albion and the Emerald Isle spoke a Semitic language, and Celtic was influenced by this when it arrived. However, there doesn’t seem to be much reason to suppose this to be so other than the connection.

Leaving those theories aside, I would bring up the issue of linguistic universals, and particularly implicational universals, which are rules of the form “if a language has this feature, it will also have that one”. Some features are common to all spoken languages. For example, every known spoken language has a vowel like /a/ as in “father” in it, every language which distinguishes questions tonally does so by changing the pitch of the voice towards the end of the sentence, and every language has at least some plural pronouns. There’s a particular set of implicational universals around SOV languages, such as being exclusively suffixing. These are consistent enough that it used to be thought that there was a so-called “Altaic” language family including Turkish and Mongolian, and some would even include Japanese and Korean in that. These languages have turned out not to be closely related, though they have sometimes grown more alike through contact, and yet they share many of these implicational universals, suggesting to me some kind of possible “standard” human spoken language with those grammatical features. I would tentatively suggest, and I may well be wrong, that the features Celtic and Semitic languages share are in fact similarly implicational universals. Both of them have an unusual syntax and this may lead them both down the same path.

But there’s an extra layer to this which intrigues me. There used to be a famous Hebrew teacher who introduced the subject as “Gentlemen, this is the language God spoke” (yes, this is extremely sexist but it was a long time ago), and similarly Arabic is considered a particularly sacred language almost designed by God to write the Qur’an. Hence the features mentioned are used in two very important sacred texts, and if I’m going to go all religious and mystical on you, just maybe the Celtic and Semitic languages have a special place in spiritual practices, and this is about that. But leaving that aside, it still seems to me that the most likely explanation for the things they have in common is simply that they are a particular “type” of language, just as Japanese and Turkish are, without needing to have any genetic relationship.

They’re also both really annoying!

The issue of overinterpretation will have to be held over until tomorrow, sorry.