Nowadays there are two divisions of Romance languages. In the West, there’s French, Italian, Castilian (Spanish), Portuguese and a number of less widely spoken ones, some of which are considered dialects, including Occitan, Catalan, Sardinian and the various minority languages of Switzerland. In the East, there’s Romanian and a few other similar languages such as Aromanian, Istro-Romanian and Moldovan, which seems to be Romanian written in the Cyrillic alphabet, as Russian is. All of these languages are descended from Vulgar Latin. There are also some extinct Romance languages, including a likely British one about which practically nothing is known, and African Romance, which was spoken along the southern coast of the Mediterranean and seems to have been somewhat similar to Spanish. Crucially for the purposes of the division between East and West, there was also Dalmatian, a language spoken along the coast between Italy and Albania, which was important enough to be the national language of the Republic of Ragusa. On 10th June 1898, Tuone Udaina, its last speaker, was killed in a roadworks explosion. It was similar in some ways to Romanian but it was not a Balkan language. I will explain.
As well as there being language families, distantly related or practically unrelated languages acquire common characteristics when they have contact with each other. This is known as a Sprachbund. I suspect, for example, that the languages of these isles form something of a Sprachbund in, for example, the circumlocutory way they use the present tense, but maybe I’m wrong. Regardless of the truth of this, the Balkan Sprachbund is one of the best known and most clearly defined. The core Balkan languages are Romanian, Bulgarian and Albanian, and outside those there are also Macedonian, Turkish, Greek and Balkan Romani. Of these, Bulgarian is the most Balkan of them all and Balkan Romani the least. Dalmatian is distinctive in being an Eastern Romance language like Romanian but not a Balkan one, so it does share features with Romanian but not those which make it like the latter’s neighbours.
Balkan languages most typically have the following features:
- A definite article expressed by a suffix.
- No infinitive of the verb or a tendency to avoid it.
- A two-case system comprising nominative/accusative and genitive/dative.
- Evidentiality.
- A future tense expressed by a verb of volition (English also does this – “will do”).
- Shared vocabulary.
- A perfect tense expressed by the verb for “have” (English does this as well).
- Subjunctive used to express a polite command.
- Numerals between eleven and nineteen inclusive are expressed as “one on ten”, “two on ten” and so forth.
- Shared calques (literal translations of compound words).
- A central vowel, sometimes schwa (English has this too).
- Raising of O to U in unstressed syllables.
- Changing L to R.
- Absence of /w/.
- Loss of L before I.
Even Greek has acquired some of these. Turkish, which is not an Indo-European language, has influenced the others more than the other way round, notably with evidentiality. Languages related to these usually don’t have these features, so for example Polish and Italian haven’t got them on the whole.
I want to focus on Romanian because it’s a Romance language. Romanian is the outlier in the Romance family. It’s closest to French and furthest from Portuguese. It’s also been culturally influenced by French. Ironically, Romanian is also the most conservative national Romance language of all. Sardinian may be more conservative but isn’t official for the nation of Italy. The spoken rhythm and sound of Romanian are also somewhat like those of French and English, and another Balkan language, Albanian, also has a remarkably English-sounding rhythm to my ear, but I may be hearing what I want to hear with this. The same definitely does not apply to Greek or Turkish. Greek is more like Italian in that respect, although it’s been a separate language from Latin and Italic languages generally for at least as long as the ancestor of English has.
Speaking of close relatives to English, the Gothic language has also been spoken in the Balkans and nearby, and there are also some loanwords from Upper German into Romanian such as “pom”, which means “tree”, more easily recognisable as “Baum” in German, cognate with “beam” in English. It’s also a faux ami for the French for “apple” of course. Nonetheless, other than Latin itself, the biggest sources of vocabulary in Romanian are the Slavic languages. It tends to use Slavic words even where every other member of the family uses a Latin one, and even when it’s a common word.
The reason I’m thinking about Romanian a lot at the moment is that a week or so ago, when I wrote Caveat Procrastinator, I found I got a lot of hits from Romania. It doesn’t seem to mean anything in Romanian but it is of course a Latin phrase, so I’m not sure why this was. In any case, it reminded me of something I’ve long wondered about. What would English be like if it were part of the Balkan Sprachbund?
English already has several features in common with Balkan languages, notably the wide distribution of schwa, future tense expressed by a verb of volition and a perfect tense expressed by “have”. It has also borrowed a large number of words from Norman French. I seem to recall that something like half the vocabulary of the language is French or Italic in some way, often Latin. It also sounds like a core Balkan language. Looking beyond English, some of its closest relatives, the Scandinavian languages, have a suffixed definite article. Swedish, Faroese, Icelandic, Danish and the two Norwegian standards all use postpositive definite articles, which presumably means Norn did as well, although I can’t recall that as a fact. If so, a language spoken in Great Britain in the Middle Ages, namely in Caithness and Sutherland, probably had a postpositive definite article, and of course Danish had one when it was spoken in the Danelaw.
I’m going to sketch a conlang (constructed language) based on the idea that English is a Balkan language. The most obvious thing to do is to replace all words of French origin with Romanian words, but in doing so it should be noted that the main period of borrowing was from the eleventh to the fourteenth century, and at that time Romanian was not a written language, and that the loanwords were from Norman French rather than the direct ancestors of Metropolitan French, which is the reason we say “camp”, “war” and “warrant” rather than “champ”, “gwer” and “guarant”. There may not be a simple way to replicate this with Romanian. Another source of vocabulary is the common Balkan lexis found in the region. Hence there are such words as “cuty” – “box”; “crommon” – “onion”; “mess” – “table” (that might be a real one). These are Anglicised of course.
Although we already have “will” for the future, we also have “shall”, depending on the person. This version of English would only have the former. It would also have “have” for the perfect tense, since it has it already.
The two-case system is something to contemplate. In English, our pronouns have three cases, nominative, objective and genitive. In Anglo-Saxon times, along with nouns, they had four or five: nominative, accusative, genitive, dative and instrumental. There are still traces of the instrumental today: “why” is the instrumental case of “what”, and the construction, “the more the merrier” also uses the instrumental case. The problem with English having two inflected cases is that it isn’t clear whether the genitive or the dative would win out, and since the pronoun “why” is instrumental, which otherwise merged with the dative, it’s possible we would’ve ended up saying “why” for “whose”, which is a bit too weird. As it is, we retain the dative form but don’t use it just as a dative, so we have “him”, “whom”, “seldom”, “whilom”, “me”, “thee” and “you”, but with the exception of “seldom” and “whilom” we can use any of these for either the recipient or the thing or person done to. Even so, we do have situations where the genitive and dative have merged, as with “her”, and the question then arises of which form would’ve won out. There’s probably a clue in the history of the real Balkan languages here.
Regarding the infinitive, there are signs of English speakers trying to avoid using it too. When I was a child, I didn’t realise that “do” in “try and do it” was infinitive and that therefore the wording was grammatically incorrect – it should be “try to do it”. There’s also a tendency for English speakers to use the present participle form where other European languages use the infinitive. Unfortunately, although I know English does this I have no idea when it happens. All I know is that German speakers sometimes use “to X” when native English speakers would say “X-ing”. Hence there are a couple of possibilities there. I’m aware that Romanian uses infinitive forms sometimes to create nouns from verbs where there is no commonly used infinitive form used as an actual verb. Therefore I can absolutely see English eschewing the infinitive. It’s a very English thing to do in a way.
All natural human languages are said to be able to express anything other human languages can. This is, incidentally, not true of formal languages. The axiom schema of replacement of Zermelo-Fraenkel set theory, for instance, can be expressed clearly using formal logic but seems to be inexpressible in English, and in fact it’s too fiddly to type here so I’m just going to copy-paste an image of it:
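For reference, the schema is usually written along the following lines. This is the standard textbook form, so the notation may differ from the image, and the reading underneath is my own gloss rather than anything official.

```latex
\forall w_1, \ldots, w_n \; \forall A \;
\Bigl(
  \bigl[ \forall x \in A \; \exists! y \; \varphi(x, y, w_1, \ldots, w_n, A) \bigr]
  \;\Longrightarrow\;
  \exists B \; \forall y \;
  \bigl[ y \in B \Leftrightarrow \exists x \in A \; \varphi(x, y, w_1, \ldots, w_n, A) \bigr]
\Bigr)
```

Here A is any set, φ is any formula in which B does not occur free, and the w's are extra parameters. Roughly: if φ pairs every member x of A with exactly one y, then there is a set B consisting of exactly those y's.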

This is fairly easy to make sense of provided you know what the variables mean, but to express it in English is a bit of an undertaking. Likewise, if the Pirahã language isn’t a hoax, rather a lot of what can be expressed easily in other natural human languages can’t be expressed in it, which is one reason I’m dubious. Nevertheless there are enrichments and deficiencies in various languages such as the fact that we have separate words for “do” and “make” in English but only one word for “know”, and one enrichment only rarely if ever found in Indo-European languages outside the Balkans is evidentiality. We are able to say “so I hear”, “so I’m told”, “apparently” and “allegèdly” in English, but it’s rather clumsy and vague and involves words which are kind of ad hoc. It’s like saying “In the past I live in Canterbury” rather than having the form “lived”. Evidentiality is considered so useful by some that Suzette Haden Elgin built it into the grammar of her women’s conlang Láadan. Turkish is thought to be the source of Balkan evidentiality as Turkic languages generally have it and Indo-European ones generally lack it. The Balkan languages which have it use a “renarrative mood”. I should point out that a grammatical mood is like the imperative – telling someone what to do, the optative – “would that it were”, the indicative – “the cat sat on the mat” and the subjunctive – “it was suggested that she be dropped from the team”. The renarrative mood expresses the reporting of an event not witnessed by the speaker, so it might be expected that news items in the Balkans use the renarrative mood, but in fact they use the indicative, which might be seen as propaganda. I could certainly do with the renarrative because I often feel I’m making statements which are only apparently true. I mean, they might be true but I’m not certain. I’m just told they are. In fact, even if there’s no other point to this exercise, English having a renarrative mood would be incredibly useful.
What form it could take is another issue. I can’t really see people peppering their speech with “allegèdly” all over the place. It needs to be either a short auxiliary or a direct inflection of the verb itself.
Another notable use of a non-indicative mood in the Balkans is the way they use the subjunctive to express a mild command. You see this kind of thing in Punjabi, where the polite version of a command is in the indicative future tense, which actually sounds like it might be rude – “you WILL do this!” – but on reflection merely means to imply that the speaker has such faith in the willingness of the addressee to do something that it’s practically a prediction: “I’m sure that this will happen (because you’re so great that you’d never think not to do it).” The subjunctive for the imperative is somewhat similar. It’s like “(it would be really nice) were you to do this” I think, and that kind of polite circumlocution is again very English, and I’m sure we can get on board with that too.
There is already something rather like the numeral system in two of the English forms: “eleven” and “twelve”. “Eleven” is connected to the words “one left” and “twelve” to “two left”, that is, they are the extra numbers left over after people who count one per digit have used up both hands. However, Romanian and the other Balkan languages differ from English here, and Romanian differs from the other Romance languages. The Romanian word for “eleven” is “unsprezece”, that is, “unum super decem” or “one above ten”, run into a single word. This continues all the way through the teens to nineteen inclusive. For English this would be something like “one on ten”, “two on ten” and so on, perhaps run into a single word again and preserving archaic forms as often happens when this takes place, so maybe “anonten”, “twennenten”, “thirenten” and so forth.
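The “one on ten” pattern is regular enough to write down as a rule. What follows is only a minimal sketch in Python: the function name `balkan_teen` is made up for illustration, and it produces the plain spaced-out forms rather than guessing at fused archaic ones like “anonten”.

```python
# Toy sketch of the "unit on ten" pattern for 11-19 in a Balkanised English.
# The function name and the exact output shape are assumptions for illustration.

UNITS = ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]


def balkan_teen(n: int) -> str:
    """Return the 'unit on ten' form for a number from 11 to 19 inclusive."""
    if not 11 <= n <= 19:
        raise ValueError("only the numbers 11 to 19 follow this pattern")
    # 11 -> index 0 ("one"), 19 -> index 8 ("nine")
    return f"{UNITS[n - 11]} on ten"


if __name__ == "__main__":
    for number in range(11, 20):
        print(number, balkan_teen(number))
```

Running it lists “one on ten” for 11 through to “nine on ten” for 19. How those would wear down into single words is a separate question which a rule this simple can’t answer.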
Then there are the calques – literal translations of common expressions sometimes preserving the structure of the words. For instance, fruit doesn’t “ripen” in the Balkans but becomes “baked”, or alternatively bread “ripens” depending on which way round you want to look at it. “Whether I want to or not” and similar forms are expressed as “want – not want”. I’m afraid I’m getting these straight out of Wikipedia, sorry.
English, Scots, Yola and Dutch are unusual among the Germanic languages for not changing W and WH to /v/ or “v” and “hv”. In the Balkan languages, /w/ has become V. Thus what we think of as a distinctive feature of German-accented English could happen in Balkan English too. It would wipe out the W/WH distinction but nowadays that seems to be gone anyway. There’s also a tendency to change L’s to R’s, which of course occurs in the stereotypical Japanese accent, but in Balkan languages has not extended so far as to eliminate L completely. Finally, there’s the transformation of O to U in unstressed syllables. I’m not sure how much this would happen because we already tend to use schwa so much here.
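To get a feel for the W to V change, here is a very rough Python sketch. It works on spelling rather than sounds, which is a big simplifying assumption: it will wrongly alter silent or vocalic W as in “write” or “law”, and it makes no attempt at the L to R or O to U changes, which depend on context and stress. The function name `balkanise_w` is made up.

```python
import re

# Toy, spelling-level approximation of the /w/ -> /v/ change.
# Assumption: every written "wh" or "w" stands for /w/, which is not really true.


def balkanise_w(text: str) -> str:
    """Replace written 'wh' and 'w' with 'v', keeping an initial capital."""

    def swap(match):
        # Preserve capitalisation of the first letter of the matched spelling.
        return "V" if match.group(0)[0].isupper() else "v"

    # "wh" is listed first so the digraph is replaced as a unit.
    return re.sub(r"wh|w", swap, text, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(balkanise_w("What we want is white wine"))
```

Because “wh” and “w” both come out as “v”, the sketch also shows the W/WH distinction collapsing.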
There is in fact already a Germanic language currently spoken in the Balkans which has been there for a long time: Transylvanian Saxon. From the twelfth Christian century on, Germans from Saxony settled in Hungarian Transylvania to defend against the Tatars. Although they’re called Saxons, they are actually Franconian, i.e. their language was originally closer to Dutch, that is, even closer to English than High German is. Most of them have now returned to Germany but several thousand stayed. The language is quite similar to Letzeburgesch, a language spoken in Luxembourg generally understood to be German. I find Transylvanian Saxon to be pretty straightforward to understand and perceive it to be German with some slightly weird vowels and Low German tendencies which make it slightly closer to English than High German on occasion. I can’t speak it of course. It has an alveolar rolled R, and that immediately makes it more appealing. However, so far as I can tell, it isn’t influenced by Balkan Sprachbund characteristics at all, which is disappointing.
It’s quite a major task to make a conlang, and most people would consider it a waste of time. Nonetheless, this is at least the groundwork for such a task, and it isn’t entirely pointless. I can easily imagine a peculiar English spoken with a Transylvanian accent which enables one to tell whether the speaker has just been told something or witnessed it themselves, sticks a syllable meaning “the” on the ends of its nouns and so on, and I find that quite appealing. I want this to be done and I doubt anyone else will do it, but there are better ways of spending one’s time.
