in the
Paleo-Balkanic branch based on "The Indo-European Language Family" by Brian D. Joseph and Adam Hyllested (2022).
Pre-Indo-European linguistic substratum
Pre-Indo-European sites are found throughout the territory of Albania, such as at Maliq, Vashtëm, Burimas, Barç, Dërsnik in
Korçë, Kamnik in
Kolonjë, Kolsh in
Kukës, Rashtan in
Librazhd and Nezir in
Mat. As in other parts of Europe, migratory Indo-European tribes entered the Balkans and contributed to the formation of the historical Paleo-Balkan tribes, to which Albanians trace their origin. The previous populations, in the course of their assimilation by the immigrating IE tribes, played an important part in the formation of the various ethnic groups generated by their long symbiosis. Consequently, the IE languages that developed in the Balkan Peninsula, in addition to their natural evolution, were also influenced by the idioms of the assimilated pre-Indo-European peoples. In linguistic terms, the pre-Indo-European substrate language spoken in the southern Balkans probably influenced
pre-Proto-Albanian, the ancestor idiom of Albanian. The extent of this linguistic impact cannot be determined with precision due to the uncertain position of Albanian among
Paleo-Balkan languages and their scarce attestation. Some loanwords, however, have been proposed, such as and ; compare with pre-Greek . Albanian is also the only language in the Balkans which has retained elements of the
vigesimal numeral system – , – which was prevalent in the pre-Indo-European languages of Europe; such as the
Basque language, which broadly uses vigesimal numeration.
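The idea of vigesimal (base-20) counting mentioned above can be illustrated with a minimal sketch. It only shows the arithmetic of counting in twenties; the function name is hypothetical, and nothing here models actual Albanian or Basque numeral words.

```python
def to_vigesimal(n):
    """Split a non-negative integer into (number of twenties, remainder below twenty)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return divmod(n, 20)


# Express a few values as "k x 20 + r", the pattern a vigesimal system counts in.
for value in (20, 40, 57):
    twenties, rest = to_vigesimal(value)
    print(f"{value} = {twenties} x 20 + {rest}")
```

In a vigesimal system, a number such as forty is expressed as "two twenties" rather than "four tens".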
Attestation
The first attested mention of Albanian occurred in 1285 in the
Venetian city of
Ragusa (present-day
Dubrovnik,
Croatia), when a crime witness named Matthew testified: "I heard a voice crying in the mountains in Albanian". The earliest attested written specimens of Albanian are
Formula e pagëzimit (1462) and
Arnold Ritter von Harff's lexicon (1496). The first Albanian text written with
Greek letters is a fragment of the
Ungjilli i Pashkëve (Passover Gospel) from the 15th or 16th century. The first printed books in Albanian are
Meshari (1555) and
Luca Matranga's
E mbsuame e krështerë (1592). However, as Fortson notes, Albanian written works existed before this point; they have simply been lost. The existence of written Albanian is explicitly mentioned in a letter attested from 1332, and the first preserved books, including both those in Gheg and in Tosk, share orthographic features that indicate that some form of common literary language had developed.
Toponymy
In the Balkans and southern Italy, several toponyms, river names and mountain names attested since antiquity can be explained etymologically via Albanian, or evolved phonologically through Albanian and were later adopted in other languages. Inherited toponyms from a Proto-Albanian language and the date of adoption of non-Albanian toponyms indicate in
Albanology the regions where the Albanian language originated, evolved and expanded. Depending on which proposed etymology and phonological development linguists support, different etymologies are usually used to link Albanian to Illyrian, Messapic, Dardanian, Thracian or an unattested Paleo-Balkan language. •
Brindisi is a town in southern Italy.
Brundisium was originally a settlement of the
Iapygian Messapians, descendants of an Illyrian people who migrated from the Balkans to Italy during the Late Bronze Age/Early Iron Age transition. The name highlights the ties between Messapic and Albanian, as Messapic
brendo (stag) is linked to Old Gheg
bri (horns). The preservation of old Doric /u/ indicates that the modern name derives from populations to whom the toponym was known in its original Doric pronunciation. The initial stress in Albanian
Durrës presupposes an Illyrian accentuation on the first syllable. •
Lezhë is a city in Albania; in ancient times the area was inhabited by Illyrians. The town was known as Lissos in Ancient Greek and Lissus in Latin. The ancient name of the town developed into modern Lezhë (archaic: Lesh) through Albanian sound changes. When Proto-Albanian speakers settled there is a matter of debate: they might have moved into the area relatively late in antiquity, possibly as an eastern expansion of Proto-Albanian settlement, since no other toponyms known from antiquity in the area presuppose an Albanian development. The development of
Nish 'foal' after the loan from Latin
caballus into Albanian
kalë 'horse'. The Albanian name
Mazrek(u), which means '
horse breeder' in Albanian, is found throughout all Albanian regions, and notably it was the name used by the
Kastrioti noble family to highlight their tribal affiliation (Albanian:
farefisní). Also the Palaeo-Balkan word for '
mule' has been preserved in Albanian
mushk(ë) 'mule'.
Hydronyms
Concerning the inheritance of hydronymic vocabulary, it has been noted that there were no lexemes relating to
seamanship in the
Proto-Indo-European language. PIE hydronyms reconstructed so far refer to swamps, marshes, lakes, and riverine environments, but not to the sea. For instance, the Greek term
thalassa "sea" is
Pre-Greek, not an inherited Indo-European word. The derivation of the Albanian term for "sea" (det) from Proto-Albanian *deubeta, as a cognate of Proto-Germanic *deupiþō- "depth", was accepted by some Albanologists but is firmly dismissed by present-day historical linguists. Instead, a borrowing from Greek
delta "river delta" has been proposed recently. At least two other Albanian terms from the same semantic field are early Greek loanwords:
pellg "pond, basin, depth" from πέλαγος
pelagos "sea", and
zall "riverbank, river sand", from αἰγιαλός "sea-shore", which underwent a semantic shift in Proto-Albanian. In addition, all Albanian words relating to seamanship appear to be loans. Words referring to large streams and their banks tend to be loans, but
lumë ("river") is native, as is
rrymë (the flow of river water). Words for smaller streams and stagnant pools of water are more often native, except
pellg. Albanian has maintained since
Proto-Indo-European a specific term referring to a riverside forest (
gjazë), as well as its words for marshes. Albanian has maintained native terms for "whirlpool", "water pit" and (aquatic) "deep place", leading
Orel to speculate that Albanian was likely spoken in an area with an excess of dangerous whirlpools and depths. The term
mat, meaning "height", "beach", "bank/shore" in
Northern Albanian and "beach", "shore" in
Arbëresh, is inherited from Proto-Albanian
*mata <
*mn̥-ti "height" (cf. Latin
mŏns "mountain"), after which the river
Mat (and the
region with the same name) in north-central Albania was named, which can be explained as "mountain river". The meaning "bank/shore" hence would have emerged only at a later time (cf. German
Berg "mountain" in relation to Slavic
*bergъ "bank/shore").
Vegetation
Regarding
forests, words for most
conifers and
shrubs are native, as are the terms for "
alder", "
elm", "
oak", "
beech", and "
linden", while "
ash", "
chestnut", "
birch", "
maple", "
poplar", and "
willow" are loans.
Social organization
The original
kinship terminology of Indo-European was radically reshaped; changes included a shift in meaning from "mother" to "sister", and were so thorough that only three terms retained their original function: the words for "son-in-law", "mother-in-law" and "father-in-law". All the words for second-degree blood kinship, including "aunt", "uncle", "nephew", "niece", and terms for grandchildren, are ancient loans from Latin.
Linguistic contacts
Overall patterns in loaning
Openness to loans has been called a "characteristic feature" of Albanian. The original Albanian lexical items directly inherited from
Proto-Indo-European are far fewer in comparison to the loanwords, though loans are considered to be "perfectly integrated" and not distinguishable from native vocabulary on a synchronic level. Although Albanian is characterized by the absorption of many loans, even, in the case of Latin, reaching deep into the core vocabulary, certain semantic fields nevertheless remained more resistant. Terms pertaining to
social organization are often preserved, though not those pertaining to political organization, while those pertaining to trade are all loaned or innovated. While the words for plants and animals characteristic of mountainous regions are entirely original, the names for fish and for agricultural activities are often assumed to have been borrowed from other languages. However, considering the presence of some preserved old terms related to sea fauna, some have proposed that this vocabulary might have been lost over time after Proto-Albanian tribes were pushed back inland during invasions.
Wilkes holds that the Slavic loans in Albanian suggest that contacts between the two populations took place when Albanians dwelt in forests 600–900 metres above sea level.
Greek
Linguistic contact between Albanian and Greek has been securely dated to the Iron Age. Contacts between the respective post-PIE languages which gave rise to the two languages also occurred in earlier times. Common traces of the Mediterranean-Balkan substratum are considered to date to the common Indo-European phase of Albanian and Greek (cf.
Graeco-Albanian). Innovative creations of
agricultural terms shared only between Albanian and Greek, such as
*h₂(e)lbʰ-it- 'barley' and
*spor-eh₂- 'seed', were formed from non-agricultural Proto-Indo-European roots through semantic changes to adapt them for agriculture. Since they are limited only to Albanian and Greek, they could be traced back with certainty only to their last common Indo-European ancestor, and not projected back into
Proto-Indo-European. Shortly after they had diverged from one another, Albanian, Greek and Armenian also underwent a longer period of contact (as can be seen, for example, in the irregular correspondence: Greek σκόρ(ο)δον, Armenian
sxtor,
xstor, and Albanian
hudhër,
hurdhë "garlic"). Furthermore, intense Greek–Albanian contacts have certainly occurred thereafter, with ongoing connections between them in the Balkans from ancient times up to the present day. Ancient Greek loans in
Proto-Albanian originated from two distinct geographical and historical groups: first, borrowings from the Greek colonies on the Adriatic coast from the 7th century BCE, either directly or indirectly through trade communication in the hinterland; second, direct borrowings from Greek-speaking populations of
ancient Macedonia during the 5th–4th centuries BCE, before the replacement of
Ancient Macedonian with
Koine Greek. Several Proto-Albanian terms have been preserved in the lexicon of
Hesychius of Alexandria and other ancient glossaries. Some of the Proto-Albanian glosses in Hesychius are considered to have been loaned into Doric Greek as early as the 7th century BCE. Witczak (2016) specifically points to seven words recorded by the Greek grammarian
Hesychius of Alexandria (5th century AD), and particularly to the term 'a kind of earring', which was first attested in the work of the
choral lyric poet Alcman (
fl. 7th century BCE). This means that the ancestors of the Albanians were in contact with the northwestern part of Ancient Greek civilization and probably borrowed words from Greek cities (
Dyrrachium,
Apollonia, etc.) in the Illyrian territory, colonies which belonged to the Doric division of Greek, or from contacts in the
Epirus area. The earliest Greek loans began to enter Albanian circa 600 BC, and are of Doric provenance, tending to refer to vegetables, fruits, spices, animals and tools. This stratum reflects contacts between Greeks and Proto-Albanians from the 8th century BC onward, with the Greeks being either colonists on the Adriatic coast or Greek merchants inland in the Balkans. The second wave of Greek loans began after the split of the Roman Empire in 395 and continued throughout the Byzantine, Ottoman and modern periods. According to Hermann Ölberg, the modern Albanian lexicon may include 33 words of ancient Greek origin, although the number may rise if the Albanian lexicon is properly evaluated. Some scholars argue that Albanian was spoken further north than present-day Albania in antiquity, citing the number of loanwords from
Ancient Greek, mostly from the Doric dialect, which they consider relatively small, even though Southern
Illyria neighbored the
Classical Greek civilization and there were a number of
Greek colonies along the Illyrian coastline. For instance, according to Bulgarian linguist
Vladimir I. Georgiev there is limited Greek influence in Albanian (See
Jireček Line of Roman times), and if Albanians had been inhabiting a homeland situated in modern Albania continuously since ancient times, the number of Greek loanwords in Albanian should be higher. However, the number of surviving loanwords is not a valid argument, as many Greek loans were likely lost through replacement by later Latin and Slavic loans, just as notoriously happened to most native Albanian vocabulary. On the other hand, the Northwestern/Doric affiliations and ancient dating of the Greek loans imply an Albanian presence in the Western Balkans, to the north and west of the Greeks, in antiquity, though Huld cautions that the classical "precursors" of the Albanians would be "'Illyrians' to classical writers", and that the Illyrian label is hardly "enlightening" since classical ethnology was imprecise. Evidence of a significant level of early
linguistic contact between Albanian and Greek is provided by ancient common structural innovations and phonologic convergence such as:
• the rise of the close front rounded vowel /y/ (documented in Attic and Koine Greek);
• the rise of dental fricatives;
• the voicing of voiceless plosives after nasal consonants;
• the replacement, with a form that featured a prefix, of the inherited present tense 3rd person singular of the verb "be" (documented in Koine Greek).
Those innovations are limited only to the Albanian and Greek languages and are not shared with other languages of the Balkan sprachbund. Since they precede the Balkan sprachbund era, those innovations date to a prehistoric phase of the Albanian language, spoken at that time in the same area as Greek, within a social setting of bilingualism in which early Albanians had to be able to speak some form of Greek.
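One of the shared innovations listed above, the voicing of voiceless plosives after nasal consonants, can be written as a simple rewrite rule. The sketch below is a toy illustration only: it works on plain Latin-letter strings, the function name is hypothetical, and the sample inputs are simplified transliterations chosen for illustration rather than attested Albanian or Greek reconstructions.

```python
# Voiceless plosives and their voiced counterparts.
VOICED = {"p": "b", "t": "d", "k": "g"}
NASALS = {"m", "n"}


def voice_after_nasal(word):
    """Voice p/t/k when the immediately preceding segment is a nasal (m or n)."""
    out = []
    for ch in word:
        if out and out[-1] in NASALS and ch in VOICED:
            out.append(VOICED[ch])
        else:
            out.append(ch)
    return "".join(out)


# Illustrative inputs (simplified transliterations, an assumption of this sketch).
for sample in ("pente", "kampos", "tata"):
    print(sample, "->", voice_after_nasal(sample))
```

Only plosives directly following a nasal are affected; word-initial and intervocalic plosives are left unchanged.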
Latin and early Romance loans
Latin loans are dated to the period between 167 BC and 400 AD. The year 167 BC coincides with the fall of the kingdom ruled by
Gentius, and reflects the early date of the entry of Latin-based vocabulary into Albanian, when the coastal areas of the Western Balkans were Romanized. This vocabulary entered Albanian in the Early Proto-Albanian stage and evolved in later stages as part of the Proto-Albanian vocabulary and within its phonological system. Albanian is one of the oldest languages that came into contact with Latin and adopted Latin vocabulary. It has preserved 270 Latin-based words which are found in all Romance languages, 85 words which are not found in Romance languages, 151 which are found in Albanian but not in
Eastern Romance and its descendant Romanian, and 39 words which are found only in Albanian and Romanian. The contact zone between Albanian and Romanian was likely located in eastern and southeastern Serbia. The preservation of Proto-Albanian vocabulary and linguistic features in Romanian suggests that Balkan Latin emerged at least partly as Albanian speakers shifted to Latin. The other layer of linguistic contacts of Albanian with Latin involves Old Dalmatian, a western Balkan derivative of Balkan Latin. Albanian maintained links with both coastal western and central inland Balkan Latin formations. Hamp indicates there are words that follow Dalmatian phonetic rules in Albanian, giving as an example the word
drejt 'straight' <
d(i)rectus matching developments in Old Dalmatian
traita <
tract. Romanian scholars Vatasescu and Mihaescu, using lexical analysis of Albanian, have concluded that Albanian was also heavily influenced by an extinct Romance language that was distinct from both Romanian and
Dalmatian. Because the Latin words common to only Romanian and Albanian are significantly fewer than those that are common to only Albanian and
Western Romance, Mihaescu argues that Albanian evolved in a region with much greater contact with Western Romance regions than with Romanian-speaking regions, and locates this region in present-day
Albania,
Kosovo and Western
North Macedonia, spanning east to
Bitola and
Pristina. The Christian religious vocabulary of Albanian is mostly Latin as well, including even basic terms such as "to bless", "altar" and "to receive communion". This indicates that Albanians were Christianized under the Latin-based liturgy and ecclesiastical order which would be known as "Roman Catholic" in later centuries.
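The word counts given earlier in this section can be tallied to make the comparison behind Mihaescu's argument concrete. This is a minimal sketch: the dictionary labels are paraphrases, and treating the four groups as non-overlapping is an assumption of the sketch, not something the figures themselves guarantee.

```python
# Counts of preserved Latin-based words in Albanian, as cited in this section.
# ASSUMPTION: the four groups are treated as disjoint for the purpose of the total.
latin_layers = {
    "found in all Romance languages": 270,
    "not found in Romance languages": 85,
    "not found in Eastern Romance/Romanian": 151,
    "found only in Albanian and Romanian": 39,
}

total = sum(latin_layers.values())

# How many times larger the non-Eastern-Romance group is than the
# Albanian-Romanian-only group.
ratio = (
    latin_layers["not found in Eastern Romance/Romanian"]
    / latin_layers["found only in Albanian and Romanian"]
)

for label, count in latin_layers.items():
    print(f"{label}: {count} ({count / total:.1%})")
print(f"total: {total}, ratio: {ratio:.2f}")
```

Under that assumption the groups sum to 545 words, and the group absent from Eastern Romance is nearly four times the size of the group shared only with Romanian.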
Slavic
Contacts between Albanian and Slavic began after the
Slavic migrations to the Balkans in the 6th and 7th centuries. The modern Albanian lexicon contains around 250 Slavic borrowings that are shared among all the dialects. Slavic settlement probably shaped the present geographic spread of the Albanians. It is likely that Albanians took refuge in the mountainous areas of northern and central
Albania, eastern
Montenegro, western
North Macedonia, and
Kosovo. Long-standing contact between Slavs and Albanians might have been common in mountain passages and agriculture or fishing areas, in particular in the valleys of the
White and
Black branches of the
Drin and around the
Shkodër and
Ohrid lakes. Contact between the two groups in these areas has caused many changes in the local Slavic and Albanian dialects.
Historical linguist Eric P. Hamp, analyzing the influence of substrates on the Old
Serbo-Croatian language, has concluded that the
toponymic and
Romanian evidence indicates that the South Slavs who became Serbo-Croatian speakers settled in a zone of former
Albanoid speech, which reasonably explains why the resultant population was well-predisposed to preserve the richest system of
lateral consonant distinctions and alternations among the later Slavic-speaking peoples. The evolution of the ancient toponym
Lychnidus into
Oh(ë)r(id) (
city and
lake), which is attested in this form from 879 CE, required an early long-standing period of Tosk Albanian–East South Slavic bilingualism, or at least contact, resulting from the Tosk Albanian
rhotacism -n- into
-r- and Eastern South Slavic
l-vocalization ly- into
o-. As Albanian and Slavic have been in contact since the early Middle Ages, toponymical loanwords in both belong to different chronological strata and reveal different periods of acquisition. Old Slavic loanwords into Albanian develop early Slavic *s as sh and *y as u within Albanian phonology of that era.
Norbert Jokl defined this older period from the earliest Albanian-Slavic contacts to 1000 AD at the latest, while contemporary linguists like
Vladimir Orel define it as between the 6th and the 8th century AD. Newer loanwords preserve Slavic /s/ and other features which no longer show phonological development within Albanian. Such toponyms from the earlier period of contact in Albania include
Bushtricë (
Kukës),
Dishnica (
Përmet),
Dragoshtunjë (
Elbasan),
Leshnjë (
Leshnjë,
Berat and other areas),
Shelcan (Elbasan),
Shishtavec (Kukës/Gora),
Shuec (
Devoll) and
Shtëpëz (
Gjirokastër),
Shopël (
Iballë),
Veleshnjë (
Skrapar) and others. Similar toponyms in a later period produced different results e.g.
Bistricë (
Sarandë) instead of
Bushtricë or
Selcan (
Këlcyrë) instead of
Shelcan. Some of the toponyms of Slavic origin were acquired in Albanian before undergoing the changes of
Slavic liquid metathesis (before ca. the end of the 8th century). They include
Ardenicë (Lushnjë), Berzanë (Lezhë), Gërdec and Berzi (Tiranë) and a cluster of toponyms along the route Berat-Tepelenë-Përmet.
Labëri, from the
Albanian endonym, resulted through the Slavic liquid metathesis, and was reborrowed in that form into Albanian.
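The two chronological strata of Slavic loans described above can be sketched as two small substitution rules. This is a toy model under stated assumptions: the input spellings "bystrica" and "selcan" are simplified stand-ins for the Slavic source forms, the function names are hypothetical, the late-stratum rendering of *y as i is inferred from the pair Bushtricë/Bistricë, and the Albanian final -ë is ignored.

```python
def adapt_early(slavic_form):
    """Older stratum: Slavic *y is rendered as u and *s as sh."""
    # Replace y first so that the inserted 'sh' is not touched afterwards.
    return slavic_form.replace("y", "u").replace("s", "sh")


def adapt_late(slavic_form):
    """Newer stratum: Slavic s is preserved; *y is rendered as i (assumption)."""
    return slavic_form.replace("y", "i")


# Simplified input spellings (assumed for illustration only).
for source in ("bystrica", "selcan"):
    print(source, "-> early:", adapt_early(source), "| late:", adapt_late(source))
```

The same source form thus yields an outcome resembling Bushtricë or Shelcan in the older stratum and one resembling Bistricë or Selcan in the newer one, which is why such doublets help date the period of borrowing.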
Unidentified Romance language hypothesis
It has been concluded that the partial Latinization of Roman-era Albania was heavy in coastal areas, in the plains, and along the
Via Egnatia, which passed through Albania. Madgearu notes that in these regions the survival of Illyrian names and the depiction of people in Illyrian dress on gravestones is not enough to prove successful resistance against Romanization, and that there were many Latin inscriptions and Roman settlements there. Madgearu concludes that only the northern mountain regions escaped Romanization. He also concludes that, in some areas, a Latinate population that survived until at least the seventh century passed on local place names with mixed characteristics of Eastern and Western Romance into Albanian.
==Archaeology==