Arabic Morphological Lexicon (Aramolex)
Arabic is a non-concatenative language. It can also be described as a derivational language: its morphotactics rely on affixation, that is, on adding morphemes to the word without changing the radicals of the "root" and while preserving their order within the verb pattern (binyanim). This results in the highly regular inflectional and derivational patterns that distinguish the Arabic language.
This regularity also means that the highly prolific vocabulary characteristic of Arabic morphology can be reproduced by any suitable generation tool carefully designed around a root-based algorithm. For example, a root such as [ksr] "to break" may be seeded into the module to yield roughly 30,000 conjugated forms: the complete listing of the verb paradigm plus all inflectional morphemes, applied using specific orthographic and heuristic rules. In theory, the same holds for any other sound triconsonantal root, with minor exceptions.
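The root-based idea described above can be illustrated with a minimal sketch. This is not the Aramolex generator; the template table, the function names `apply_template` and `attach_affixes`, and the digit notation for radical slots are assumptions made for illustration. The transliteration follows the KATS-style strings visible in the sample table (`A` for alif, `&` for sukun), and the sketch ignores the orthographic and heuristic rules a real generator would need.

```python
# Hypothetical templates: digits 1-3 mark the slots for the three radicals.
TEMPLATES = {
    "perfect, form I": "1a2a3a",
    "active participle": "1aA2i3",
    "passive participle": "ma1&2uw3",
}


def apply_template(root: str, template: str) -> str:
    """Fill the radical slots of a template with the consonants of a root."""
    if len(root) != 3:
        raise ValueError("this sketch handles triconsonantal roots only")
    return "".join(root[int(ch) - 1] if ch in "123" else ch for ch in template)


def attach_affixes(stem: str, prefix: str = "", suffix: str = "") -> str:
    """Concatenate a prefix and a suffix onto a stem, with no sandhi rules."""
    return prefix + stem + suffix


if __name__ == "__main__":
    for name, template in TEMPLATES.items():
        print(name, "->", apply_template("ksr", template))
    # Stem plus a suffix taken from the Affixes column of the sample table.
    print(attach_affixes(apply_template("ksr", "1aA2i3"), suffix="ihimaA"))
```

Seeding the sketch with `ksr` gives `kaAsir` and `mak&suwr`, the transliterated lemmas behind the noun rows of the sample table. A full generator would iterate such templates over every person, number, gender, mood, and voice combination to approach the paradigm size mentioned above.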
Current inflection/derivation tools are limited to the third person masculine singular past tense form, which serves as the "dictionary form" used to identify a verb, in place of the infinitive used in English dictionaries, for example. This basic form is of only minor practical use in NLP operations such as tokenization, stemming, or POS tagging. It is therefore important to generate the "real world" derivational candidates found in ordinary literary texts by building an exhaustive morphological lexicon.
Field | Description |
---|---|
ID | Unique ID, character+8 digits ID number (C########) |
Category | main category part of speech |
Subcategory | extended POS subcategory for sorting and statistical purposes |
POS | part of speech code, KTagset is used for tagging purposes |
Root | Arabic root, concatenated radical |
Lemma | Arabic headword (canonical form) |
Vocalized | vocalized Arabic, word surface form fully diacritized |
KATS | KATS version of "Vocalized" field for software coding compatibility purposes |
Arguments | Person/Number/Gender combination marking the subject and/or object |
Affixes | prefixes and suffixes, pairs in brackets |
Gloss | POS glossary, English description of POS field |

Please download larger samples in TXT format from Aramolex sample.

ID | Category | POS | SubPOS | Root | Lemma | Vocalized | KATS | Arguments | Affixes | Gloss |
---|---|---|---|---|---|---|---|---|---|---|
N001697 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرِهِمَا | kaAsirihimaA | 3DM•SM | [-,ihimaA] | APP-GEN |
N001698 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرِهِمَا | kaAsirihimaA | 3DF•SM | [-,ihimaA] | APP-GEN |
N001699 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرَتِهِمَا | kaAsiratihimaA | 3DM•SF | [-,atihimaA] | APP-GEN |
N001700 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرَتِهِمَا | kaAsiratihimaA | 3DF•SF | [-,atihimaA] | APP-GEN |
N001701 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرِهِم | kaAsirihim | 3PM•SM | [-,ihim] | APP-GEN |
N001702 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرِهِن | kaAsirihin | 3PF•SM | [-,ihin] | APP-GEN |
N001703 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرَتِهِم | kaAsiratihim | 3PM•SF | [-,atihim] | APP-GEN |
N001704 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرَتِهِن | kaAsiratihin | 3PF•SF | [-,atihin] | APP-GEN |
N001705 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرَيْهِ | kaAsiray&hi | 3SM•DM | [-,ay&hi] | APP-GEN |
N001706 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرَيْهَا | kaAsiray&haA | 3SF•DM | [-,ay&haA] | APP-GEN |
N001707 | N | NCGI | •1 | كسر | كَاسِر | كَاسِرَتَيْهِ | kaAsiratay&hi | 3SM•DF | [-,atay&hi] | APP-GEN |
N001957 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَاي | mak&suwraAy | 1SM•DM | [-,Ay] | PPP-NOM |
N001958 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَاي | mak&suwraAy | 1SF•DM | [-,aAy] | PPP-NOM |
N001959 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَتَاي | mak&suwrataAy | 1SM•DF | [-,ataAy] | PPP-NOM |
N001960 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَتَاي | mak&suwrataAy | 1SF•DF | [-,ataAy] | PPP-NOM |
N001961 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَانَا | mak&suwraAnaA | 1DM•DM | [-,AnaA] | PPP-NOM |
N001962 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورتَانَا | mak&suwrtaAnaA | 1DF•DM | [-,taAnaA] | PPP-NOM |
N001963 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَتَانَا | mak&suwrataAnaA | 1DM•DF | [-,ataAnaA] | PPP-NOM |
N001964 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَتَانَا | mak&suwrataAnaA | 1DF•DF | [-,ataAnaA] | PPP-NOM |
N001965 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَانَا | mak&suwraAnaA | 1PM•DM | [-,AnaA] | PPP-NOM |
N001966 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَانَا | mak&suwraAnaA | 1PF•DM | [-,AnaA] | PPP-NOM |
N001967 | N | NPNI | •1 | كسر | مَكْسُور | مَكْسُورَتَانَا | mak&suwrataAnaA | 1PM•DF | [-,ataAnaA] | PPP-NOM |
V007206 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرَهُ | tunaS~irahu | 2SM3SM | [tu, hu] | IMF-SUB-ACT |
V007207 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرَهَا | tunaS~irahaA | 2SM3SF | [tu, haA] | IMF-SUB-ACT |
V007208 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرِيهِ | tunaS~iriyhi | 2SF3SM | [tu, yhi] | IMF-SUB-ACT |
V007209 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرِيهَا | tunaS~iriyhaA | 2SF3SF | [tu, yhaA] | IMF-SUB-ACT |
V007210 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرَهُمَا | tunaS~irahumaA | 2SM3DM | [tu, humaA] | IMF-SUB-ACT |
V007211 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرَهُمَا | tunaS~irahumaA | 2SM3DF | [tu, humaA] | IMF-SUB-ACT |
V007212 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرِيهِمَا | tunaS~iriyhimaA | 2SF3DM | [tu, yhimaA] | IMF-SUB-ACT |
V007213 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرِيهِمَا | tunaS~iriyhimaA | 2SF3DF | [tu, yhimaA] | IMF-SUB-ACT |
V007214 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرَهُم | tunaS~irahum | 2SM3PM | [tu, hum] | IMF-SUB-ACT |
V007215 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرَهُن | tunaS~irahun | 2SM3PF | [tu, hun] | IMF-SUB-ACT |
V007216 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرِيهِم | tunaS~iriyhim | 2SF3PM | [tu, yhim] | IMF-SUB-ACT |
V008920 | V | VPIA | T1 | نصر | نَصَرَ | نَصَرْنَاهَا | naSar&naAhaA | 1DF3SF | [-,naAhaA] | PRF-IND-ACT |
V008921 | V | VPIA | T1 | نصر | نَصَرَ | نَصَرْنَاهُمَا | naSar&naAhumaA | 1DM3DM | [-,naAhumaA] | PRF-IND-ACT |
V008922 | V | VPIA | T1 | نصر | نَصَرَ | نَصَرْنَاهُمَا | naSar&naAhumaA | 1DM3DF | [-,naAhumaA] | PRF-IND-ACT |
V008923 | V | VPIA | T1 | نصر | نَصَرَ | نَصَرْنَاهُمَا | naSar&naAhumaA | 1DF3DM | [-,naAhumaA] | PRF-IND-ACT |
V008924 | V | VPIA | T1 | نصر | نَصَرَ | نَصَرْنَاهُمَا | naSar&naAhumaA | 1DF3DF | [-,naAhumaA] | PRF-IND-ACT |
V008925 | V | VPIA | T1 | نصر | نَصَرَ | نَصَرْنَاهُم | naSar&naAhum | 1DM3PM | [-,naAhum] | PRF-IND-ACT |
V010406 | V | VPSA | T3 | نصر | نَاصَرَ | نَاصَرْنَاكُمَا | naASar&naAkumaA | 1DF2DF | [-,naAkumaA] | PRF-SUB-ACT |
V010407 | V | VPSA | T3 | نصر | نَاصَرَ | نَاصَرْنَاكُم | naASar&naAkum | 1DM2PM | [-,naAkum] | PRF-SUB-ACT |
V010408 | V | VPSA | T3 | نصر | نَاصَرَ | نَاصَرْنَاكُن | naASar&naAkun | 1DM2PF | [-,naAkun] | PRF-SUB-ACT |
V010409 | V | VPSA | T3 | نصر | نَاصَرَ | نَاصَرْنَاكُم | naASar&naAkum | 1DF2PM | [-,naAkum] | PRF-SUB-ACT |
V010410 | V | VPSA | T3 | نصر | نَاصَرَ | نَاصَرْنَاكُن | naASar&naAkun | 1DF2PF | [-,naAkun] | PRF-SUB-ACT |
V015010 | V | VPIP | T1 | نصر | نَصَرَ | نُصِرْنَ | nuSir&na | •••3PF | [-,na] | PRF-IND-PAS |
V015011 | V | VPIP | T1 | نصر | نَصَرَ | نُصِرُوا | nuSiruwA | •••3PM | [-,wA] | PRF-IND-PAS |
V015012 | V | VPIP | T1 | نصر | نَصَرَ | نُصِرْنَ | nuSir&na | •••3PF | [-,na] | PRF-IND-PAS |
V015013 | V | VPIP | T1 | نصر | نَصَرَ | نُصِرْتُ | nuSir&tu | •••1SM | [-,tu] | PRF-IND-PAS |
V015014 | V | VPIP | T1 | نصر | نَصَرَ | نُصِرْتُ | nuSir&tu | •••1SF | [-,tu] | PRF-IND-PAS |
V015015 | V | VPIP | T1 | نصر | نَصَرَ | نُصِرْتُ | nuSir&tu | •••1SM | [-,tu] | PRF-IND-PAS |
V015016 | V | VPIP | T1 | نصر | نَصَرَ | نُصِرْتُ | nuSir&tu | •••1SF | [-,tu] | PRF-IND-PAS |
V015017 | V | VPIP | T1 | نصر | نَصَرَ | نُصِرْنَا | nuSir&naA | •••1DM | [-,naA] | PRF-IND-PAS |
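
Rows in the layout shown above can be read with a few lines of code. The following is a minimal sketch, assuming the pipe-delimited format of this sample carries over to the TXT files; the helper names `parse_row`, `parse_affixes`, and `split_arguments` are hypothetical. The reading of the Arguments codes (a person digit, then S/D/P for number, then M/F for gender, with `•` as an empty slot) is inferred from the sample rows and is not an official specification.

```python
FIELDS = ["ID", "Category", "POS", "SubPOS", "Root", "Lemma",
          "Vocalized", "KATS", "Arguments", "Affixes", "Gloss"]

# Assumed meaning of the letters in the Arguments field.
NUMBER = {"S": "singular", "D": "dual", "P": "plural"}
GENDER = {"M": "masculine", "F": "feminine"}


def parse_row(line: str) -> dict:
    """Split one pipe-delimited sample row into a field-name -> value dict."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    if len(cells) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} cells, got {len(cells)}")
    return dict(zip(FIELDS, cells))


def parse_affixes(text: str) -> tuple:
    """Turn '[tu, hu]' into ('tu', 'hu'); a '-' slot becomes an empty string."""
    prefix, suffix = (p.strip() for p in text.strip()[1:-1].split(",", 1))
    return ("" if prefix == "-" else prefix, "" if suffix == "-" else suffix)


def split_arguments(code: str) -> list:
    """Decode a code such as '2SM3SM' into (person, number, gender) triples."""
    if len(code) % 3 != 0:
        raise ValueError(f"unexpected Arguments length: {code!r}")
    triples = []
    for i in range(0, len(code), 3):
        person, number, gender = code[i:i + 3]
        triples.append((
            int(person) if person.isdigit() else None,
            NUMBER.get(number),   # None for the '•' placeholder
            GENDER.get(gender),
        ))
    return triples


if __name__ == "__main__":
    sample = ("V007206 | V | VISA | T2 | نصر | نَصَّرَ | تُنَصِّرَهُ | "
              "tunaS~irahu | 2SM3SM | [tu, hu] | IMF-SUB-ACT |")
    row = parse_row(sample)
    print(row["KATS"], parse_affixes(row["Affixes"]),
          split_arguments(row["Arguments"]))
```

The sketch decodes the triples positionally and does not label them as subject, object, or possessor, since the sample alone does not state which slot carries which role for each category.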