Digital scholarship blog

11 February 2021

Investigating Instances of Arabic Verb Form X in the BL/QFP Translation Memory

The Arabic language has a root-and-pattern morphology, in which words are formed by casting a (usually 3-letter) root into a morphological template of letters affixed at the beginning, middle and/or end of the word. While most of the meaning comes from the root, the template itself adds a layer of meaning. For our latest Hack Day, I investigated uses of Arabic Verb Form X (istafʿal) in the BL/QFP Translation Memory.

I chose this verb form because it conveys the meaning of seeking or acquiring something for oneself, possibly by force. It is a transitive verb form in which the subject may be imposing something on the object, and it can therefore convey subtle power dynamics. For example, it is the form used to translate words such as ‘colonise’ (yastaʿmir) and ‘enslave’ (yastaʿbid). I wanted to get a sense of whether this form could reflect unconscious biases in our translations – an extension of our work in the BL/QFP team to address problematic language in cataloguing and translation.

The other reason I chose this verb form is that it is achieved by affixing three consonants to the beginning of the word, which made it possible to search for in our Translation Memory (TM). The TM is a bilingual corpus, stretching back to 2014, of the catalogue descriptions we translate for the digitised India Office Records and Arabic scientific manuscripts on the QDL. We access the TM through our translation management system (memoQ), which offers some basic search functions. These include a ‘wild card’ option, which lists all the words that begin with the three Form X consonants under investigation (است* and يست*).

Figure 1: Snippet of results in memoQ using the wildcard search function.
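For readers who want to try a similar prefix search outside memoQ, the idea can be approximated with a regular expression. This is only a rough sketch: the sample segments and the names `segments`, `FORM_X_PREFIX` and `find_candidates` are invented for illustration, and the pattern is an approximation of a wildcard search rather than a description of how memoQ works internally.

```python
import re

# Hypothetical sample segments for illustration only; they are not taken from the TM.
segments = [
    "استولى القراصنة على السفينة",
    "يستلم الوكيل الرسالة",
    "كتب المقيم رسالة",
]

# Rough analogue of the wildcard searches است* and يست*:
# a word starting with either sequence, followed by any further Arabic letters.
FORM_X_PREFIX = re.compile(r"\b(?:است|يست)[\u0621-\u064A]*")

def find_candidates(text):
    """Return every word in text that begins with one of the two prefixes."""
    return FORM_X_PREFIX.findall(text)

# Collect the candidate words from all segments, in order of appearance.
candidates = [word for segment in segments for word in find_candidates(segment)]
```

Like the wildcard search, this matches on spelling alone, so it will also pick up words that merely happen to begin with the same three letters, and it will miss words with a prefix attached (such as the definite article).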

 

My initial search using these two 3-letter combinations returned 2,140 results. I noticed some recurring false positives, such as certain place names and the Arabic calque of ‘strategy’ (istrātījiyyah). The most frequent false positive (699 occurrences), however, was the Arabic verb for ‘receive’ (istalam) – which is unsurprising given the frequent references to correspondence being sent and received in catalogue descriptions of IOR files. What makes this verb a false positive is that the ‘s’ is in fact a root consonant, and therefore the verb actually belongs to Form VIII (iftaʿal).
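Screening out false positives of this kind could be sketched as follows. The exclusion list here is an assumption assembled only from the examples named above (forms of ‘receive’ and ‘strategy’); a real list would also need the relevant place names, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical exclusion list: spellings that begin like Form X but are not.
# Covers forms of 'receive' (Form VIII) and the calque of 'strategy'.
FALSE_POSITIVE_PREFIXES = ("استلم", "استلام", "يستلم", "استراتيجي")

def is_false_positive(word):
    """True if the word starts with any of the known false-positive spellings."""
    return word.startswith(FALSE_POSITIVE_PREFIXES)

def count_candidates(words):
    """Drop false positives, then count how often each remaining word occurs."""
    kept = [w for w in words if not is_false_positive(w)]
    return Counter(kept)

# Illustrative input only, not real search results.
sample = ["استولى", "يستلم", "استراتيجية", "استولى", "استلمت"]
counts = count_candidates(sample)
```

The exclusion list is matched on exact prefixes rather than on a shorter stem, because a shorter stem would also remove genuine Form X verbs whose root happens to begin with the same letter.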

After eliminating these false positives, I ended up with 1,349 matches. From these, I was able to identify 55 unique verbs used in relation to IOR files. I then conducted a more targeted search for three forms of each verb: the perfective (past) istafʿal, the imperfective (present) yastafʿil, and the verbal noun istifʿāl. I used the wild card function again to capture variations of these forms with suffixes attached (e.g. pronoun or plural suffixes). Although these would have been useful too, I did not look for the active (mustafʿil) and passive (mustafʿal) participles because the single short vowel that differentiates them is rarely represented in Arabic writing. Close scrutiny of the context of each result would have been needed in order to assign them correctly, and I did not have enough time for that in a single day.

Figure 2: List of the Form X verbs found in the TM and their frequency (excluding six verbs that only occur once)
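The targeted search for the three forms of a verb can be illustrated with a small pattern builder. This is a sketch under a strong assumption: it only handles sound (regular) three-consonant roots, so weak roots such as the one behind istawlá would need their own spellings. The function name `form_x_patterns` and the sample phrases are hypothetical.

```python
import re

# Any run of Arabic letters, standing in for the trailing wildcard (suffixes).
ARABIC_LETTERS = r"[\u0621-\u064A]*"

def form_x_patterns(root):
    """Build unvocalised Form X search patterns for a sound 3-consonant root."""
    r1, r2, r3 = root  # assumes exactly three letters
    stems = {
        "perfective": "است" + r1 + r2 + r3,         # istafʿal
        "imperfective": "يست" + r1 + r2 + r3,       # yastafʿil
        "verbal_noun": "است" + r1 + r2 + "ا" + r3,  # istifʿāl
    }
    return {
        name: re.compile(r"\b" + stem + ARABIC_LETTERS)
        for name, stem in stems.items()
    }

# Example with the root behind 'colonise'; the phrases are invented.
patterns = form_x_patterns("عمر")
perfective_hits = patterns["perfective"].findall("استعمرت الدولة المنطقة")
noun_hits = patterns["verbal_noun"].findall("استعمار المنطقة")
```

Because short vowels are not written, the perfective and imperfective patterns differ only in their first letter, which is also why the participles could not be separated by spelling alone.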

 

I made a note of the original English term(s) that each Form X verb was used to translate. I then identified seven potentially problematic verbs that required further investigation. These seven verbs typically convey an action that is being either forcefully or wrongfully imposed.

Figure 3: Seven potentially problematic verbs that take Form X in the TM

 

My next step was to investigate the use of these verbs in context more closely. I looked at the most frequent of these verbs (istawlá/yastawlī/istīlāʾ) in our TM, first using the source + target view, and then the three-column concordance view of the target text. The first view allowed me to scrutinise how we have been employing this verb vis-à-vis the original term used in the English catalogue description. It struck me that, in some cases, more neutral verbs such as ‘take’ and ‘take possession of’ were used on the English side, meaning that bias was introduced during translation.

Figure 4: Source + target view of concordance results for the verb istawlá

 

The second view makes it possible to see the text immediately preceding and succeeding the verb, typically displaying the assigned subject and object of the verb. It therefore shows who is doing what to whom more clearly, even though the script direction goes a bit awry for Arabic. Here, I noticed that the subjects were disproportionately non-British: it is overwhelmingly native rulers and populations, ‘pirates’, and rival countries who were doing the forceful or wrongful taking in the results. This may indicate an unconscious bias that has travelled from the primary sources to the catalogue descriptions and is something that requires further investigation.

Figure 5: Three-column view of concordance results for the verb istawlá
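A three-column (keyword-in-context) view like the one described above can be approximated in a few lines. This is a minimal sketch, not memoQ's concordance: the function name `kwic`, the `window` parameter and the sample sentence are all invented for illustration, and tokens are split naively on whitespace.

```python
def kwic(segments, prefix, window=3):
    """Return (left context, keyword, right context) rows for matching words."""
    rows = []
    for segment in segments:
        tokens = segment.split()  # naive whitespace tokenisation
        for i, token in enumerate(tokens):
            if token.startswith(prefix):
                # Up to `window` words before and after the keyword.
                left = " ".join(tokens[max(0, i - window):i])
                right = " ".join(tokens[i + 1:i + 1 + window])
                rows.append((left, token, right))
    return rows

# Invented example sentence, not a TM segment.
rows = kwic(["ثم استولى القراصنة على السفينة"], "استول", window=2)
```

Since Arabic sentences often place the subject directly after the verb, the right-hand column is where the subject and object of the verb would usually appear.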

 

My Hack Day investigation was conducted in the spirit of continuous reflection on and improvement of our translation process. Using a verb form rather than specific words as a starting point provided an aggregate view of our practices, which is useful in trying to tease out how the descriptions on the QDL may collectively convey an overall stance or attitude. My investigation also demonstrates the value of our TM, not only for facilitating and maintaining consistency in translation, but as a research tool with countless possibilities. My findings from the Hack Day are naturally rough-and-ready, but they provide the seed for further conversations about problematic language and unconscious bias among translators and cataloguers.

This is a guest post by linguist and translator Dr Mariam Aboelezz (@MariamAboelezz), Translation Support Officer, BL/QFP Project
