MUTABLE Research Plan

[Updated 10/2017]

[This is the original research plan, last updated in 2017]


Translation Studies has begun to study the process of translating “as it happens” in real-life situations and at translators’ workplaces (Risku 2014: 334-335). While the traditional subject matter of translation process research has been the individual translator and their cognitive processes (comprehension, transfer, production), often studied experimentally in laboratory settings (Englund-Dimitrova 2010), we now witness an extension of the concept of ‘translation process’ to the “collaborative ‘making-of’ a translation” and its interactive aspects (Risku 2014: 342). This shift of focus brings with it analyses of the social and cognitive aspects of translating: which actors (clients, translators, etc.) are involved in the process and how different resources, such as dictionaries, computers, and other technological tools, are exploited to support the mental translation process (Risku 2014). The conceptual extension coincides with a paradigmatic change in cognitive science, in which cognition is no longer understood as something purely mental but is redefined as also embodied, social, and shared with others.

Audio description (AD) offers a relevant case for studying collaboratively created translation and the socio-cognitive aspects of translating. In AD, visual elements are verbalised and spoken out loud for the benefit of visually impaired persons (VIPs), allowing them better access to visual and audiovisual information and art, such as films and paintings. Some professional practices of AD in Finland and Germany deploy teams of sighted and non-sighted describers to create audio descriptions. When such teams work on the translation task (see Image 1), the sighted participants make visual information intelligible for the non-sighted participant through verbal and bodily means, such as speech and gesturing. The blind participant, in turn, contributes their analysis of the film’s soundtrack and indicates, for instance, which aspects can be heard from the soundtrack and therefore do not need verbalisation.

The purpose of the image is to illustrate how a concrete team audio description process can be organised. The picture shows a blind describer, a sighted describer, and a second sighted describer who also acts as technician, working together on an audio description.
(Image 1) A screenshot from the online video “Lesen statt Hören – Hören statt Sehen” by Norddeutscher Rundfunk (starting at 00:01:00).

Analysing the interaction in the team yields insight into the collaborative making of a translation and even into the mental translation process of individual translators. Describers use multimodal resources as cognitive scaffolds (see Risku 2014: 335) for problem-solving and decision-making as they justify or challenge translation solutions, such as the selection of a verb to adequately describe an action in the film scene. In fact, this “thinking aloud”, which occurs naturally in collaborative AD, is deployed as a method in the experimental settings of traditional translation process research to shed light on the cognitive processes in translators’ minds. In collaborative AD, translating thus becomes embodied and interactive, and individual cognitions are shared.

A growing body of research shows that intersubjectivity is constructed multimodally in interaction, that is, understanding between discourse participants is reached via various communication resources, such as speech, gestures, and gaze (see e.g. Haddington & Kääntä 2011, Hausendorf et al. 2012). Interaction in the workplace has been studied from the perspective of both talk (Drew & Heritage 1992, Koester 2010) and multimodality (e.g. Schmitt 2012). An emerging subject of research is the effect of communicative impairments on interaction (e.g. Wiklund 2012). However, the study of the effect of visual impairments and of interaction between perceptually asymmetrical participants (sighted and non-sighted persons), together with the focus on the role of auditory resources, makes a new contribution to the field. The absence of sight among some of the discourse participants is likely to affect the interaction, for instance in that some functions of gaze are compensated for by other modes, such as speech prosody.

While AD has been the object of intensive inquiry in Translation Studies for the last decade or so, research has largely concentrated on the analysis of translation products (audio described materials) and the reception of AD. Very little has been published on the interactional dimension of AD, that is, the face-to-face, real-time describing between sighted and non-sighted persons. It is, however, precisely in these situations that translations can be negotiated and shared and that communication resources other than speech and language can be called on for assistance (such as haptics, see Quereda Herrera 2007). Neither the AD translation process nor the collaboration in it has been examined, apart from a few general, country-specific descriptions of the work process (e.g. Benecke 2014, Greening & Rolph 2007).

Collaborative AD can be assumed to be more usable than an AD that is based on the interpretation of one sighted describer alone, because it benefits from the perceptual experience and needs of the user, that is, the non-sighted describer. Sighted describers have audiovisual access to the film and are capable of perceiving both the source (images) and the target material (speech). Non-sighted describers, for their part, have monomodal, auditory access to the film but pay special attention to how the film communicates through sound effects, music, and dialogue, and to how the audio describing voice mediates the visual information from the film (e.g. how intonation reflects the organisation of information). Although earlier research and practice indicate that this expertise matters (Benecke 2014, Hirvonen & Tiittula 2012, Remael 2012), we lack empirical evidence of how exactly it becomes effective in the collaboration. Another relevant question is whether the perceptually-grounded asymmetry leads to unequal power relations in the translation task (cf. Koester 2010: 82), for example by limiting the non-sighted describer in decision-making because of their monomodal access to the source material.


Collaborative AD is investigated from two perspectives: 1) the collaborative AD process as a translation process, and 2) the multimodal interaction between sighted and visually impaired describers on the one hand and between the describers and the audiovisual source material on the other.

To begin with, the project describes the AD process as a collaborative translation process. This involves determining the procedures and organisation of work. Relevant questions are: What is the audio describers’ socio-cognitive environment like (see Risku 2014)? How is the work divided into individually and collaboratively realised tasks? What are the specific tasks of the sighted and non-sighted describers?

The project then focuses in particular on one part of the process, the face-to-face encounters between audio describers. By analysing this interaction, the study seeks to produce knowledge of translating as an intersubjectively shared and multimodally embodied process and of the use of non-visual, in particular auditory, communication resources in interaction between perceptually asymmetrical participants. The project will examine how cognitive operations related to translating are displayed and shared by the describers through speech, gestures, and other bodily communication. Since the outcome of the collaborative AD process is a shared translation (i.e. one audio description), the focal questions include how the team negotiates individual interpretations and how it deploys multimodal resources in constructing a common understanding.

The perceptually-grounded asymmetry is likely to have consequences for translating and interaction. To what extent does it result in distinct interpretations of what is relevant in the translation? How do the sighted describers make known their interpretations of the film’s visuals? For instance, do they deploy touching and gesturing besides verbalising their thoughts? Specific attention will be directed at how the non-sighted describers interpret the film through the soundtrack: for instance, what kind of visually represented aspects can they deduce from sound effects? With regard to interaction, the analysis can show, among other things, how participants organise the discourse through speech in the absence of gaze.

Expected Impacts and Utilisation of Results

By increasing knowledge of translating as a multimodal, collaborative process and of AD as a translation process, the project contributes to the field of Translation Studies in general and to the study of AD and multimodality in translation in particular. The research also provides new knowledge of the role of users of translation (see Suojanen et al. 2015). The methods of analysing interaction can benefit other research that deals with face-to-face encounters in translation and interpreting. By acquiring authentic data on AD processes, the project expands the body of data on authentic translation processes, and the results can be related to other research dealing with such data (see Risku 2014: 333).

In addition, the project has an interdisciplinary impact. First, the analysis of the translation process from the multimodal interactional perspective complements the view of translating in Translation Studies. This study shows how translations are the result of collaboration between individual translators and users. Second, the project contributes to Linguistics and the study of interaction and workplace discourse by shedding light on the interaction between sighted and non-sighted discourse participants. Third, the research gives Film and Television Studies new insight into the role of audition in perceiving and interpreting audiovisual texts. Finally, for all of these disciplines, the project intends to provide data on naturally occurring interactional situations to be reused in future studies.

As the project deals with a socially relevant phenomenon, its findings can lead to a scrutiny of AD practices nationally and internationally, since the research indicates the possibilities and limitations of collaborative AD. It can therefore propose measures to make this collaboration more effective. By analysing VIPs as translators as well as users of audiovisual material, we gain more understanding of the audiovisual world from the VIP perspective, which can lead to identifying the expertise of VIPs in inter-sensory translating and multimodal communication. The project thus creates new knowledge of VIPs that can be applied to foster inclusion in society as well as diversity in the workplace.

Critical Points for Success and Alternative Implementation

The main critical point relates to the acquisition of data. The project collects video recordings of the interactive encounters during real-life AD processes in Finland and Germany. This involves using human beings as research subjects, and it is possible that some audio describers decline to be studied or that some do not want to be filmed. In order to have enough time to recruit research subjects, data acquisition was started early, and I collected my first data in 2016, before the project was funded.

In the Finnish context, the specific challenge lies in the fact that the AD of film and other processes of collaborative AD are rare (few productions per year and few describers who work in a team). In order to complement the research, and in fact to open up more interesting research questions, I also collect data from any interactive encounters in which AD occurs: guided art exhibition tours for visually impaired people, classroom activities and teaching at school with visually impaired children, etc.

Other risks in data acquisition relate to technological aspects. Good quality of sound and image in the video must be assured, meaning that the recording situation is planned and tested beforehand and again on site. This risk was alleviated by a pilot study in which the recording was rehearsed.

Publication and Dissemination Plan

The project will publish both international and national peer-reviewed contributions as well as other texts and media targeted at various stakeholders and the general public. Some of the articles will be co-authored to support the multidisciplinarity of the project and to carry out interdisciplinary discussions. Multidisciplinarity is also reflected in the publication plan in that different articles apply partly different methodological approaches.

Provisional plan of publications:

Edited book: “Accessibility and asymmetry of resources viewed through Translation Studies”. (a compiled book in Finnish, co-edited with Tuija Kinnunen)

A1: “Forms of collaborative translation: modelling teamwork in audio description”.

A2: “VIP access? Visually impaired translators interpreting a movie through the soundtrack”.

A3: “What a difference the voice makes… Functions of prosody from the perspective of non-sighted discourse participants”. (Co-authored with Mari Wiklund)

A4: “The case of gaze in the interaction with non-sighted participants”. (Co-authored with Liisa Tiittula)

A5: “Sharing cognition in translation: Multimodal negotiations of translation strategies and solutions during the audio description process”.

A6: “Are you seeing what I’m hearing? Consequences of perceptual asymmetry between blind and sighted audio describers to the translation process”.

Research Method and Material

The project combines approaches from Translation Studies and Linguistics with cognitive scientific, interactional, and multimodal perspectives in order to analyse and explain a complex phenomenon. The general approach is data-driven and directed at reconstructing the translation process as it gradually evolves and at determining how communicative resources are being used in different phases of the process.

To carry out the descriptive analysis of the translation process, the dynamic network model of situated translation processes (Risku et al. 2013) is adapted as a method. It involves analysing both internal (mental) and external (non-mental) aspects of translation by ethnographic methods: observing the process, analysing its documentation (e.g. translation drafts and comments), and interviewing audio describers. This model allows the structuring of the whole translation process from its preparation phase (organisation of work, scheduling, etc.) until it is finished (recording the AD in speech, evaluation of the translation product by the participants, etc.).

Multimodal interaction analysis is applied to study the use of multimodal resources (e.g. Haddington & Kääntä 2011; Hausendorf et al. 2012). This method has roots in Conversation Analysis and it postulates that participants in discourse display their understanding by their multimodally constructed actions in the ongoing interaction. The project approaches interaction also from the cognitive perspective and studies whether and how cognitive processes are portrayed in interaction (e.g. Heritage 2005), for instance how participants make known their perceptions of a film scene (see Hirvonen 2014: 49).

The data of the project will be gathered from authentic AD processes in Finland and abroad. The interactive encounters of the AD teams will be video recorded. Documents that the describers employ will be gathered in electronic form. The source materials that are translated in the processes will also be used. The interviews with the audio describers will be audio-recorded. The research subjects are visually impaired and sighted persons; they will be contacted by e-mail and asked in writing to participate in the study.

The video recordings will be transcribed and the different communication resources aligned using the ELAN tool for multimodal annotation. The interviews with the audio describers will be transcribed into text files. Multimodal annotation is labour-intensive, but significant findings can come from a relatively small set of data (e.g. a few hours of recorded interaction).
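To illustrate what aligning communication resources can mean in practice, the following is a minimal Python sketch, not the project's actual workflow. It assumes that annotations have already been exported from the annotation tool as time-stamped records; the tier names ("speech", "gesture"), the record layout, and the sample utterances are all invented for illustration. The sketch finds speech and gesture annotations that overlap in time, which is one simple way to locate moments where describers combine modes.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """One time-aligned annotation (hypothetical export format)."""
    tier: str       # e.g. "speech" or "gesture" (assumed tier names)
    start_ms: int   # start time in milliseconds
    end_ms: int     # end time in milliseconds
    value: str      # transcribed or described content


def overlaps(a: Annotation, b: Annotation) -> bool:
    """True if the two annotations share any stretch of time."""
    return a.start_ms < b.end_ms and b.start_ms < a.end_ms


def co_occurring(annotations, tier_a, tier_b):
    """Return all pairs (a, b) from the two tiers that overlap in time."""
    first = [a for a in annotations if a.tier == tier_a]
    second = [b for b in annotations if b.tier == tier_b]
    return [(a, b) for a in first for b in second if overlaps(a, b)]


# Invented sample data, for illustration only.
SAMPLE = [
    Annotation("speech", 0, 1800, "he grabs the door handle"),
    Annotation("gesture", 1200, 2500, "grasping hand movement"),
    Annotation("speech", 3000, 4200, "or maybe he pushes it"),
    Annotation("gesture", 5000, 5600, "pointing at screen"),
]

pairs = co_occurring(SAMPLE, "speech", "gesture")
for speech, gesture in pairs:
    print(f"{speech.value!r} co-occurs with {gesture.value!r}")
```

Annotations that merely touch at a boundary (one ends exactly when the other starts) are not counted as overlapping here; whether that choice is appropriate depends on the analytic question.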

Data Management and Ethical Issues

This data management plan is drafted following the instructions of the Finnish Social Science Data Archive.

Data collection methods and data content will be carefully documented to enable subsequent data sharing. The aim is to obtain general consent from the subjects that applies to research use in general and enables the reuse of the video data in future studies. The collected data will be stored in the university’s facilities and protected. If general consent is obtained, the video files will be archived at the Language Bank of Finland, which will take care of data management. If general consent is not obtained, the video and audio files will be destroyed after the original research has been completed. Research participants will be informed of the archiving before data collection. Data usage rights relating to the video data will be determined both for the research project and for usage after the project has been completed, but any personal documentation collected in the project (e.g. e-mail exchanges of the research subjects) remains in the subjects’ ownership.

As the investigated AD processes include narrative films or other copyrighted material produced by third parties (film or television companies), the right to use this material in research will be agreed on with the original creators (written permission will be obtained).

The research adheres to the Guidelines of Responsible Conduct of Research drawn up by the National Advisory Board on Research Integrity. The research respects the autonomy of research subjects, avoids harming them, and protects their privacy and personal data. Participation of the human subjects is voluntary and based on informed consent. Subjects have the right to withdraw from the study at any time. The research and the data collection situations are designed to avoid causing mental harm to the subjects.

Any personal data that may allow the identification of research participants will be removed or masked in the transcriptions and visual representations shown in public (e.g. research articles). However, for the sake of successful and purposeful analysis, some identifying parts cannot be masked in the video data themselves (e.g. facial expressions will be analysed). Subjects will be informed that they can be recognised on the basis of the topic of discussion: the authorship of film audio descriptions is public information and visible, for instance, on the internet. One solution is therefore to refer to them by their real names, provided that an agreement has been made in advance. On the whole, any damage or harm to subjects’ personal lives or professional careers will be avoided, but this does not prevent the publication of research findings that concern the profession in general.

The project complies with the open access policies of the Academy of Finland and the University of Helsinki (?) and deposits full-text versions of the published articles in the institutional repository.

References



Benecke, Bernd 2014. Audiodeskription. Modell und Methode. Münster: LIT Verlag.

Drew, Paul & Heritage, John (Eds.) 1992. Talk at Work. Interaction in institutional settings. Cambridge: Cambridge University Press.

Englund-Dimitrova, Birgitta 2010. Translation process. In: Gambier, Yves & Doorslaer, Luc van (Eds.): Handbook of Translation Studies. Amsterdam/Philadelphia: John Benjamins. E-book.

Greening, Joan & Rolph, Deborah 2007. Accessibility: raising awareness of audio description in the UK. In: Díaz Cintas, Jorge; Orero, Pilar & Remael, Aline (Eds.): Media for all: subtitling for the deaf, audio description and sign language. Amsterdam: Rodopi, pp. 127–138.

Haddington, Pentti & Kääntä, Leila (Eds.) 2011. Kieli, keho ja vuorovaikutus. Multimodaalinen näkökulma sosiaaliseen toimintaan [Language, body and interaction. A multimodal perspective to social action]. Helsinki: Suomalaisen kirjallisuuden seura.

Hausendorf, Heiko; Mondada, Lorenza & Schmitt, Reinhold (Eds.) 2012. Raum als interaktive Ressource. Tübingen: Günter Narr.

Heritage, John 2005. Cognition in discourse. In: Molder, Hedwig te & Potter, Jonathan (Eds.): Conversation and Cognition. Cambridge: Cambridge University Press, pp. 184–202.

Hirvonen, Maija 2014. Multimodal Representation and Intermodal Similarity – Cues of Space in the Audio Description of Film. Helsinki: University of Helsinki. Available at:

Hirvonen, Maija 2013a. Perspektivierungsstrategien und -mittel kontrastiv: Die Verbalisierung der Figurenperspektive in der deutschen und finnischen Audiodeskription. trans-kom: Zeitschrift für Translationswissenschaft und Fachkommunikation 6:1, pp. 8–38.

Hirvonen, Maija 2013b. Sampling Similarity in Image and Language – Figure and Ground in the Analysis of Filmic Audio Description. SKY Journal of Linguistics 26 (2013), pp. 87–115.

Hirvonen, Maija 2012. Contrasting Visual and Verbal Cueing of Space – Strategies and devices in the audio description of film. New Voices in Translation Studies 8, pp. 21–43.

Hirvonen, Maija & Tiittula, Liisa 2012. Verfahren der Hörbarmachung von Raum. Analyse einer Hörfilmsequenz. In: Hausendorf et al. (Eds.): Raum als interaktive Ressource, pp. 381–427.

Koester, Almut 2010. Workplace Discourse. London/New York: Continuum International.

Quereda Herrera, María 2007. Interpretación simultánea/bilateral con apoyo táctil. In: Jiménez Hurtado, Catalina (Ed.), Traducción y accesibilidad. Frankfurt/Main: Peter Lang, pp. 229–240.

Remael, Aline 2012. For the use of sound. Film sound analysis for audio-description: Some key issues. MonTI (Monografías de Traducción e Interpretación) 4, pp. 255–276.

Risku, Hanna; Windhager, Florian & Apfelthaler, Matthias 2013. A dynamic network model of translatorial cognition and action. Translation Spaces 2, pp. 151–182.

Risku, Hanna 2014. Translation process research as interaction research: From mental to socio-cognitive processes. MonTI Special Issue – Minding Translation, pp. 331–353.

Schmitt, Reinhold 2012. Körperlich-räumliche Grundlagen interaktiver Beteiligung am Filmset: Das Konzept ‘Interaktionsensemble’. In: Hausendorf et al. (Eds.): Raum als interaktive Ressource, pp. 37–87.

Suojanen, Tytti; Koskinen, Kaisa & Tuominen, Tiina 2015. User-Centered Translation. London/New York: Routledge.

Tiittula, Liisa 2007. Blickorganisation in der side-by-side-Positionierung am Beispiel eines Geschäftsgesprächs. In: Schmitt, Reinhold (Ed.): Koordination. Analysen zur multimodalen Interaktion. Tübingen: Günter Narr, pp. 225–261.

Wiklund, Mari 2012. Gaze behavior of pre-adolescent children afflicted with Asperger Syndrome. Communication & Medicine 9:2, pp. 173–186.