Nathan, David John, SOAS, UK

This paper will describe seismic shifts that are currently taking place in the field of endangered languages archiving, drawing on research and implementation at the Endangered Languages Archive, together with developments among other innovative archives reported at a recent workshop, Language Documentation and Archiving. Endangered languages archives, a new type of facility of which several have sprung up over the last 15 years, have brought new considerations to the fore:

  • endangered languages materials are predominantly in the language documentation genre, newly created through fieldwork conducted in areas where languages are threatened
  • such language documentation material consists initially, fundamentally, and crucially of media recordings (audio and video) of spontaneous, naturalistic language use such as conversation (Himmelmann 1998)
  • due to the nature and contexts of communities whose languages are endangered, the identities of speakers and the content of their recordings can be sensitive, so that nuanced access protocols must be designed and implemented.

These considerations have brought new types of participants into engagement with each other, some for the first time. Linguists are now found to be working at one moment with members of remote communities and soon after with audio-video specialists, archive technicians, and even journalists and filmmakers. Over the last decade, some researchers have advocated for more direct community participation (e.g., Grinevald 2003), and some archives have steadily built up trust relationships with communities. However, the field has generally remained in a ‘steady state’, emphasising language as data and building infrastructure for research interoperability, in a ‘language resources’ approach.

Now, the situation is suddenly changing, largely as a result of shifting expectations amongst language community members. Following projects funded by DoBeS [1], ELDP [2] and DEL [3], more members of communities have been exposed to language documentation and become aware not only of its potential but also of its methods. Digital technologies central to language documentation – media devices (including mobile phones), computers, and the World Wide Web – are rapidly becoming available to people in language endangerment ‘hotspots’, such as in much of Africa, Asia and South America. The shift has been precipitated by growing expectations of participation, control, and personal and local relevance, fostered by social networking platforms such as Facebook, Orkut, YouTube and Twitter. In fact, many communities are first experiencing the internet through these platforms and see the web as a more social, participatory and personally relevant space than ‘first generation’ netizens did.

While many communities are keen to participate in the new media landscape (deliberately or incidentally as new providers of language documentation), the challenge is to reconcile their contributions with the structures and policies of archives. Mechanisms need to be created and monitored to determine whether increased access to a variety of naturalistic language recordings creates more value for their audiences than is lost through perceived methodological limitations in data collection (Trilsbeek & König 2011). Despite its potential problems, crowdsourcing provides a way for language speakers to establish real links with resources, rather than being merely ‘participant metadata’. Here is an example: the web is currently trending towards mash-up pages, mobile ‘apps’, and aggregating portals. These gather resources based on a particular user’s preferences, and display them according to topic, geographic location, or language. But what happens when that user wants to view information connected to a specific person, say language speaker X? Unless speaker X is a true participant in the digital platform as a member, owner, or even curator, rather than a mere metadata point, such a speaker-centred page will be an incomplete and insipid representation, with weak implementation of access control, because nobody except speaker X can properly decide with whom he or she wants to share. We are fortunate, therefore, to see the maturation and continual rise of online and mobile social networking and innovative ‘apps’ which personalise individuals’ interactions, provide further exemplars for implementing participatory models, and organically solve protocol problems through individually-managed access control.
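The idea of individually-managed access control can be made concrete with a small sketch. The Python below is purely illustrative and does not describe any existing archive's software: the names Recording, SpeakerProfile, can_view and speaker_page are hypothetical, and the model assumes each recording is linked to exactly one speaker. It shows a speaker-centred page whose contents are decided by the speaker, with access denied by default when the speaker is not a participant in the platform.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Recording:
    # Hypothetical minimal record: one recording linked to one speaker.
    recording_id: str
    speaker_id: str
    title: str


@dataclass
class SpeakerProfile:
    # Hypothetical profile owned and edited by the speaker, not the archive.
    speaker_id: str
    shared_with: Set[str] = field(default_factory=set)
    public: bool = False

    def grant(self, viewer_id: str) -> None:
        """The speaker chooses to share with a given person."""
        self.shared_with.add(viewer_id)

    def revoke(self, viewer_id: str) -> None:
        """The speaker withdraws a previous decision to share."""
        self.shared_with.discard(viewer_id)


def can_view(recording: Recording, viewer_id: str,
             profiles: Dict[str, SpeakerProfile]) -> bool:
    """Return True only if the recording's speaker has allowed this viewer."""
    profile = profiles.get(recording.speaker_id)
    if profile is None:
        # Assumption: with no participating speaker, nobody can decide, so deny.
        return False
    if viewer_id == profile.speaker_id:
        return True
    return profile.public or viewer_id in profile.shared_with


def speaker_page(speaker_id: str, viewer_id: str,
                 recordings: List[Recording],
                 profiles: Dict[str, SpeakerProfile]) -> List[str]:
    """Titles of the speaker's recordings that this viewer is allowed to see."""
    return [r.title for r in recordings
            if r.speaker_id == speaker_id and can_view(r, viewer_id, profiles)]
```

In this sketch the archive stores and serves the recordings, but every sharing decision is recorded on the speaker's own profile, which is one possible reading of the participatory model described above.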

Mary Linn (2011), for example, has proposed a framework called Community Based Language Archiving, in which the language community is involved in every step, from documentation planning to curating to dissemination. It turns traditional archiving on its head: the primary curatorial task, namely contextualisation, becomes about the context of the users themselves, because it is they (especially as community members newly welcomed into the archive ecology) who ultimately determine the success of archives in meeting their goals.

Other proposals are for decentralised, web-based functions that allow language speakers to interact with archives’ existing resources. They can add further materials, comments, or contextualisation. They can identify themselves or their relatives in order to claim their moral rights in recordings and other materials, and make corrections to erroneous data, interpretations, and attributions (Garrett 2011).
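As a rough illustration of the kinds of interaction listed above, the following Python sketch models comments, identity claims and corrections as contributions attached to an archive item. It is a minimal sketch under stated assumptions, not an account of the proposals cited: the names Kind, Contribution and ArchiveItem are hypothetical, and the choice to hold contributions alongside the deposited metadata rather than overwrite it is an assumption made here for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Kind(Enum):
    # Hypothetical contribution types, following the interactions listed above.
    COMMENT = "comment"
    IDENTITY_CLAIM = "identity_claim"
    CORRECTION = "correction"


@dataclass
class Contribution:
    contributor_id: str
    kind: Kind
    text: str
    accepted: bool = False  # assumption: contributions start out as pending


@dataclass
class ArchiveItem:
    item_id: str
    metadata: Dict[str, str]
    contributions: List[Contribution] = field(default_factory=list)

    def contribute(self, contributor_id: str, kind: Kind, text: str) -> Contribution:
        """Attach a contribution without altering the deposited metadata."""
        contribution = Contribution(contributor_id, kind, text)
        self.contributions.append(contribution)
        return contribution

    def accept(self, contribution: Contribution) -> None:
        """Mark a contribution as accepted (who decides this is left open)."""
        contribution.accepted = True

    def pending(self) -> List[Contribution]:
        """Contributions that have not yet been accepted."""
        return [c for c in self.contributions if not c.accepted]

    def by_kind(self, kind: Kind) -> List[Contribution]:
        """All contributions of one type, for example all identity claims."""
        return [c for c in self.contributions if c.kind == kind]
```

Who reviews and accepts a contribution (the archive, the depositor, or the community) is deliberately left unspecified here, since that is precisely the kind of question that would have to be negotiated.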

Existing archivists also need to regain participatory roles. The era of the engineer as the pace-setter is giving way to usage-driven design and evolution, catering for the shifting interests of users. Archivists, whose job entails understanding their audiences, can contribute not only to the contextualisation of materials and the curation of exhibitions, but also to software design.

Participants at a workshop on language documentation and archiving (Nathan ed. 2011) identified a number of emerging trends that can be expected to become increasingly influential for our archives:

Form of documentation: Despite extensive theorisation of documentation in previous work, there has been little discussion of the form of documentation: its granularity, structure, organisation, links, and how it is to be navigated. Archives which attempt to provide attractive and usable interfaces that encourage user engagement will find themselves frustrated with current models for language documentation, which still see documentation as consisting of unstructured collections of data. New genres of expression will be needed, and these will result from collaborative efforts among documenters, archives, and contributors and users.

Community curation and contextualisation: ‘Community curation’ represents a paradigm-changing challenge that redefines roles within archives. The original contributors and their language-learning community members become the principal curators, presenting a radical but rational inversion: the archive concept of ‘context’ is no longer that of the materials or their (supposed) provenance, but of the users, where users are often the language speakers. Contextualisation of materials is at the heart of archiving, but in many of our current archives, the art of contextualisation has been replaced by the science of software development. However, communities may wish to play a role in framing the interpretation of their materials to others (cf. Christen 2011: 197). Similarly, community access to materials is not reducible to file transfer, but entails access to meaning (Christen 2011: 194). Issues of quality control (Trilsbeek & König 2011) become just one of many elements to be renegotiated between technical ‘best practices’ and community aspirations and activities.

Promotion: Archives need to do more than acquire, curate, preserve and disseminate materials. To reach target audiences for language revitalisation, archives also need to actively promote the materials they hold (Wilbur 2011; Woodbury 2011). Archives need to develop relationships with their audiences that are not based purely on access to language materials, for the success of archive outreach may depend on first developing contact, relationships and trust in order to encourage usage or other participation with the materials. Ultimately, social networking provides promotion through social endorsement.

Publishing: Paradoxically, even as archives seem to become fragmented and fluid as they respond to audience participation, they will come to be seen less as archives and more as publishers, due to their increased attention to genre, exhibitions, promotion, and other outreach.

Endangered languages archives are an essential component of ethical and effective responses to language endangerment. However, the ‘language resources’ approach is giving way to a participatory one, because neither the quality of documentary materials nor the effectiveness of technologies is meaningfully measurable without considering audiences and usages. Full engagement with language speakers through social networking will provide new sources of data for researchers as well as new forms of cultural repatriation for communities, and new ways of supporting threatened languages.


1. Dokumentation bedrohter Sprachen, funded by the Volkswagen Foundation.

2. Endangered Languages Documentation Programme, SOAS, funded by the Arcadia Fund.

3. Documenting Endangered Languages, National Endowment for the Humanities.