As in previous years, the two days preceding the Digital Humanities conference (16th-17th July) have been set aside for community-run workshops. Half-day workshops cost € 20, full-day workshops € 35.
Monday, 16 July, 09:00 – 17:00 h, Lecture Hall H, Main Building
Erhard Hinrichs (Eberhard Karls University Tübingen, Germany), Heike Neuroth (University of Göttingen, Germany), Peter Wittenburg (Max-Planck Institute for Psycholinguistics, Nijmegen, The Netherlands)
Large research infrastructure projects in the Humanities and Social Sciences such as Bamboo, CLARIN, DARIAH, eAqua, Metanet and Panacea increasingly offer their resources and tools as web applications or web services via the internet. Such web-based access has a number of crucial advantages over traditional means of service provision via downloadable resources or desktop applications. Since web applications can be invoked from any browser, downloading, installation, and configuration of individual tools on the user's local computer are avoided. The paradigm of service-oriented architectures (SOA) is often used as a possible architecture for bundling web applications and web services. While the use of web services and SOAs is quickly gaining in popularity, a number of open technological and research questions still await more principled answers. The purpose of this joint CLARIN/DARIAH workshop is to provide a forum to address these issues.
Workshop homepage: http://clarin-d.de/index.php/de/news/veranstaltungen-2/workshops/104-workshopdh2012
You can download the proceedings of the workshop as a PDF file.
WS02 Full day workshop: Crowdsourcing meaning: a hands-on introduction to CLÉA, the Collaborative Literature Éxploration and Annotation Environment
Monday, 16 July, 09:00 – 17:00 h, Room 122, West Wing
Marco Petris, Evelyn Gius, Lena Schüch, Jan Christoph Meister (University of Hamburg, Germany)
This hands-on workshop introduces text annotation with CLÉA, the Collaborative Literature Éxploration and Annotation Environment developed at the University of Hamburg. The session will start with the presentation of interdisciplinary use cases in which a complex tagset that operationalizes literary theory (namely narratology) is applied; these will lead to a discussion of the conceptual prerequisites that have been crucial for the development of CLÉA. This will be followed by a practical introduction, after which participants may annotate their own texts. Finally, we would like to engage participants in a design critique of CLÉA and a general discussion about requirements for text analysis tools in their fields of interest. The workshop is thus of interest to humanities scholars of all fields concerned with text analysis (with or without experience in digital text analysis), as well as to software developers interested in non-deterministic text analysis and automated annotation.
Monday, 16 July, 09:00 – 17:00 h, Room 120, West Wing
Manfred Thaller (University of Cologne, Germany)
A discussion process about a “core curriculum” for the Digital Humanities, mainly within Germany, started in 2009. Covering 10 BA and 12 MA/MSc programs so far, it has resulted in a joint course catalogue and first steps towards criteria for compatible curricula. We would like to extend this discussion to the international level at DH2012.
Major topics will be:
We will approach DH degree courses known to us. Active indications of interest in the workshop, particularly from institutions who are currently planning Digital Humanities degree courses, are extremely welcome. Please direct them to: email@example.com
Monday, 16 July, 09:00 – 12:30 h, Room 221, West Wing
Christian Brockmann, Dorji Wangchuk (University of Hamburg, Germany)
This workshop covers digital methodology in cross-cultural manuscript studies and the use of digital technology in the physical examination of manuscripts. It will consist of brief introductory presentations on current developments in these areas by international experts, short hands-on and demonstration units on multispectral imaging and computer-assisted script and feature analysis by members of the Hamburg Centre for the Study of Manuscript Cultures, as well as discussions on expected future developments, application perspectives, challenges and possible fields of cooperation.
The focus is on the study of manuscripts as a characteristic feature and expression of those cultures that are built on their use, and on making digital methods applied directly to the study of these physical objects more broadly accessible in a cross-disciplinary context. Recent technological advances, e.g. in working with damaged or otherwise illegible manuscripts, will also be addressed.
List of speakers:
Monday, 16 July, 13:30 – 17:00 h, Room 221, West Wing
Mia Ridge (Open University, United Kingdom)
Have you ever wanted to be able to express your ideas for digital humanities data-based projects more clearly, or wanted to know more about hack days and coding but been too afraid to ask? In this hands-on tutorial led by an experienced web programmer, attendees will learn how to use online tools to create visualisations to explore humanities data sets while learning how computer scripts interact with data in digital applications. Attendees will learn the basic principles of programming by playing with small snippets of code in a fun and supportive environment. The instructor will use accessible analogies to help participants understand and remember technical concepts. Working in pairs, participants will undertake short exercises and put into practice the scripting concepts they are learning about. The tutorial structure encourages attendees to reflect on their experiences and consolidate what they have learned from the exercises with the goal of providing deeper insight into computational thinking.
The tutorial aims to help humanists without a technical background understand more about the creation and delivery of digital humanities data resources. In doing so, it is designed to support greater diversity in the ‘digital’ part of the digital humanities community. It is aimed at people who want to learn enough to get started playing with simple code to manipulate data, or to gain an insight into how programming works. No technical knowledge is assumed. Attendees are asked to bring their own laptops or netbooks.
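To give a flavour of the kind of small snippet attendees will play with, here is a hypothetical illustration (not the tutorial's actual material) of a few lines of code that summarise a tiny humanities data set:

```python
# A toy example: tally the most common words in a short passage,
# the kind of small data manipulation the tutorial introduces.
from collections import Counter

passage = "the whale the sea the ship a whale a sea"

# Split the passage into words and count how often each occurs.
counts = Counter(passage.split())

# Show the three most frequent words.
for word, n in counts.most_common(3):
    print(word, n)
```

A snippet this size is enough to demonstrate variables, function calls, and loops, which is the level of scripting concept the exercises target.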
The tutorial will include:
Attention: Registration for this workshop is free (limited to 70 participants) and will be processed separately; please visit http://nedimahstwg2.eventbrite.com/
Tuesday, 17 July, 09:00 – 17:00 h, Lecture Hall H, Main Building
Leif Isaksen (University of Southampton, UK), Shawn Day (Digital Humanities Observatory, Ireland), Jens Andresen (University of Aarhus, Denmark), Eero Hyvönen, Eetu Mäkelä (Aalto University, Finland)
All cultural expressions are related to the dimensions of space and time in the manner of their production and consumption, the nature of their medium and the way in which they express these concepts themselves. This workshop seeks to identify innovative practices among the Digital Humanities community that explore, critique and re-present these spatial and temporal aspects. It is part of the ESF-funded NEDIMAH Network and organised by its Working Group on Space and Time (STWG).
We invite those working with digital tools and techniques that manage, analyse and exploit spatial and temporal concepts in the Humanities to present a position paper at this workshop. Possible topics might be:
Tuesday, 17 July, 09:00 – 12:30 h, Room 122, West Wing
Christian-Emil Ore (University of Oslo, Norway), Sebastian Rahtz (University of Oxford, UK), Øyvind Eide (King’s College London, UK)
The aim of this workshop is to present and discuss current ontology-based annotation in text studies and to give participants an introduction to and updated insight into the field. One of the expected outcomes of the workshop is to throw light on the consequences and experiences of a renewed database approach to computer-assisted textual work, based on developments over the last decade in text encoding as well as in ontological systems.
The Network for Digital Methods in the Arts and Humanities (NeDiMAH) is a research network running from 2011 to 2015, funded by the European Science Foundation (ESF). The network will examine the practice of, and evidence for, advanced ICT methods in the arts and humanities across Europe, and articulate these findings in a series of outputs and publications. To accomplish this, NeDiMAH provides a locus of networking and interdisciplinary exchange of expertise among the trans-European community of digital arts and humanities researchers, as well as those engaged with creating and curating scholarly and cultural heritage digital collections. NeDiMAH will work closely with the EC-funded DARIAH and CLARIN e-research infrastructure projects, as well as other national and international initiatives.
NeDiMAH includes the following Working Groups:
1. Spatial and temporal modelling
2. Information visualisation
3. Linked data and ontological methods
4. Developing digital data
5. Using large-scale text collections for research
6. Scholarly digital editions
The WGs will examine the use of formal, computationally based methods for the capture, investigation, analysis, study, modelling, presentation, dissemination, publication and evaluation of arts and humanities materials for research. To achieve these goals the WGs will organise annual workshops; whenever possible, the NeDiMAH workshops will be organised in connection with other activities and initiatives in the field.
Tuesday, 17 July, 09:00 – 12:30 h, Room 222, West Wing
Stéfan Sinclair (McGill University, Canada), Geoffrey Rockwell (University of Alberta, Canada)
You have a collection of digital texts: now what? This workshop provides a gentle introduction to text analysis in the digital humanities using Voyant Tools, a collection of free web-based tools that can handle larger collections of texts, be they digitized novels, online news articles, Twitter feeds, or other textual content. This workshop will be a hands-on, practical guide with lots of time to ask questions, so participants are encouraged to bring their own texts.
Tuesday, 17 July, 13:30 – 17:00 h, Room 122, West Wing
Seth van Hooland, Max De Wilde (Université Libre de Bruxelles, Belgium), Ruben Verborgh (Multimedia Lab, Ghent University, Belgium)
Co-organizers: Thomas Steiner (UPC.edu), Johannes Hercher (Universität Potsdam, Germany)
The early-to-mid 2000s economic downturn in the US and Europe forced Digital Humanities projects to adopt a more pragmatic stance towards metadata creation and to deliver short-term results to grant providers. It is precisely in this context that the concept of Linked and Open Data (LOD) has gained momentum. In this tutorial, we focus on reconciliation: the process of mapping a domain-specific vocabulary to another (often more commonly used) vocabulary that is part of the Semantic Web, in order to connect the metadata to the Linked Data Cloud. We believe that the integration of heterogeneous collections can be managed by using subject vocabularies for cross-linking between collections, since major classifications and thesauri (e.g. LCSH, DDC, RAMEAU) have been made available following Linked Data principles.
Re-using these established terms for indexing cultural heritage resources is one of the great potentials of Linked Data for Digital Humanities projects, but there is a common belief that publishing LOD still requires expert knowledge of Semantic Web technologies. This tutorial will therefore demonstrate how Semantic Web novices can start experimenting on their own with non-expert software such as Google Refine.
Participants of the tutorial will be asked to bring an export (or a subset) of metadata from their own projects or organizations and to pre-install Google Refine on their laptops. All operations needed to reconcile metadata with controlled vocabularies that are already part of the Linked Data cloud will be presented in detail, after which participants will be given time to perform these actions on their own metadata with the assistance of the tutorial organizers. Previous tutorials have mainly relied on the Library of Congress Subject Headings (LCSH), but for the DH2012 conference we will test SPARQL endpoints of controlled vocabularies in German beforehand (available, for example, at http://wiss-ki.eu/authorities/gnd/) in order to make sure that local participants will be able to experiment with metadata in German.
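The core idea of reconciliation can be sketched without any Semantic Web tooling at all: normalise local subject terms and look them up against a controlled vocabulary that maps preferred labels to concept URIs. The vocabulary entries and URIs below are illustrative stand-ins, not real LCSH or GND records:

```python
# A minimal, self-contained sketch of the reconciliation idea:
# match free-text subject terms from local metadata against a
# controlled vocabulary and attach the vocabulary's URIs.

def normalize(term):
    """Lower-case a term and collapse whitespace so trivial variants still match."""
    return " ".join(term.lower().split())

# A toy controlled vocabulary: normalised preferred label -> concept URI.
# (Hypothetical URIs for illustration only.)
vocabulary = {
    normalize("Manuscripts, Medieval"): "http://example.org/vocab/sh0001",
    normalize("Printing -- History"): "http://example.org/vocab/sh0002",
}

def reconcile(local_terms, vocab):
    """Return (term, URI-or-None) pairs for a list of local subject terms."""
    return [(t, vocab.get(normalize(t))) for t in local_terms]

records = ["manuscripts, medieval", "Woodcuts"]
for term, uri in reconcile(records, vocabulary):
    print(term, "->", uri or "no match")
```

Tools like Google Refine wrap exactly this matching step in an interactive interface, adding fuzzy matching and live lookups against remote vocabulary services; the terms left with "no match" are the ones a curator then resolves by hand.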
This tutorial proposal is part of the Free your Metadata research project (www.freeyourmetadata.org). The website offers a variety of videos, screencasts and documentation on how to use Google Refine to clean and reconcile metadata with controlled vocabularies already connected to the Linked Data cloud. It also offers an overview of previous presentations and workshops.
Matthew Jockers, Elijah Meeks (Stanford University, USA)
Tuesday, 17 July, 13:30 – 17:00 h, Room 222, West Wing
Jan Rybicki (Jagiellonian University, Poland), Maciej Eder (Pedagogical University, Poland)
Stylometry, or the study of measurable features of (literary) style, such as sentence length, vocabulary richness and various frequencies (of words, word lengths, word forms, etc.), has been around at least since the middle of the 19th century, and has found numerous practical applications in authorship attribution research. These applications are usually based on the belief that there exist such conscious or unconscious elements of personal style that can help detect the true author of an anonymous text; that there exist stylistic fingerprints that can betray the plagiarist; that the oldest authorship disputes (St. Paul’s epistles or Shakespeare’s plays) can be settled with more or less sophisticated statistical methods.
While specific issues remain largely unresolved (or, if closed once, are sooner or later reopened), a variety of statistical approaches has been developed that allow texts by several authors to be identified, often with spectacular precision, on the basis of a single example of each author's writing. But even more interesting research questions arise beyond bare authorship attribution: patterns of stylometric similarity and difference also provide new insights into relationships between different books by the same author; between books by different authors; between authors differing in chronology or gender; between translations of the same author or group of authors; helping, in turn, to find new ways of looking at works that seem to have been studied from all possible perspectives. Nowadays, in the era of ever-growing computing power and ever more literary texts available in electronic form, we are able to perform stylometric experiments that our predecessors could only dream of.
This half-day workshop is a hands-on introduction to stylometric analysis in the programming language R, using an emerging tool, a collection of Maciej Eder’s and Jan Rybicki’s scripts, which perform multivariate analyses of the frequencies of the most frequent words, the most frequent word n-grams, and the most frequent letter n-grams. One of the scripts produces Cluster Analysis, Multidimensional Scaling, Principal Component Analysis and Bootstrap Consensus Tree graphs based on Burrows’s Delta and other distance measures; it applies additional (and optional) procedures, such as Hoover’s “culling” and pronoun deletion. As by-products, it can be used to generate various frequency lists; a stand-alone word-frequency-maker is also available. Another script provides insight into state-of-the-art supervised techniques of classification, such as Support Vector Machines, k-Nearest Neighbor classification, or, more classically, Delta as developed by Burrows. Our scripts have already been used by other scholars to study Wittgenstein’s dictated writings or, believe it or not, DNA sequences!
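The distance measure at the heart of this approach, Burrows's Delta, is simple enough to sketch outside R: z-score the relative frequencies of the corpus's most frequent words, then average the absolute z-score differences between two texts. The following is a compact illustration on toy texts (not the workshop's scripts, which are written in R and do far more):

```python
# Burrows's Delta on a toy corpus: z-score the relative frequencies
# of the most frequent words (MFW) across the corpus, then take the
# mean absolute z-score difference between two texts.
from collections import Counter
from statistics import mean, pstdev

texts = {
    "A": "the cat sat on the mat the cat",
    "B": "a dog ran in the park the dog",
    "C": "the the the cat dog mat park a",
}

# Relative word frequencies per text.
freqs = {}
for name, text in texts.items():
    words = text.split()
    c = Counter(words)
    freqs[name] = {w: c[w] / len(words) for w in c}

# Feature set: the most frequent words of the whole corpus.
corpus_counts = Counter(w for t in texts.values() for w in t.split())
mfw = [w for w, _ in corpus_counts.most_common(5)]

# Mean and (population) standard deviation of each word's frequency.
means = {w: mean(freqs[t].get(w, 0.0) for t in texts) for w in mfw}
sds = {w: pstdev([freqs[t].get(w, 0.0) for t in texts]) for w in mfw}

def z(t, w):
    """z-score of word w's frequency in text t across the corpus."""
    return (freqs[t].get(w, 0.0) - means[w]) / sds[w] if sds[w] else 0.0

def delta(t1, t2):
    """Burrows's Delta: mean absolute difference of z-scores over the MFW."""
    return mean(abs(z(t1, w) - z(t2, w)) for w in mfw)

print(round(delta("A", "B"), 3))
```

In attribution practice the pairwise Delta values form a distance matrix, which is exactly what Cluster Analysis, Multidimensional Scaling, or a Bootstrap Consensus Tree then visualises.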
The workshop will be an opportunity to see this in practice on a variety of text collections, investigated for authorial attribution, translatorial attribution, genre, gender, chronology… Text collections in a variety of languages will be provided; workshop attendees are welcome to bring even more texts (in either plain text or TEI XML format). No previous knowledge of R is necessary: our script is very user-friendly (and very fast)!
During a brief introduction, (1) R will be installed on the users’ laptops from the Internet (if it has not been already installed); (2) participants will receive CDs/pendrives with the script(s), a short quickstart guide and several text collections prepared for analysis; (3) some theory behind this particular stylometric approach will be discussed, and the possible uses of the tools presented will be summarized. After that and (4) a short instruction, participants will move on to (5) hands-on analysis to produce as many different results as possible to better assess the various aspects of stylometric study; (6) additional texts might be downloaded from the Internet or added by the participants themselves. The results, both numeric and visualizations, will be analyzed. For those more advanced in R (or S, or Matlab), details of the script (R methods, functions, and packages) will be discussed.