As a non-specialist in data preservation research, I’m finding that my ignorance about a lot of sector-specific jargon (or perhaps ‘terminology’ is a bit more friendly : ) ) can actually work in our favour on this project. Those who know their data preservation/records management vocabulary employ it – as in any specialist field – for accurate and concise communication, which is crucial for advancing specialist research in the field. The Incremental project, however, is all about making research data management make sense to the users, whose primary activity is researching, say, medieval manuscripts or heart disease or Caravaggio’s painting methods or deep-sea mammal life.
The uncomfortable truth is that a lot of people – including those who work in research in universities – don’t know that records management or data curation (or whatever we think it should be called) even exists as a field. Given this, it should come as no surprise that terms in regular use across information curation, data management and related fields mean little or nothing to a lot of people who nevertheless need to know about the ideas, methods and processes these baffling terms describe, in order to have a chance of accessing their valuable research data now or in the future.
So we know we have to translate research data management vocabulary from specialist to non-specialist – but how? We need to find words and phrases that at least have a chance of making sense to researchers who are not specialists in information management / records management / data preservation / data curation. See? Even trying to get across this simple point leads me into a linguistic minefield. Archivists, librarians, IT specialists and other professionals involved in information preservation don’t have an agreed vocabulary for activities and roles in this area, so we on this project really have our work cut out trying to make sense to anyone else!
As part of work this month to scope out the existing data management guidance resources in UK universities, I’ve stumbled across an example: in a few UK universities, ‘research data management’ is used to mean how a researcher goes about gathering information in the first place, i.e. the nuts and bolts of formulating questionnaires and interview schema, of choosing which software to use to record and share this data, and of how researchers can order and manipulate their information as they work on their project. To me, a lot of this is ‘research methods’, not ‘research data management’. However, that’s how quite a few UK universities use the term on their guidance pages. In other institutions, the term is used to mean how one looks after the data after the work is completed, and how one takes care of its longevity, access, integrity and security. Some other institutions include both these processes in their advice, but that’s rare. The Digital Curation Centre, a prominent force in, well, digital curation, offers a useful (if slightly intimidating, the first time you see it) lifecycle model (http://www.dcc.ac.uk/sites/default/files/documents/publications/DCCLifecycle.pdf) which supports this third, inclusive view, i.e. that data exists within a lifecycle, beginning with the creation or receipt of the data, and finishing with how to look after it in perpetuity (or until it should be disposed of). One phrase: three related but distinct views of what it might mean.
There are probably researchers in UK universities with dozens of other ideas about what the phrase ‘research data management’ might mean or involve – and they probably don’t like the sound of any of them. So how do we find vocabulary that makes sense to non-research-data-management-specialists, and actually makes them feel alright about engaging in good data management practices? Is it all about sticking with an agreed established term and hoping that if we say it enough, users will eventually get their heads around what it means? Is it about contextualising data management guidance and advice in researcher-friendly areas of institutional websites, alongside a whacking great glossary of what we mean by each of the terms we use? Or should we proceed by enthusiastic promotion of the benefits to researchers of good data management, with the awkward vocabulary and frightening names for things tagged on – again with glossary – as unattractive but necessary addenda, like the copious small print at the bottom of loan adverts?