Passing on data management skills

March 30, 2011

In the training session of the MRD International workshop we heard from the five RDMTrain projects and DaMSSI – the support project led by DCC and RIN.

All five projects are producing discipline-specific training materials targeted at postgraduate researchers and PhD students.  The disciplines covered are:
(n.b. project name and contact person in brackets)

  • Creative arts (CAiRO – Stephen Gray)
  • Archaeology (DataTrain – Lindsay Lloyd-Smith)
  • Anthropology (DataTrain – Irene Peano)
  • Social science (MANTRA – Cuna Ekmekcioglu)
  • Geoscience (MANTRA – Cuna)
  • Psychology (DMTpsych – Richard Plant & MANTRA – Cuna)
  • Public health (DATUM – Julie McLeod)

They’re delivering varied lengths and formats of course: a 5-day summer school (CAiRO); 4-6 fortnightly modules / lectures (DataTrain archaeology, DATUM and DMTpsych); a one day interactive workshop (DataTrain anthropology); and eight online learning units (MANTRA).

The projects are typically tying their provision in with existing postgraduate research methods courses.  MANTRA will become part of the University of Edinburgh’s Transkills programme.

Concerns were raised about the volume of information out there and the need to keep courses light and interesting (as data management is often considered to be dull!)  Projects are re-telling anecdotes and horror stories, as well as focusing on breakout exercises and discussion so learning is centred around the student experience.

Some highlights and take home messages from the presentations:

  • CAiRO’s “Arts vs science: death-match” slide gave an insight into how arts research is different to science (and what implications this has on training)
  • DataTrain courses are led by recent PhDs, as they’re more in tune with students (also felt to be a more sustainable model)
  • DATUM project has put together a custom Google search of useful DM training sources
  • DMTpsych found researchers preferred printed DMP guidance so they could read offline (in the bath!) and discuss ideas to decide what to include in their plan
  • MANTRA is licensing all content as attribution only for widest re-use – respect!
  • Transferable skills are a key focus of DaMSSI.  They’re working with professional bodies to endorse career profiles, which show data management skills are useful in all walks of life.

All these projects run until July 2011.  Find out more on the JISC MRD programme page.

Vocabulary/jargon/terminology: synonyms and specialist language

July 14, 2010

As a non-specialist in data preservation research, I’m finding that my ignorance about a lot of sector-specific jargon (or perhaps ‘terminology’ is a bit more friendly : ) ) can actually work in our favour on this project.  Whilst those who know their data preservation/records management vocabulary employ it – as in any specialist field – for its role in accurate and concise communication, and this is crucial in order to advance specialist research in the field, the Incremental project is all about making research data management make sense to the users, whose primary activity is researching, say, medieval manuscripts or heart disease or Caravaggio’s painting methods or deep-sea mammal life.

The uncomfortable truth is that a lot of people – including those who work in research in universities – don’t even know records management or data curation (or whatever we think it should be called) even exists as a field.  Given this fact, it should come as no surprise that terms in regular use across information curation, data management and related fields mean little or nothing to a lot of people who, nevertheless, need to know about the ideas, methods and processes these baffling terms describe, in order to have a chance of accessing their valuable research data now or in the future.

So we know we have to translate research data management vocabulary from specialist to non-specialist – but how?  We need to find words and phrases that at least have a chance of making sense to researchers who are not specialists in information management / records management / data preservation / data curation.  See?  Even trying to get across this simple point leads me into a linguistic minefield.  Archivists, librarians, IT specialists and other professionals involved in information preservation don’t have an agreed vocabulary for activities and roles in this area, so we on this project really have our work cut out trying to make sense to anyone else!

As part of work this month to scope out the existing data management guidance resources in UK universities, I’ve stumbled across an example: in a few UK universities, ‘research data management’ is used to mean how a researcher goes about gathering information in the first place, i.e. the nuts and bolts of the formulation of questionnaires and interview schema, of which software to use to record and share this data, and how researchers can order and manipulate their information as they work on their project.  To me, a lot of this is ‘research methods’, not ‘research data management’.  However, that’s how quite a few UK universities use the term on their guidance pages.  In other institutions, the term is used to mean how one looks after the data after the work is completed, and how one takes care of its longevity, access, integrity and security.  Some other institutions include both these processes in their advice, but that’s rare.  The Digital Curation Centre, a prominent force in, well, digital curation, offers a useful (if slightly intimidating, the first time you see it) lifecycle model, which supports this third, inclusive view, i.e. that data exists within a lifecycle, beginning with the creation or receipt of the data, and finishing with how to look after it in perpetuity (or until it should be disposed of).  One phrase: three related but distinct views of what it might mean.

There are probably researchers in UK universities with dozens of other ideas about what the phrase ‘research data management’ might mean or involve – and they probably don’t like the sound of any of them.  So how do we find vocabulary that makes sense to non-research-data-management-specialists, and actually makes them feel alright about engaging in good data management practices?  Is it all about sticking with an agreed established term and hoping that if we say it enough, users will eventually get their heads around what it means?  Is it about contextualising data management guidance and advice in researcher-friendly areas of institutional websites, alongside a whacking great glossary of what we mean by each of the terms we use?  Or should we proceed by enthusiastic promotion of the benefits to researchers of good data management, with the awkward vocabulary and frightening names for things tagged on – again with glossary – as unattractive but necessary addenda, like the copious small print at the bottom of loan adverts?

Scoping study and implementation plan released

July 2, 2010

We are pleased to announce that the Incremental project Scoping study and implementation plan is now available. This report describes our findings from interviews and informal conversations with dozens of researchers and technical support staff across the Universities of Cambridge and Glasgow and outlines our implementation plan for the coming months of the project.

Some highlights from the report:

Simple issues cause serious risks and irritation

Many researchers, across disciplines, are unaware of the best formats and storage media to preserve or share files, and many have no clear naming or file structure conventions. These kinds of relatively simple issues pose the risk of serious data losses in the short and long term, and frequently cost researchers time and cause frustration as they search for data or try to revive old files.

Resources must be simple, engaging and easy to access

Researchers were interested in guidance, simple tools, and support for data management, but this came with several caveats. Information needs to be clear, quick to understand, engaging, and relevant to their circumstances. They are often unaware of existing resources and training and don’t know where to look for support. Many complained that training is often inconveniently timed and not well-tailored to their needs, suggesting online resources, ‘a really smart little leaflet’ or someone to talk to face-to-face would be more helpful.

Language matters

Our study underscored the need to provide jargon-free guidance – most researchers don’t know what ‘digital curation’ is and humanities researchers don’t think of their manuscripts as ‘data’. Researchers and support staff tended to be suspicious of ‘policies,’ which sound like hollow mandates, but were sometimes receptive to ‘procedures’ or ‘advice’ which may be essentially the same thing, but convey a sense of purpose and assistance rather than requirement.

And so, here are our plans:

1. Produce simple, accessible, visual guidance on creating, storing, and managing data

This will include producing easy-to-navigate centralised webpages at each institution, pointing researchers to existing support and new resources created by the project. We’ll consider the format of guidance and move towards more engaging formats such as illustrated fact sheets, flow diagrams, checklists, and FAQs.

2. Offer practical data training with discipline-specific examples and local champions

We will work with enthusiasts within departments to embed slides and resources within existing training and inductions (i.e. train the trainer). We will also create brief online tutorials and/or screen-casts, and include case-studies from within disciplines wherever possible.

3. Connect researchers with support staff who offer one-to-one advice, guidance, and partnering

We will work with departments and the research office within each institution to make sure that researchers are referred to existing support staff for one-to-one advice during the proposal-writing stage of projects and beyond.

4. Work towards the development of a comprehensive data management infrastructure

This project is part of an overall effort to support data management and preservation activities at both institutions, and will be continued through the broader research data infrastructure and policy development at Cambridge and Glasgow.

Very exciting! For more information, have a look at the report.  As ever, we welcome your thoughts and suggestions.

Scoping Study Report and Implementation Plan

July 1, 2010

Our scoping study and implementation planning are now complete, and we will shortly be publishing a report detailing the scoping study methodology and the key findings that emerged from it (e.g. the concerns and issues that researchers have surrounding the management of research data), together with our recommendations and planned activities to address those needs.

Watch this space!

More new staff for Incremental

June 29, 2010

Hi I’m Kellie and along with Laura I’m a new addition to the Incremental team at HATII, University of Glasgow. Prior to joining Incremental I worked on the Planets project for four years, planning, organising and evaluating the training programme and alongside Laura managing the successful delivery of the ‘Digital Preservation – the Planets Way’ series of events. I was also involved in two qualitative research workpackages for the project which explored the ways users work with digital collections and future usage scenarios. A Chartered Librarian by trade, I’ve worked in a variety of sectors including higher education, local government and specialist libraries before deciding to join the digital preservation and curation research sphere.

New staff join ‘Incremental’

June 28, 2010

I’m Laura and I’ve just joined the Incremental project team, where I’m delighted to be working alongside Sarah and Kellie at HATII, University of Glasgow, as well as Catharine and Lesley at Cambridge.  My previous experience in information preservation research was on the Planets project, where Kellie and I worked on delivering the outreach and training events, educating librarians, archivists and technical people about Planets tools and services for digital preservation.

I was also a member of staff of the Performing Arts centre of the late, lamented Arts and Humanities Data Service, along with Sarah.  I keep that particular flame alive by continuing to obsess about the preservation of live performance.

My profile is available at

Implementation planning meeting, Glasgow

May 11, 2010

5th May – 7th May

After flight cancellations and a long, albeit picturesque, journey up to Glasgow, the Cambridge team made it up to HATII last Wednesday and had a very productive two days.

The objectives of our visit were to:

  • Discuss and compare Glasgow’s findings with those from the Cambridge requirements gathering phase.
  • Agree on a number of recommendations to address the key issues and concerns identified.
  • Agree on format and content of the presentation at the JISC programme meeting on the 17th.
  • Make a start on producing a report, outlining our findings, recommendations and implementation plan.

When we left to catch our flights on Friday evening (the ash thankfully no longer an issue) we had achieved all of the above. Phew!

An interesting outcome from our discussions is worth mentioning: it had been envisaged that there might be (some) different concerns or issues between institutions and disciplines, but in reality few were noted from the requirements gathering. Rather, it seems that it is actually an issue of available resources (e.g. practical guidance and training) and technical infrastructure.

The Incremental project believes it can most helpfully address the former, whilst acknowledging that the provision of resources sits within a wider, long-term goal of developing a comprehensive data management infrastructure.

Sharing Ideas (and sandwiches) in Manchester: the JISC Managing Research Data Workshop

April 6, 2010

We had an exciting and productive day in Manchester on 12th March, when we got to meet people from other JISC MRD projects and learn about the diverse approaches that they are taking.

The challenge of building researchers’ interest and enthusiasm for data management seemed a near-universal issue among projects, though the solutions for addressing this problem varied.  For example, the fine folks at ADMIRAL try to grab people briefly and repeatedly first thing in the morning to avoid disrupting their days. Some programmes are creating a discipline-specific infrastructure, incentivising researchers with honorariums for workshop participation, or are leveraging institutional support to encourage local participation. (The latter probably isn’t an option in the somewhat decentralised environs of Cambridge, but it has definitely been helpful for Glasgow’s scoping). For the most part, we are finding that in most other programmes, as in ours, there are very few sticks available, but plenty of carrots if you know how to spot them in the garden patch.

Some of the ideas that came out of the user requirements session were intriguing, while others seem a bit beyond our reach. One participant suggested that departments/universities use automatic classification systems to determine the relative value of data that users are holding. A nice idea, but we can’t see it working with our multi-disciplinary aims or the diverse and metadata-free systems of some researchers.

One idea that has potential is to gather anonymous anecdotes about risky behaviour and what can go, and has gone, wrong. These could be separated by categories like storage, metadata, funding, roles and responsibilities, etc. We have definitely encountered those horror stories where a year of data is lost or half a lifetime of data is uninterpretable – but it’s more often a story about a colleague than about the researcher speaking to us personally.

In another session, Tom Howard of the ERIM Project produced some beautiful slides to get us thinking about modelling research workflows. This is something that we haven’t explored much yet here at Incremental, so it was exciting to see emerging approaches. If we do this, we’ll probably use fairly free-form and high-level boxes and arrows to help researchers visualise their processes and locate the intervention points with the most gain for the least pain. Crayons might be involved (just kidding).

In another session, we discussed data management plans. While we have made one of our own and will revisit it as the project progresses (hooray!), we find that this is atypical for the average researcher on the move. So far, we have found that it is a rare PI who shares this plan with team members (e.g. post-docs) or checks to see that it’s being followed. Other projects have had similar experiences. The key seems to be (a) to get university research offices on-board, and, even more importantly (b) connect the data management plans to the project outputs and anticipated uses of the research data in the near future.

Overall: a great day of meeting, greeting, and learning about each other’s projects. We’re excited to share progress in May and see how everyone’s next steps are going.

EIDCSR Workshop – Oxford – 29.03.2010

April 6, 2010

This proved to be a very useful and thought-provoking workshop. It was interesting to hear the different approaches taken by the Universities of Oxford, Melbourne and Edinburgh in tackling data curation.

Points taken away from the day:

There is lots of information out there – the challenge is: how to package it up to make it relevant to researchers?

Often those engaged in data curation are working in isolation – how can we bring these people together, to share expertise?

It is recognised that policies need to be implementable – i.e. backed up with discipline-specific guidance, so that it makes sense to researchers.

How do we get researchers thinking about preservation at the beginning of the lifecycle, rather than at the end?

Lots to think about!

Welcome to the Incremental project blog!

March 15, 2010

The Incremental project is funded by JISC under the Managing Research Data programme.

The project is a collaboration between Cambridge University Library and the Humanities Advanced Technology and Information Institute (HATII) at the University of Glasgow. The project is a first step in improving and facilitating the day-to-day and long-term management of research data in higher education institutions (HEIs).

We shall be using this blog to let you know what’s happening with the project as it progresses.

