Passing on data management skills

March 30, 2011

In the training session of the MRD International workshop we heard from the five RDMTrain projects and DaMSSI – the support project led by DCC and RIN.

All five projects are producing discipline-specific training materials targeted at postgraduate researchers and PhD students.  The disciplines covered are:
(n.b. project name and contact person in brackets)

  • Creative arts (CAiRO – Stephen Gray)
  • Archaeology (DataTrain – Lindsay Lloyd-Smith)
  • Anthropology (DataTrain – Irene Peano)
  • Social science (MANTRA – Cuna Ekmekcioglu)
  • Geoscience (MANTRA – Cuna)
  • Psychology (DMTpsych – Richard Plant & MANTRA – Cuna)
  • Public health (DATUM – Julie McLeod)

They’re delivering courses of varied lengths and formats: a 5-day summer school (CAiRO); 4-6 fortnightly modules / lectures (DataTrain archaeology, DATUM and DMTpsych); a one-day interactive workshop (DataTrain anthropology); and eight online learning units (MANTRA).

The projects are typically tying their provision in with existing postgraduate research methods courses.  MANTRA will become part of the University of Edinburgh’s Transkills programme.

Concerns were raised about the volume of information out there and the need to keep courses light and interesting (as data management is often considered to be dull!).  Projects are re-telling anecdotes and horror stories, as well as focusing on breakout exercises and discussion, so learning is centred around the student experience.

Some highlights and take home messages from the presentations:

  • CAiRO’s “Arts vs science: death-match” slide gave an insight into how arts research is different to science (and what implications this has for training)
  • DataTrain courses are led by recent PhDs, as they’re more in tune with students (also felt to be a more sustainable model)
  • DATUM project has put together a custom Google search of useful DM training sources
  • DMTpsych found researchers preferred printed DMP guidance so they could read offline (in the bath!) and discuss ideas to decide what to include in the plan
  • MANTRA is licensing all content as attribution only for widest re-use – respect!
  • Transferable skills are a key focus of DaMSSI.  They’re working with professional bodies to endorse career profiles, which show data management skills are useful in all walks of life.

All these projects run until July 2011.  Find out more on the JISC MRD programme page.


Costing data management

November 8, 2010

There have been a few events of late on costing research data management. Two that I’ve attended are a meeting of the RDMF and a JISC workshop.

Roles and responsibilities were a key theme. Is data management the concern of researchers, their institutions, funders or disciplinary data centres? At the RDMF, Jeff Haywood, Vice Principal for Knowledge Management at the University of Edinburgh, described the institution as the place of last resort for preserving data. They hope to direct researchers to external data centres where possible but want to keep a register of the data, so they know where their assets are and can act to secure them if external services are under threat of closure.

A breakout session at the RDMF on institutional solutions versus national data centres reached a similar conclusion. It isn’t a matter of choice – we have to live with a mixed landscape. It was argued there should be more services at local level: a sort of first-step data management service. A series of handovers could then scale up to various levels as appropriate, based on the nature of the data, the available infrastructure and the specific requirements of each case. Jeff’s argument holds well in this scenario – HEIs don’t need to provide a complete infrastructure, just add to existing provision where required and, most importantly, know what they own and where it is.

At the JISC workshop, Andrew Bush of KPMG addressed how costs can be built into research funding bids when there’s a gap in provision.  He recommended that data management support costs should be recovered through indirect costs, as this is apparently where research councils see them being placed. He advised against classing data management infrastructure as a research facility, as facility costs should only be applied when a project actually uses the facility, not on every bid, so the facility needs to work at capacity. Also, as projects typically draw on data management infrastructure once finished, it’s better to include this as an indirect cost. It seems research funders are willing to meet data management costs, but it’s quite an untested area, so examples of how people have costed in support would be welcome.

One aspect where headway has been made is in defining some of those costs. The JISC MRD projects have been asked to identify researcher needs and pilot services to address these. At Leicester they’ve been investigating the provision of ‘good enough’ data centres, which provide robust but cheaper storage to researchers. The cost comparison was £400 per TB per year versus the usual £1 a GB a day on university SANs. Jonathan Tedds reported that the reception to this has been overwhelming, as researchers often struggle to manage their own storage and back-up efficiently. Comparable charges were noted by other JISC projects too.

More work is underway across the MRD programme on defining benefits and business models for sustainability. This will be presented at the International workshop in Birmingham in March 2011.

Sudamih training workshop

July 30, 2010

Thursday, 22nd July. The day started according to plan – get up 6:15: check, get to station 6:55: check, train to Richmond 6:58: check, 4-minute journey to catch the 7:06 to Reading…. boo….stupid tube. As we sat stuck at a red signal outside Richmond station, I watched as my train to Reading whizzed by, kicking myself for not getting the earlier tube. Now I would have to wait for 30 minutes at Richmond for the next train (which meant I could have had a crucial 30 minutes more under the covers :( ) and, more importantly, call the Sudamih team to say I was going to be late…umm luckily I wasn’t speaking till 10am but still, it was going to be a little tight and I would miss out on the welcome coffee and biccies 😦  Arrived into Oxford at 9:20 and, after a super speedy taxi ride, arrived at Oxford’s e-research building just as everyone was going in…phew…

As promised, the workshop proved to be wide ranging with speakers from the Digital Curation Centre, the Research Information Network, Vitae (the national researcher training body), and projects at Oxford and King’s College London. All the talks were interesting and generated useful discussions on data management training from the institutional to the national level. I won’t go into all of them as further details of the workshop and copies of presentations can be found here.

But to highlight a few: James Wilson (project manager of Oxford’s Sudamih project) kicked off the meeting with a talk on the findings from their scoping study to assess current data management practices in the humanities.  Findings were, reassuringly, similar to ours, with researchers requesting guidance and training on a range of data management issues.

So how do they propose to address these? Well, in their view there is a clear need for both broad courses and more detailed technical training. This may take the form of an introduction to data management which will be integrated into existing courses, guidance on how to organise and link research notes with sources, support with how to prepare technical bids and the creation of a database service for the humanities.  Very interesting and I can definitely see an opportunity to collaborate/share resources.

I was up next, talking about Incremental’s scoping study, our findings and how we plan to address these in terms of guidance and, in particular, training.

Finally, some interesting thoughts from Eric Meyer of Oxford University, who reported some early findings from a study that looks at the information practices of those researching in the humanities.  Of particular interest was the finding that researchers are taught disciplinary biases very early on in their careers; for example, they develop clear views on which sources of information are deemed valuable and which are not.  When it came to citation practices, researchers and students cited lots of digital publications but then indicated that they had consulted the paper version as well!  Is the digital version seen as less trustworthy?

Eric also drew our attention to the first-year annual report of a three-year study (JISC/BL) tracking the research behaviour of Generation Y doctoral students (children of the Baby Boomers, born between 1982 and 1994).  The assumption that Generation Y would be early adopters and keen users of the latest technology applications and tools in their research was, in fact, not supported by the study. On the contrary, it would appear that Generation Y doctoral students, in common with others, are quite risk averse and ‘behind the curve’ in using digital technology, not at the forefront; and this despite the fact that the majority appear to be keen users of the latest technology applications in their personal lives.

The reason for this, they propose, is not a lack of skill but, more likely, that the students do not see the immediate utility of the technology within their research and their preferred ways of working.  This is an important finding, and one that Incremental should bear in mind when it points researchers towards the web 2.0 tools that are out there.