Passing on data management skills

March 30, 2011

In the training session of the MRD International workshop we heard from the five RDMTrain projects and DaMSSI – the support project led by DCC and RIN.

All five projects are producing discipline-specific training materials targeted at postgraduate researchers and PhD students.  The disciplines covered are:
(n.b. project name and contact person in brackets)

  • Creative arts (CAiRO – Stephen Gray)
  • Archaeology (DataTrain – Lindsay Lloyd-Smith)
  • Anthropology (DataTrain – Irene Peano)
  • Social science (MANTRA – Cuna Ekmekcioglu)
  • Geoscience (MANTRA – Cuna)
  • Psychology (DMTpsych – Richard Plant & MANTRA – Cuna)
  • Public health (DATUM – Julie McLeod)

They’re delivering courses of varied lengths and formats: a 5-day summer school (CAiRO); 4-6 fortnightly modules / lectures (DataTrain archaeology, DATUM and DMTpsych); a one-day interactive workshop (DataTrain anthropology); and eight online learning units (MANTRA).

The projects are typically tying their provision in with existing postgraduate research methods courses.  MANTRA will become part of the University of Edinburgh’s Transkills programme.

Concerns were raised about the volume of information out there and the need to keep courses light and interesting (as data management is often considered to be dull!).  Projects are re-telling anecdotes and horror stories, as well as focusing on breakout exercises and discussion, so learning is centred around the student experience.

Some highlights and take home messages from the presentations:

  • CAiRO’s “Arts vs science: death-match” slide gave an insight into how arts research differs from science (and what implications this has for training)
  • DataTrain courses are led by recent PhDs, as they’re more in tune with students (also felt to be a more sustainable model)
  • DATUM project has put together a custom Google search of useful DM training sources
  • DMTpsych found researchers preferred printed DMP guidance so they could read offline (in the bath!) and discuss ideas to decide what to include in their plan
  • MANTRA is licensing all content as attribution only for widest re-use – respect!
  • Transferable skills are a key focus of DaMSSI.  They’re working with professional bodies to endorse career profiles, which show data management skills are useful in all walks of life.

All these projects run until July 2011.  Find out more on the JISC MRD programme page.


SEMINAR ANNOUNCEMENT: Managing Performance Data and Documentation: a free Incremental seminar at University of Glasgow

January 19, 2011

Free seminar: Managing Performance Data and Documentation

Register now at http://tiny.cc/performancedata.

Thursday 17 February 2011, 10am-3pm (incl. lunch)

Venue: Turnbull Hall, Southpark Terrace, Southpark Avenue, University of Glasgow, Glasgow G12 8LG.  A Google map for the venue is at: http://tinyurl.com/4le623e.

Cost: Free (yes, including lunch!)

Who it’s for: Researchers, performers, research-related staff and postgraduate students working in the live and performing arts.

Register now at: http://tiny.cc/performancedata.

Research in the live and performing arts produces interesting and varied types of documentation and data, including text, images, audio and video.  On Thursday 17 February, we will bring together researchers and performers working in the live and performing arts across the UK, to inspire and provide guidance for better management of these materials.

  • In the morning session, a panel of researchers and artists from across the UK will share inspirational case studies about how they tackled their data management challenges.
  • In the afternoon, experts from the University of Glasgow will provide guidance on varied data types used in the live and performing arts, and raise awareness of specific support available for researchers and students at the University.

Researchers, performers, postgraduate students and research-related staff working in the live and performing arts are all very welcome.  Registration is required, but free.

Please register as soon as possible to attend – registration closes at 12 noon on Monday 14th February.  For more information, a full programme and to register, please visit http://tiny.cc/performancedata.

Any questions?  Please email Laura Molloy, at: Laura.Molloy[at]Glasgow.ac.uk.

This free seminar will take place on Thu 17 February 2011, 10am – 3pm (including lunch) at the Turnbull Hall, Southpark Terrace, Southpark Avenue, University of Glasgow, Glasgow G12 8LG.

This seminar is supported by the JISC Incremental project at the University of Glasgow, which aims to help researchers across all disciplines to manage and care for their research data and records.  Incremental’s website is at: http://www.lib.cam.ac.uk/preservation/incremental/glasgow.html.



Attending the International Digital Curation Conference (IDCC10)

December 21, 2010

Saturday 4th December – After a couple of movies, lunch and 40 winks, I touched down safely in Chicago. They call it the windy city, but as I exited the ‘L’ (Chicago’s equivalent to the tube) downtown, I was hit by minus degree temperatures and the realisation that my M&S gloves were not going to cut it…

This was my first time at the International Digital Curation Conference (IDCC). The theme this year was ‘growing the curation community’, which is very relevant to the aims of Incremental, and I was keen to use this opportunity not only to tell everyone about Incremental (my first victim had been the lady at immigration, when she asked why I was in the US…) but also to hear what others are doing in the field in both the UK and US.

The programme promised some interesting speakers, and it didn’t disappoint. The two highlights for me were Chris Lintott, Astronomer and PI of Galaxy Zoo, and MacKenzie Smith, Associate Director for Technology at MIT Libraries, who presented two different perspectives on digital curation.

Chris Lintott gave a fascinating talk describing a number of projects which use crowdsourcing, such as Galaxy Zoo, which gets the public to classify galaxies imaged as part of the Sloan Digital Sky Survey, and Old Weather, which uses the public to digitise weather observations recorded in ships’ logbooks.

MacKenzie Smith talked about how digital curation tends to be viewed through a technology lens, whereas an alternative view sees curation from an organisational perspective. She described the various layers of digital curation, such as storage, management, linking, discovery and delivery of data, and argued that rather than just one institution or group, it is a combination of research groups, professional societies, data centres, libraries and archives, businesses, universities and funding agencies all interoperating in digital curation. The question is what role each of them should play in digital curation, and how.

Presentations from other JISC-funded research data management or training projects, given by James Wilson of Oxford, Robin Rice of Edinburgh and Wendy White of Southampton, were also of great interest.

On the Wednesday afternoon, we got to present our paper ‘Making sense: talking data management with researchers’. We were given a slot in the ‘Digital Curation Education’ parallel session and were able to describe our approach and plans for supporting and training researchers in data management.

It was interesting to hear the other presentations in this session, particularly as these served to highlight the different approaches that the UK and the US are taking in digital curation education.   Whilst the UK is focusing on researchers’ data management practice, the US are focusing very much on educating the library community in digital curation. One issue, two different approaches.

After hearing both data creators and data managers talk about the challenges of data curation, I took away a clear message from the conference: researchers have the expertise in creating and using their data, but they do not need to manage their data alone. Data curation is a collaborative effort, and other services such as IT and libraries can play a key role too, by supporting researchers to make informed decisions.


Costing data management

November 8, 2010

There have been a few events of late on costing research data management. Two that I’ve attended are:

Roles and responsibilities were a key theme. Is data management the concern of researchers, their institutions, funders or disciplinary data centres? At the RDMF, Jeff Haywood, Vice Principal for Knowledge Management at University of Edinburgh, described the institution as the place of last resort for preserving data. They hope to direct researchers to external data centres where possible but are concerned to keep a register of the data so they know where their assets are and can act to secure these if external services are under the threat of closure.

A breakout session at the RDMF on institutional solutions versus national data centres reached a similar conclusion. It isn’t a matter of choice – we have to live with a mixed landscape. It was argued there should be more services at local level: a sort of first step data management service. A series of handovers could then scale up to various levels as appropriate based on the nature of the data, the available infrastructure and the specific requirements of each case. Jeff’s argument holds well in this scenario – HEIs don’t need to provide a complete infrastructure, just add to existing provision where required and most importantly know what they own and where this is.

At the JISC workshop, Andrew Bush of KPMG addressed how costs can be built into research funding bids when there’s a gap in provision.  He recommended that data management support costs should be recovered through indirects, as this is apparently where research councils see them being placed. He advised not to class data management infrastructure as research facilities, as the cost of these should only be applied when the facility is used by a project – not on every bid – so you need to work at capacity. Also, as projects typically draw on data management infrastructure once finished, it’s better to include this as an indirect cost. It seems research funders are willing to meet data management costs but it’s quite an untested area so examples of how people have costed in support would be welcome.

One aspect where headway has been made is in defining some of those costs. The JISC MRD projects have been asked to identify researcher needs and pilot services to address these. At Leicester they’ve been investigating the provision of ‘good enough’ data centres, which provide robust but cheaper storage to researchers. The cost comparison was £400 per TB per year versus the usual £1 a GB a day on university SANs. Jonathan Tedds reported that the reception to this has been overwhelming, as researchers often struggle to manage their own storage and back-up efficiently. Comparable charges were noted by other JISC projects too.

More work is underway across the MRD programme on defining benefits and business models for sustainability. This will be presented at the International workshop in Birmingham in March 2011.


Sharing Ideas (and sandwiches) in Manchester: the JISC Managing Research Data Workshop

April 6, 2010

We had an exciting and productive day in Manchester on 12th March, when we got to meet people from other JISC MRD projects and learn about the diverse approaches that they are taking.

The challenge of building researchers’ interest and enthusiasm for data management seemed a near-universal issue among projects, though the solutions for addressing this problem varied.  For example, the fine folks at ADMIRAL try to grab people briefly and repeatedly first thing in the morning to avoid disrupting their days. Some programmes are creating a discipline-specific infrastructure, incentivising researchers with honorariums for workshop participation, or leveraging institutional support to encourage local participation. (The latter probably isn’t an option in the somewhat decentralised environs of Cambridge, but it has definitely been helpful for Glasgow’s scoping.) We are finding that in most other programmes, as in ours, there are very few sticks available, but plenty of carrots if you know how to spot them in the garden patch.

Some of the ideas that came out of the user requirements session were intriguing, while others seem a bit beyond our reach. One participant suggested that departments/universities use automatic classification systems to determine the relative value of data that users are holding. A nice idea, but we can’t see it working with our multi-disciplinary aims or the diverse and metadata-free systems of some researchers.

One idea that has potential is to gather anonymous anecdotes about risky behaviour and what can go, and has gone, wrong. These could be separated into categories like storage, metadata, funding, roles and responsibilities, etc. We have definitely encountered those horror stories where a year of data is lost or half a lifetime of data is uninterpretable, but it’s more often a story about a colleague than about the researcher speaking to us personally.

In another session, Tom Howard of the ERIM Project produced some beautiful slides to get us thinking about modelling research workflows. This is something that we haven’t explored much yet here at Incremental, so it was exciting to see emerging approaches. If we do this, we’ll probably use fairly free-form and high-level boxes and arrows to help researchers visualise their processes and locate the intervention points with the most gain for the least pain. Crayons might be involved (just kidding).

In another session, we discussed data management plans. While we have made one of our own and will revisit it as the project progresses (hooray!), we find that this is atypical for the average researcher on the move. So far, we have found that it is a rare PI who shares this plan with team members (e.g. post-docs) or checks to see that it’s being followed. Other projects have had similar experiences. The key seems to be (a) to get university research offices on board and, even more importantly, (b) to connect the data management plans to the project outputs and anticipated uses of the research data in the near future.

Overall: a great day of meeting, greeting, and learning about each other’s projects. We’re excited to share progress in May and see how everyone’s next steps are going.


EIDCSR Workshop – Oxford – 29.03.2010

April 6, 2010

This proved to be a very useful and thought-provoking workshop. It was interesting to hear the different approaches taken by the Universities of Oxford, Melbourne and Edinburgh in tackling data curation.

Points taken away from the day:

There is lots of information out there – the challenge is: how to package it up to make it relevant to researchers?

Often those engaged in data curation are working in isolation – how can we bring these people together, to share expertise?

It is recognised that policies need to be implementable – i.e. backed up with discipline-specific guidance, so that it makes sense to researchers.

How do we get researchers thinking about preservation at the beginning of the lifecycle, rather than at the end?

Lots to think about!