Initial test user thoughts, and my cautious optimism

November 5, 2010

We recently had our very first guinea pig researcher evaluation exercise of the nascent Cambridge web pages for data management support. This was, as you might imagine, very helpful. We’ve been squinting into our computer screens for a while on these, and this gave us an opportunity to take a quick step back and make some adjustments.

We have now received a bit of support and encouragement for our (a) FAQ-centric format, (b) light-weight level of detail with many ‘further reading’ links for adventurous souls, (c) categories of support/guidance, i.e. ‘managing your data’, ‘organising your data’, ‘accessing your data’, and ‘looking after your data’, and (d) topics.  So, with this first data point, I breathe a tentative sigh of relief. Whew!

There were also some well-earned criticisms and helpful suggestions. Here are some of those:

  • Practice what you preach! (We have instructed researchers to use open formats, and, where necessary, use PDF/A rather than PDF formats. But we had a lone Excel spreadsheet attachment on one of our pages. Hmm…)
  • ‘Teaser’ text explaining links or categories must be precise and complete. (This may sound obvious, but it’s encouraging to see that our test user read and depended on this text, and we will give it some more thought).
  • Including Pros & Cons is pretty much always helpful, especially when a question has no definitive answer that will serve all users best.
  • ‘Return to top’ buttons are a site user’s best friend!
  • Reminder: There are preferred formats for preserving file content in the long term, and preferred formats for preserving maximum file usability in the short/medium term, and the two aren’t always the same; we need to make sure that users understand this.

We’re continuing the work on the Cambridge and Glasgow websites, associated training resources, and data-management related workshops hosted at each institution.

As part of this process, we are hoping to send some of our draft resources your way soon, for your thoughts and appraisal. More on that soon – watch this space!

And, of course, as always, please send your thoughts on website usability and communication in our direction! Finally: have a safe, fun, and scalding-free fireworks weekend!


The crossing point

August 20, 2010

I’ve been looking at existing online provision of research data management guidance at leading UK universities and, yes, I’ve found something of a trend.  There may be some useful guidance on each website, but it’s anyone’s guess where it is, and it’s certainly not all in one, easy-to-find place.

Many – if not all – UK universities have some webpages aimed at researchers.  These are usually called ‘Research Support’, ‘Our Research Environment’, ‘Research Services’ or something scary relating to commercialisation and knowledge transfer.  Anyway, they’re usually pretty obvious when you find them because they’re full of photos of attractive people wearing safety specs and looking intently at things in test-tubes.  The text is reassuring, generally promising to hold the hands of researchers through all aspects of finding, bidding for and managing funding for research.  Oddly, though, they don’t often say anything about looking after that valuable information which people are going to the lengths of giving you money to gather in the first place.

Then, in another place on the website entirely, usually in the ‘Staff’ webpages, we find information on training and development.  Elsewhere again is the usually-bewildering IT support department website, with the services and tools they provide.

And then we must look elsewhere for the information – if it’s online at all – about records management or information management.  If you’re lucky enough to actually locate these pages, you’ve often followed a trail – entirely by chance, I should imagine – that goes something like, ‘Home > About the university > Governance and administration > University directorates > Records management and information access > Legal compliance > Records management’.  I wish I was making this up.  Alternatively, you could try the free text Google search box and hope that your choice from ‘records management’, ‘information management’, ‘data management’, ‘research data management’, ‘managing research data’ or ‘research information’ comes up trumps.

Elsewhere, we find some webpages aimed at library users.  These pages, naturally, take the reader through using the library, how to find things, how to get hold of your subject librarian, should you still be lucky enough to have one, and any special collections or galleries the library may be attached to.  This is often – but not always – where you find any mention of the institutional repository.

Yes, the institutional repository, or ‘IR’: often as not, it’s not linked from anywhere except maybe an obscure corner of the IT services website, or maybe a dusty by-way on the library webpages.  Sometimes we only know it exists because the SHERPA list tells us so.  Sometimes even then, it doesn’t turn up online.  When this is the case, you could be forgiven for resorting to your university website’s ‘A-Z’ index – but wait!   It turns out that the IR is very, very unlikely to be listed there under ‘institutional’ or ‘repository’ or ‘archive’ or even ‘research’.  Most university IRs seem to be called something cute, often a name from classical mythology which nobody can remember the relevance of, or a witty acronym from which a highly unlikely title has been tortuously back-formed.  Sometimes they’re just plain baffling and you may as well just search the whole site for ‘EPrints’ and hope for the best.

My point is this – if you are a researcher in need of data management guidance (in the widest, ‘lifecycle’, understanding of the term), you need a little bit of input from each of these places, throughout the life of your project.

  • You need to know from the library where to find the resources you need for your work, if you don’t want to trust your review of literature to the likes of Google.
  • You need the staff training or development service to provide you with training on the research software or methods you want to use, and which will allow you to preserve your data in a meaningful way.
  • You need the records management people to let you know what the university thinks you should be keeping, what you should be getting rid of and what the best ways to do these things are.
  • You need to know from the institutional repository how you can submit your work, what format it should be in, what your rights are if you do submit a piece of research to them, and how other people are going to find your work.
  • You need the research support people for funder-specific data management requirements, and to let you know if there’s a research-specific data management policy that differs from the general, institutional records management and/or retention policy.
  • You need to know from IT support what your IT people are prepared to offer you in terms of access to specialist software, equipment, data storage, back-up services and the rest of it.
  • And – crucially – when you’re writing that last-minute bid for funding, you need smooth interaction between these departments to answer questions like, ‘What’s the best way to record my findings during the project and share them with the rest of the team?’, ‘Where and how should I store my data?’, ‘Are IT services responsible for backing up my research data?’, ‘Will my funder pay for the cost of a new server and staff time to administer it?’, ‘Will my funder let me publish my findings in the institutional repository?’, ‘Should I keep my research data once I’ve published or submitted my findings, and if so, where?’ and probably ‘What is a technical appendix anyway?’

The information needed to reliably answer such questions often falls between the realms of IT services and research support services, or research support and the institutional repository, or research support and the training people, or – well, you get the idea.

Help with managing research data is provided by many institutions, but delivery is fragmented and inconsistent.  In many institutions, these resources or pieces of guidance are separate islands, with no crossing points between them. This is no good to researchers – it makes finding guidance much more difficult and time-consuming than it needs to be. You may have found contacts through your personal network or the protocol of your department to help you with this stuff but if you’re new, out of the loop or just not so lucky, bids can be faulty or delayed, funding missed out on and, as a result, research careers damaged.

I say all this based, as mentioned at the start, on a survey I recently undertook of the websites of twenty leading UK universities, which I studied as a random visitor.  I found evidence of just under a third offering any kind of researcher-specific data management advice online (although it should be noted that I didn’t have access to staff-only intranets).  The remaining two-thirds or so of university websites apparently provided only records management advice for either unspecified types of records or specifically for administrative records only (although of course a lot of the practice outlined was still highly relevant to research data).

I gave myself five minutes on each site to get to the research data management advice, if it existed, by navigation of likely-looking links.  After that time, I resorted to the free text search box.   In ninety percent of cases, I had to use the all-site search in order to find any records management or information management guidance at all.   Only one of the twenty university websites appeared to offer any link between the data management advice pages and the IR.  (I’d be interested to know what percentage of university research staff at each institution know a) what an institutional repository is; b) whether they have one; c) what it’s called and d) where it is online. Hey, I think I’ll find that out …)

Only fifteen percent of websites visited listed their IR in the website A-Z index in a way that you’d be able to find it without knowing its cute, in-house name, and a quarter of institutions listed it only under this name.

So, in short, to improve matters, universities need to consider the pieces of guidance they already supply their research staff about data management, and draw them together to form comprehensive, simple resources that will make sense to the working researcher, who has little time and no specialist data management knowledge.  These resources should act as crossing points between previously-separate realms.  And this is where the opportunity is for Incremental to make things better for researchers.

If we can find good practice in UK university research data management guidance, whether that’s in a well-written list of FAQs, or a well-organised website pulling together guidance from across a university website into one accessible, obvious place, then all to the good.  If we can’t find this, or find enough of it, we need to start making it and positioning it on our respective university websites in a way that is prominent and intuitive for research staff of that university. These connections can be the crossing points to help researchers get to the guidance they need, when they need it, and if we manage that, I think Incremental’s job is done!

Does your university offer meaningful help with data management?  Or are you struggling to find the assistance you need to look after your data?  Are you responsible for promoting one of these services at your institution?  Let us know in the comments.

Team building with buns

July 16, 2010

the Incremental team

Monday 12th July saw a jolting 5am start to catch the early flight down to London. Laura had been over in Vienna for the Planets project review so we met at Stansted for a coffee to bring us all round. A quick train ride, hotel drop and jaunt through Cambridge town centre got us to the Uni Library raring to go. So, the plan for the two days:

  • review the resources we’ve uncovered so far to identify gaps;
  • consider models we could learn from to produce clear, meaningful, easy-to-find support and guidance;
  • start to formulate the structure and content of the webpages;
  • plan evaluation and dissemination activities;
  • and a chance for Kellie to meet the rest of the team!

Most of the reviewing and initial brainstorming was achieved on day one, but it was on Tuesday that we got real progress – not least because we had a helping hand 😉 Morning sustenance came in the form of Chelsea buns, as recommended by Cambridge-beau Stephen Fry. Just look at the picture – mmmm!

Fitzbillies' Chelsea buns

We picked them up on the way in – just about managing to drag Laura from Fitzbillies before she pressed them for his address (they deliver his favourite buns direct). Fully charged, we ploughed ahead and agreed an initial structure, as well as the content and text for the homepage, allowing us to split up the next tasks. A first mock-up won’t be too long coming, so watch this space!

There was lots of useful discussion in-between on how to make sure what we develop really does match up with what users want. We’re mindful to check that what researchers have expressed and what we’ve understood matches in reality (cue ongoing evaluation through observation and iterative development). Spreading the word (and as you’ll see from Laura’s vocab post, quite what word(s) is a moot point!) both by engaging with users and reaching out to service providers through advocacy is key. Like the DCC, we’re trying to assume a bridging, mediation role in which we pull together and position guidance and support so people can utilise it to the full. We’re also keen to embed messages in existing training and make sure that training is available when and how researchers need it. There’s a lot of DCC training and Planets resources we can build on here.

So, lots to do. Better crack on…

Vocabulary/jargon/terminology: synonyms and specialist language

July 14, 2010

As a non-specialist in data preservation research, I’m finding that my ignorance about a lot of sector-specific jargon (or perhaps ‘terminology’ is a bit more friendly : ) ) can actually work in our favour on this project.  Whilst those who know their data preservation/records management vocabulary employ it – as in any specialist field – for its role in accurate and concise communication, and this is crucial in order to advance specialist research in the field, the Incremental project is all about making research data management make sense to the users, whose primary activity is researching, say, medieval manuscripts or heart disease or Caravaggio’s painting methods or deep-sea mammal life.

The uncomfortable truth is that a lot of people – including those who work in research in universities – don’t even know records management or data curation (or whatever we think it should be called) even exists as a field.  Given this fact, it should come as no surprise that terms in regular use across information curation, data management and related fields mean little or nothing to a lot of people who, nevertheless, need to know about the ideas, methods and processes these baffling terms describe, in order to have a chance of accessing their valuable research data now or in the future.

So we know we have to translate research data management vocabulary from specialist to non-specialist – but how?  We need to find words and phrases that at least have a chance of making sense to researchers who are not specialists in information management / records management / data preservation / data curation.  See?  Even trying to get across this simple point leads me into a linguistic minefield.  Archivists, librarians, IT specialists and other professionals involved in information preservation don’t have an agreed vocabulary for activities and roles in this area, so we on this project really have our work cut out trying to make sense to anyone else!

As part of work this month to scope out the existing data management guidance resources in UK universities, I’ve stumbled across an example: in a few UK universities, ‘research data management’ is used to mean how a researcher goes about gathering information in the first place, i.e. the nuts and bolts of the formulation of questionnaires and interview schema, of which software to use to record and share this data, and how researchers can order and manipulate their information as they work on their project.  To me, a lot of this is ‘research methods’, not ‘research data management’.  However, that’s how quite a few UK universities use the term on their guidance pages.  In other institutions, the term is used to mean how one looks after the data after the work is completed, and how one takes care of its longevity, access, integrity and security.  Some other institutions include both these processes in their advice, but that’s rare.  The Digital Curation Centre, a prominent force in, well, digital curation, offers a useful (if slightly intimidating, the first time you see it) lifecycle model, which supports this third, inclusive view, i.e. that data exists within a lifecycle, beginning with the creation or receipt of the data, and finishing with how to look after it in perpetuity (or until it should be disposed of).  One phrase: three related but distinct views of what it might mean.

There are probably researchers in UK universities with dozens of other ideas about what the phrase ‘research data management’ might mean or involve – and they probably don’t like the sound of any of them.  So how do we find vocabulary that makes sense to non-research-data-management-specialists, and actually makes them feel alright about engaging in good data management practices?  Is it all about sticking with an agreed established term and hoping that if we say it enough, users will eventually get their heads around what it means?  Is it about contextualising data management guidance and advice in researcher-friendly areas of institutional websites, alongside a whacking great glossary of what we mean by each of the terms we use?  Or should we proceed by enthusiastic promotion of the benefits to researchers of good data management, with the awkward vocabulary and frightening names for things tagged on – again with glossary – as unattractive but necessary addenda, like the copious small print at the bottom of loan adverts?

Scoping study and implementation plan released

July 2, 2010

We are pleased to announce that the Incremental project Scoping study and implementation plan is now available. This report describes our findings from interviews and informal conversations with dozens of researchers and technical support staff across the Universities of Cambridge and Glasgow and outlines our implementation plan for the coming months of the project.

Some highlights from the report:

Simple issues cause serious risks and irritation

Many researchers, across disciplines, are unaware of the best formats and storage media to preserve or share files, and many have no clear naming or file structure conventions. These kinds of relatively simple issues pose the risk of serious data losses in the short and long term, and frequently cost researchers time and cause frustration as they search for data or try to revive old files.

Resources must be simple, engaging and easy to access

Researchers were interested in guidance, simple tools, and support for data management, but this came with several caveats. Information needs to be clear, quick to understand, engaging, and relevant to their circumstances. They are often unaware of existing resources and training and don’t know where to look for support. Many complained that training is often inconveniently timed and not well-tailored to their needs, suggesting online resources, ‘a really smart little leaflet’ or someone to talk to face-to-face would be more helpful.

Language matters

Our study underscored the need to provide jargon-free guidance – most researchers don’t know what ‘digital curation’ is and humanities researchers don’t think of their manuscripts as ‘data’. Researchers and support staff tended to be suspicious of ‘policies,’ which sound like hollow mandates, but were sometimes receptive to ‘procedures’ or ‘advice’ which may be essentially the same thing, but convey a sense of purpose and assistance rather than requirement.

And so, here are our plans:

1. Produce simple, accessible, visual guidance on creating, storing, and managing data

This will include producing easy-to-navigate centralised webpages at each institution, pointing researchers to existing support and new resources created by the project. We’ll consider the format of guidance and move towards more engaging formats such as illustrated fact sheets, flow diagrams, checklists, and FAQs.

2. Offer practical data training with discipline-specific examples and local champions

We will work with enthusiasts within departments to embed slides and resources within existing training and inductions (i.e. train the trainer). We will also create brief online tutorials and/or screen-casts, and include case-studies from within disciplines wherever possible.

3. Connect researchers with support staff who offer one-to-one advice, guidance, and partnering

We will work with departments and the research office within each institution to make sure that researchers are referred to existing support staff for one-to-one advice during the proposal-writing stage of projects and beyond.

4. Work towards the development of a comprehensive data management infrastructure

This project is part of an overall effort to support data management and preservation activities at both institutions, and will be continued through the broader research data infrastructure and policy development at Cambridge and Glasgow.

Very exciting! For more information, have a look at the report.  As ever, we welcome your thoughts and suggestions.

Sharing Ideas (and sandwiches) in Manchester: the JISC Managing Research Data Workshop

April 6, 2010

We had an exciting and productive day in Manchester on 12th March, when we got to meet people from other JISC MRD projects and learn about the diverse approaches that they are taking.

The challenge of building researchers’ interest and enthusiasm for data management seemed a near-universal issue among projects, though the solutions for addressing this problem varied.  For example, the fine folks at ADMIRAL try to grab people briefly and repeatedly first thing in the morning to avoid disrupting their days. Some programmes are creating a discipline-specific infrastructure, incentivising researchers with honorariums for workshop participation, or are leveraging institutional support to encourage local participation. (The latter probably isn’t an option in the somewhat decentralised environs of Cambridge, but it has definitely been helpful for Glasgow’s scoping.) We are finding that in most other programmes, as in ours, there are very few sticks available, but plenty of carrots if you know how to spot them in the garden patch.

Some of the ideas that came out of the user requirements session were intriguing, while others seem a bit beyond our reach. One participant suggested that departments/universities use automatic classification systems to determine the relative value of data that users are holding. A nice idea, but we can’t see it working with our multi-disciplinary aims or the diverse and metadata-free systems of some researchers.

One idea that has potential is to gather anonymous anecdotes about risky behaviour and what can go, and has gone, wrong. This could be separated by categories like storage, metadata, funding, roles and responsibilities, etc. We have definitely encountered those horror stories where a year of data is lost or half a lifetime of data is uninterpretable – but it’s more often a story about a colleague than about the researcher speaking to us personally.

In another session, Tom Howard of the ERIM Project produced some beautiful slides to get us thinking about modelling research workflows. This is something that we haven’t explored much yet here at Incremental, so it was exciting to see emerging approaches. If we do this, we’ll probably use fairly free-form and high-level boxes and arrows to help researchers visualise their processes and locate the intervention points with the most gain for the least pain. Crayons might be involved (just kidding).

In another session, we discussed data management plans. While we have made one of our own and will revisit it as the project progresses (hooray!), we find that this is atypical for the average researcher on the move. So far, we have found that it is a rare PI who shares this plan with team members (e.g. post-docs) or checks to see that it’s being followed. Other projects have had similar experiences. The key seems to be (a) to get university research offices on board, and, even more importantly, (b) to connect the data management plans to the project outputs and anticipated uses of the research data in the near future.

Overall: a great day of meeting, greeting, and learning about each other’s projects. We’re excited to share progress in May and see how everyone’s next steps are going.

EIDCSR Workshop – Oxford – 29.03.2010

April 6, 2010

This proved to be a very useful and thought-provoking workshop. It was interesting to hear the different approaches taken by the Universities of Oxford, Melbourne and Edinburgh in tackling data curation.

Points taken away from the day:

There is lots of information out there – the challenge is: how to package it up to make it relevant to researchers?

Often those engaged in data curation are working in isolation – how can we bring these people together, to share expertise?

It is recognised that policies need to be implementable – i.e. backed up with discipline-specific guidance, so that it makes sense to researchers.

How do we get researchers thinking about preservation at the beginning of the lifecycle, rather than at the end?

Lots to think about!