Incremental in the press!

November 2, 2010

Well, in the Digital Preservation Coalition newsletter, at least!  Find us in the ‘Who’s Who: Sixty Second Interview’ section here:

http://www.dpconline.org/newsroom/whats-new/651-whats-new-issue-31-november-2010#WhosWho31

One of the most difficult things is keeping responses short and pithy.  The more I thought about the questions, the more I wanted to say.  (The other most difficult thing is finding a photo I was prepared to put in the article!)

So, have a quick read and let us know what you think.  Was I talking nonsense?  Tell us in the comments!


The crossing point

August 20, 2010

I’ve been looking at existing online provision of research data management guidance at leading UK universities and, yes, I’ve found something of a trend.  There may be some useful guidance on each website, but it’s anyone’s guess where it is, and it’s certainly not all in one, easy-to-find place.

Many – if not all – UK universities have some webpages aimed at researchers.  These are usually called ‘Research Support’, ‘Our Research Environment’, ‘Research Services’ or something scary relating to commercialisation and knowledge transfer.  Anyway, they’re usually pretty obvious when you find them because they’re full of photos of attractive people wearing safety specs and looking intently at things in test-tubes.  The text is reassuring, generally promising to hold the hands of researchers through all aspects of finding, bidding for and managing funding for research.  Oddly, though, they don’t often say anything about looking after that valuable information which people are going to the lengths of giving you money to gather in the first place.

Then, in another place on the website entirely, usually in the ‘Staff’ webpages, we find information on training and development.  We must look elsewhere again for the usually-bewildering IT support department website and the services and tools it provides.

And then we must look elsewhere for the information – if it’s online at all – about records management or information management.  If you’re lucky enough to actually locate these pages, you’ve often followed a trail – entirely by chance, I should imagine – that goes something like, ‘Home > About the university > Governance and administration > University directorates > Records management and information access > Legal compliance > Records management’.  I wish I was making this up.  Alternatively, you could try the free text Google search box and hope that your choice from ‘records management’, ‘information management’, ‘data management’, ‘research data management’, ‘managing research data’ or ‘research information’ comes up trumps.

Elsewhere, we find some webpages aimed at library users.  These pages, naturally, take the reader through using the library, how to find things, how to get hold of your subject librarian, should you still be lucky enough to have one, and any special collections or galleries the library may be attached to.  This is often – but not always – where you find any mention of the institutional repository.

Yes, the institutional repository, or ‘IR’: often as not, it’s not linked from anywhere except maybe an obscure corner of the IT services website, or maybe a dusty by-way on the library webpages.  Sometimes we only know it exists because the SHERPA list tells us so.  Sometimes even then, it doesn’t turn up online.  When this is the case, you could be forgiven for resorting to your university website’s ‘A-Z’ index – but wait!   It turns out that the IR is very, very unlikely to be listed there under ‘institutional’ or ‘repository’ or ‘archive’ or even ‘research’.  Most university IRs seem to be called something cute, often a name from classical mythology which nobody can remember the relevance of, or a witty acronym from which a highly unlikely title has been tortuously back-formed.  Sometimes they’re just plain baffling and you may as well just search the whole site for ‘EPrints’ and hope for the best.

My point is this – if you are a researcher in need of data management guidance (in the widest, ‘lifecycle’, understanding of the term), you need a little bit of input from each of these places, throughout the life of your project.

  • You need to know from the library where to find the resources you need for your work, if you don’t want to trust your review of literature to the likes of Google.
  • You need the staff training or development service to provide you with training on the research software or methods you want to use, and which will allow you to preserve your data in a meaningful way.
  • You need the records management people to let you know what the university thinks you should be keeping, what you should be getting rid of and what the best ways to do these things are.
  • You need to know from the institutional repository how you can submit your work, what format it should be in, what your rights are if you do submit a piece of research to them, and how other people are going to find your work.
  • You need the research support people for funder-specific data management requirements, and to let you know if there’s a research-specific data management policy that differs from the general, institutional records management and/or retention policy.
  • You need to know from IT support what your IT people are prepared to offer you in terms of access to specialist software, equipment, data storage, back-up services and the rest of it.
  • And – crucially – when you’re writing that last-minute bid for funding, you need smooth interaction between these departments to answer questions like, ‘What’s the best way to record my findings during the project and share them with the rest of the team?’, ‘Where and how should I store my data?’, ‘Are IT services responsible for backing up my research data?’, ‘Will my funder pay for the cost of a new server and staff time to administer it?’, ‘Will my funder let me publish my findings in the institutional repository?’, ‘Should I keep my research data once I’ve published or submitted my findings, and if so, where?’ and probably ‘What is a technical appendix anyway?’

The information needed to reliably answer such questions often falls between the realms of IT services and research support services, or research support and the institutional repository, or research support and the training people, or – well, you get the idea.

Help with managing research data is provided by many institutions, but delivery is fragmented and inconsistent.  In many institutions, these resources or pieces of guidance are separate islands, with no crossing points between them. This is no good to researchers – it makes finding guidance much more difficult and time-consuming than it needs to be. You may have found contacts through your personal network or the protocol of your department to help you with this stuff but if you’re new, out of the loop or just not so lucky, bids can be faulty or delayed, funding missed out on and, as a result, research careers damaged.

I say all this based, as mentioned at the start, on a survey I recently undertook, as a random visitor, of the websites of twenty leading UK universities.  I found evidence of just under a third offering any kind of researcher-specific data management advice online (although it should be noted that I didn’t have access to staff-only intranets).  The other two-thirds of university websites apparently provided only records management advice for either unspecified types of records or specifically for administrative records only (although of course a lot of the practice outlined was still highly relevant to research data).

I gave myself five minutes on each site to get to the research data management advice, if it existed, by navigation of likely-looking links.  After that time, I resorted to the free text search box.   In ninety percent of cases, I had to use the all-site search in order to find any records management or information management guidance at all.   Only one of the twenty university websites appeared to offer any link between the data management advice pages and the IR.  (I’d be interested to know what percentage of university research staff at each institution know a) what an institutional repository is; b) whether they have one; c) what it’s called and d) where it is online. Hey, I think I’ll find that out …)

Only fifteen percent of websites visited listed their IR in the website A-Z index in a way that you’d be able to find it without knowing its cute, in-house name, and a quarter of institutions listed it only under this name.

So, in short, to improve matters, universities need to consider the pieces of guidance they already supply their research staff about data management, and draw them together to form comprehensive, simple resources that will make sense to the working researcher, who has little time and no specialist data-management knowledge.  These resources should act as crossing points between previously-separate realms.  And this is where the opportunity is for Incremental to make things better for researchers.

If we can find good practice in UK university research data management guidance, whether that’s in a well-written list of FAQs, or a well-organised website pulling together guidance from across a university website into one accessible, obvious place, then all to the good.  If we can’t find this, or find enough of it, we need to start making it and positioning it on our respective university websites in a way that is prominent and intuitive for research staff of that university. These connections can be the crossing points to help researchers get to the guidance they need, when they need it, and if we manage that, I think Incremental’s job is done!

Does your university offer meaningful help with data management?  Or are you struggling to find the assistance you need to look after your data?  Are you responsible for promoting one of these services at your institution?  Let us know in the comments.


Vocabulary/jargon/terminology: synonyms and specialist language

July 14, 2010

As a non-specialist in data preservation research, I’m finding that my ignorance about a lot of sector-specific jargon (or perhaps ‘terminology’ is a bit more friendly : ) ) can actually work in our favour on this project.  Whilst those who know their data preservation/records management vocabulary employ it – as in any specialist field – for its role in accurate and concise communication, and this is crucial in order to advance specialist research in the field, the Incremental project is all about making research data management make sense to the users, whose primary activity is researching, say, medieval manuscripts or heart disease or Caravaggio’s painting methods or deep-sea mammal life.

The uncomfortable truth is that a lot of people – including those who work in research in universities – don’t know that records management or data curation (or whatever we think it should be called) even exists as a field.  Given this fact, it should come as no surprise that terms in regular use across information curation, data management and related fields mean little or nothing to a lot of people who, nevertheless, need to know about the ideas, methods and processes these baffling terms describe, in order to have a chance of accessing their valuable research data now or in the future.

So we know we have to translate research data management vocabulary from specialist to non-specialist – but how?  We need to find words and phrases that at least have a chance of making sense to researchers who are not specialists in information management / records management / data preservation / data curation.  See?  Even trying to get across this simple point leads me into a linguistic minefield.  Archivists, librarians, IT specialists and other professionals involved in information preservation don’t have an agreed vocabulary for activities and roles in this area, so we on this project really have our work cut out trying to make sense to anyone else!

As part of work this month to scope out the existing data management guidance resources in UK universities, I’ve stumbled across an example: in a few UK universities, ‘research data management’ is used to mean how a researcher goes about gathering information in the first place, i.e. the nuts and bolts of the formulation of questionnaires and interview schema, of which software to use to record and share this data, and how researchers can order and manipulate their information as they work on their project.  To me, a lot of this is ‘research methods’, not ‘research data management’.  However, that’s how quite a few UK universities use the term on their guidance pages.  In other institutions, the term is used to mean how one looks after the data after the work is completed, and how one takes care of its longevity, access, integrity and security.  Some other institutions include both these processes in their advice, but that’s rare.  The Digital Curation Centre, a prominent force in, well, digital curation, offers a useful (if slightly intimidating, the first time you see it) lifecycle model (http://www.dcc.ac.uk/sites/default/files/documents/publications/DCCLifecycle.pdf) which supports this third, inclusive view, i.e. that data exists within a lifecycle, beginning with the creation or receipt of the data, and finishing with how to look after it in perpetuity (or until it should be disposed of).  One phrase: three related but distinct views of what it might mean.

There are probably researchers in UK universities with dozens of other ideas about what the phrase ‘research data management’ might mean or involve – and they probably don’t like the sound of any of them.  So how do we find vocabulary that makes sense to non-research-data-management-specialists, and actually makes them feel alright about engaging in good data management practices?  Is it all about sticking with an agreed established term and hoping that if we say it enough, users will eventually get their heads around what it means?  Is it about contextualising data management guidance and advice in researcher-friendly areas of institutional websites, alongside a whacking great glossary of what we mean by each of the terms we use?  Or should we proceed by enthusiastic promotion of the benefits to researchers of good data management, with the awkward vocabulary and frightening names for things tagged on – again with glossary – as unattractive but necessary addenda, like the copious small print at the bottom of loan adverts?