Digital repositories: Dealing with the digital deluge

By David Davies June 6, 2007

I’m at the JISC digital repositories conference. So far a real mixed bag. As you might expect there’s a lot of waffle about web 2.0 and how disruptive technologies are changing the nature of repositories. I say waffle because the discussion about web 2.0 largely seems to be limited by web 1.0 or 0.1 thinking. It’s no longer sufficient I think to put up a slide listing so-called 2.0 services, you know, the usual suspects, Flickr, MySpace, YouTube … etc. You don’t get insight into how these technologies are changing the teaching & learning landscape just by naming them. Sadly the conclusion after one of the keynotes was to the effect “we need to learn what makes web 2.0 services so successful and apply that to repositories of the future”. Sorry mate, the world won’t wait for you and by the time you’ve figured it out it’ll have moved on.

A question was asked in one of the sessions. “What will a web 2.0 repository look like?” I don’t know what the repository will look like, but for me the interface will be my RSS aggregator.

Oh and also at the conference the JISC launched Depot. Another repository, this time for ePrints. It’s aimed at UK academics who don’t have access to their own institutional ePrints repository. The FAQs are great:

4. What can I put in the Depot?

“Typically this will be an electronic duplicate of a peer-reviewed journal article […] this is the version of your article after all of the changes due to the peer review process have been incorporated into the text.”

5. Will my publisher allow me to do this?

“This depends on the Copyright Transfer Agreement that you have signed with your publisher. Most publishers allow some sort of self archiving. […] If your publishing agreement does not allow you to deposit in a repository, it might be possible to negotiate with the publisher to give you a right to archive your article.”

So that’s likely a ‘no’, then, at least if you want to share your work with others via the repository.


19 thoughts on “Digital repositories: Dealing with the digital deluge”

Peter Burnhill says:

June 7, 2007 at 12:39 pm

You seem to be selective in your reporting, not sure why. In particular, you have edited out the sentence that says “Most publishers allow some sort of self archiving”.

The FAQ for the Depot actually says:

***
5 Will my publisher allow me to do this?
This depends on the Copyright Transfer Agreement that you have signed with your publisher. Most publishers allow some sort of self archiving. If you do not have a convenient copy of the contract that you signed with your publisher, then consult RoMEO, http://www.sherpa.ac.uk/romeo. This is a service that is run by SHERPA which lists the details of standard Copyright Transfer Agreements as they are given by different publishers. The database can be searched by publisher name or by journal title.
“If your publishing agreement does not allow you to deposit in a repository, it might be possible to negotiate with the publisher to give you a right to archive your article. In the first instance write to the editor of the journal in which you published your article and ask their permission.”

But thanks for the publicity, and your comments are useful feedback perhaps that the FAQ needs looking at again in order to avoid misunderstanding from such selective reading.
David says:

June 7, 2007 at 12:57 pm

I guess experiences vary and I speak as I find. Our institution is piloting an ePrints repository and the experience to date has been that it’s very difficult to get approval for the final or near final versions of submitted manuscripts. Our pilot has therefore had to use pre-editorial versions, which are not terribly useful.
Peter Burnhill says:

June 7, 2007 at 1:11 pm

If you are from an institution in the ac.uk domain, why not try the Depot for your own material, or even promote the facility among your colleagues, until you get beyond the pilot phase?

Two things to note:

1. You can always deposit your work, it’s the access that might be constrained – but read through all the FAQ, especially the part about ‘closed access’ and the Request button.

2. “Put it in the Depot” is good advice to give even to those who do have an IR, as there is a mechanism in the Depot to re-direct them to the homepage of their Institutional Repository where this exists.

Note also that there is a facility for you to give other detailed feedback that might help improve the utility of the Depot, for you and for others.

Re-reading your initial comments, might I ask that you edit these – at least to add the missing sentence: “Most publishers allow some sort of self archiving”.

The omission, even if signalled by […], does mislead.
David says:

June 7, 2007 at 1:22 pm

I’ll pass your comments on to our library group. As we do have an institutional repository, albeit in pilot, it makes little sense to dilute that by submitting e-prints elsewhere, but I’ll certainly make sure colleagues are aware of that as an option. I have linked to the Depot FAQs for readers to read the full version. The spirit of my piece was that in my experience and in that of many colleagues they are not allowed by [many/some/a number] of publishers to deposit electronic copies of published papers for sharing with colleagues, thereby rendering a repository less useful than it might be. This is not a criticism of repositories, but of copyright restrictions, of which there clearly need to be some. The point about having an institutional repository pilot is to explore these issues and how we manage our e-prints and IP.
Peter Burnhill says:

June 7, 2007 at 2:09 pm

Thanks, but to repeat, or maybe to re-state:

(i) The Depot has both a re-direct function and a repository function: if you have an IR (pilot or not) that you want populated, then register with OpenDOAR, and promoting the Depot will have the effect of directing traffic to your IR website; otherwise the researcher/author can use the Depot as a keep-safe that exposes the article under OA and is there for the IR to take over when beyond the pilot stage.

Have a look at
http://www.opendoar.org/countrylist.phpcContinent=Europe#United%20Kingdom

and see whether your institution is there. Better still, go to
depot.edina.ac.uk
and try it out.

(ii) And, note that you are mis-quoting the FAQ by excluding a statement that is counter to your concluding remark.

It would cost little to quote the sentence in full: “Most publishers allow some sort of self archiving”. You are entitled to your concluding remark, but it is surely not proper to continue to mis-quote someone else as evidence in support of your remark.

All that said, I sympathise with the frustration that you express. Incidentally, you may find the following of interest:

http://listserver.sigmaxi.org/sc/wa.exe?A2=ind07&L=american-scientist-open-access-forum&D=1&O=D&F=l&S=&P=67920
David says:

June 7, 2007 at 2:37 pm

I guess we’ll just have to see how our institutional repository evolves, and how the Depot evolves alongside this. Incidentally one strategy that we have considered is choosing where to publish taking into account any copyright restrictions placed upon authors, but clearly in an RAE-driven world it’s not always that simple.

In a spirit of fairness and balance I have taken into account your request to edit the FAQ quote, and additionally to make more clear my concluding remark I’ve added “at least if you want to share your work with others via the repository.”

Thanks for taking the time to leave a comment.
Tom says:

June 8, 2007 at 11:14 am

David, I’m a bit confused by your negative take on the opening speech. You say: “Sadly the conclusion after one of the keynotes was to the effect ‘we need to learn what makes web 2.0 services so successful and apply that to repositories of the future’.”

I’m not sure why it’s sad that the speaker was saying that those involved in the design and creation of repositories need to find ways of making repositories more successful. Or were you suggesting repositories should be continually doing this anyway, and that kind of response is too slow?

I took his comments to mean that we need to be aware of what users of repositories (directly or indirectly) are used to, and what we can do to make repositories fit more fluently and accurately to the user’s needs.

Do you have your own thoughts on how repositories may develop, or are you taking the view that they’re doomed because of restrictive publisher policies?
David says:

June 8, 2007 at 11:55 am

Hi Tom. I’ve now been to too many presentations about repositories where the same old, now slightly hackneyed, list of so-called web 2.0 services is trotted out as an exemplar of where ‘we’ should be ‘going’ because ‘the kids’ are using these services, while we’re still struggling to make any significant progress with over-engineered systems bereft of any significant content. Seldom if ever have I seen any insight gained by simply listing these, yet the obligatory web 2.0 slide remains in most ed tech presentations. The point I made in my post was that the world will not stand still waiting for educational technologists to catch up (and catch on), and now more than ever, unless we start seeing some next generation repositories that are successful and show a benefit, the whole concept will become moribund. So yes, I do think repository development has generally been slow, but I did not say or imply that trying to find a way of making repositories more useful was sad, and it’s clearly ridiculous to suggest so.

By and large many repositories have been built but ‘they’ have not come. I am actually a great fan of repositories, or at least the concept of putting stuff where other people can find/use it, but for some reason that I’ve yet to fathom the presentations that have irked the most tend to be at JISC meetings. On the first day at the Manchester meeting I heard about project after project that had built something, but was now languishing due to lack of content or interest. The answer, if one were needed, to why very few repositories have made a significant impact is not to build more of the same.

I’m not sure if you were making any points specific to ePrints but here’s an extra observation anyway. Our institution is not alone in being frustrated, in relation to ePrints for example, that we have not been able to use the materials we would like because of publisher restrictions. Is it because we didn’t ask the right questions of the right publishers? Maybe, but unlikely. Is it because everyone else is having great success with their repository but we’re not? No, clearly that’s not the case, as our experience is shared by many. Is it because there are publisher restrictions holding back the free use of intellectual property paid for in many cases by public money or charitable funds? Yes! If we weren’t so driven by the RAE that allows publishers to monopolise the custom and practice of the current peer reviewed literature, then I am sure our institution would encourage its academics to publish in open, less restrictive channels.
Tom says:

June 8, 2007 at 7:22 pm

David, thanks for the clarifications – I’m broadly in agreement with most of what you say.

A couple of extra points. I think the chances of encouraging academics to publish in less restrictive channels are pretty slim (at least for the moment). From my experience, most academics would be very dismissive of any suggestions from others as to where they should publish. One suggestion was raised in one of the conference sessions regarding universities claiming ownership of academic output, and therefore choosing where it should be published. I suspect any attempt at that would be futile and may cause uproar.

My personal view is that a lot of repositories struggle for initial content because they target academics’ previous papers, rather than waiting for their future output. If you ask an author for ‘author final versions’ of their papers from the previous x amount of years, they’re likely to: A – not have them, B – fail to see why they should go to the effort of putting them in the repository.

If, on the other hand, the encouragement was to pass a suitable version to the repository during the publishing process, perhaps there would be more participation (although it would mean many repositories would have to wait to start growing faster). Another alternative might be for repositories to volunteer to help with the submission of articles that have been funded by the funding councils. This could keep researchers happy by reducing their administrative workloads, while ensuring the repository got hold of a suitable copy at the time of submission.

Obviously the suggestions above wouldn’t help with those publishers who allowed no archiving whatsoever. I guess the other option is to wait for academics to produce a system amongst themselves that encourages sharing of research, rather than trying to predict the best way of doing so!
David says:

June 11, 2007 at 5:21 am

Another option is to recognise that many institutions already have a database of published papers through something like a research support service as part of preparation for the RAE. These databases are not repositories in that they don’t contain an electronic copy of the published work, but instead are more like referatories because they contain the citation and other bibliographic details, and some even contain a link (or DOI reference) to the published work. Many institutions have had these for some years and as a result they are already well populated, comprehensive and authoritative. When we discussed our institution’s repository the research database came up as a potential source for (meta)data. Unfortunately this is a standalone system with limited interoperability.
David says:

June 12, 2007 at 10:22 am

A question for anyone reading these comments. What/who is driving the move towards ePrints repositories? In my experience it’s not academics, at least not in my discipline (biomedical science), nor is it librarians. So who?
Bill Hubbard says:

June 12, 2007 at 10:54 am

I share many of your frustrations with the current situation regarding repository population.

As far as publisher policies blocking deposition and open access, this does vary between disciplines, as different publishers with different policies concentrate in different areas. (Of course, there are some larger publishers that publish across disciplines.)

However, overall the picture is brighter. It may be interesting to note that 61% of publishers will allow authors to self-archive their final version of their article. This is often the author’s own final version of the article, but can also sometimes be the publishers’ pdf. (Source: RoMEO statistics: http://www.sherpa.ac.uk/romeo.php?stats=yes) The author’s final version is that which is “signed off” by the author, so should be the same content as that published.

That is 61% of publishers: as this contains some publishers with a very large number of journals, the percentage of journals allowing self-archiving of final versions is higher – over 80%.
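
The gap between the two percentages is a weighting effect: a per-publisher figure counts every publisher once, while a per-journal figure weights each publisher by the number of journals it produces. A minimal sketch of that arithmetic follows. The publisher names, journal counts and permissions in it are invented purely for illustration and are not RoMEO data, so the numbers it prints will not match the 61% and 80% quoted above.

```python
# Hypothetical publisher list: (name, number_of_journals, allows_final_version_archiving).
# All figures are invented for illustration; they are NOT taken from RoMEO.
publishers = [
    ("Large publisher A", 1500, True),
    ("Large publisher B", 900, True),
    ("Mid publisher C", 200, False),
    ("Small publisher D", 20, True),
    ("Small publisher E", 15, False),
    ("Small publisher F", 10, False),
]


def share_of_publishers(rows):
    """Percentage of publishers that allow archiving, each publisher counted once."""
    allowed = sum(1 for _, _, ok in rows if ok)
    return 100 * allowed / len(rows)


def share_of_journals(rows):
    """Percentage of journals covered, each publisher weighted by its journal count."""
    total = sum(count for _, count, _ in rows)
    allowed = sum(count for _, count, ok in rows if ok)
    return 100 * allowed / total


print(f"publishers allowing archiving: {share_of_publishers(publishers):.0f}%")
print(f"journals allowing archiving:   {share_of_journals(publishers):.0f}%")
```

With these made-up rows, half of the publishers allow archiving, yet the journal-level share is much higher, because the two large permissive publishers account for most of the journals.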

A significant difficulty with populating repositories that many have found is that authors do not keep copies of their own final versions. Sometimes publishing processes are done with on-line edits so the author would have to specifically save their own final version, which they are not used to doing, or which may be tricky to do. Without the author keeping a copy of their work, many publishers have restrictions on the use of the final published versions and so the result is often that although the publisher allows archiving, there is no suitable version to use.

Of course, there are more factors than this that affect repository population: some authors are unwilling to archive anything apart from the published version; the RAE process is seen to prefer the publishers’ version, and other things like support, changing working habits etc.

As far as embargoes go: these are an unfortunate and recent addition to the scene. Just a couple of years ago, very few publishers had embargoes, but some major publishers have now introduced them. For RoMEO purposes, any embargo counts as precluding open access archiving (though of course we do note that a delayed form of archiving is possible) so the statistic quoted above accounts for embargoes.

I just wanted to point out that:
* if your remark about the answer being no if you want to share your work with others is based on the idea that articles will be embargoed, then the situation is more positive than that, as the stats refer to unembargoed archiving
* as far as publishers’ archiving *permissions* go, the percentages suggest that instead of the answer being likely a ‘no’ – the answer is actually likely a ‘yes’

Your remark about your research publications database is interesting. Any system which allows a “type-once” process to work is valuable. A number of places are trying to synchronise their bibliographic records with their archive, pre-populating metadata fields for the author. Then all that would be needed for the record to go live would be the supply of a suitable version – which brings us back to the difficulty above.
David says:

June 12, 2007 at 12:19 pm

Bill, thanks for the comment and the hard data, which certainly makes interesting reading. Your observation matches my experience, in that papers I or my colleagues have submitted have been changed by the review process by the time they get published, and that as authors we often have no easy access to an electronic copy of the final version, other than to the PDF that appears in the online journal. There is a kind of Catch-22 in this, something like: you can self archive if you can get a final copy, but as it’s difficult to get an electronic final copy you can’t easily archive in practice.

This makes me wonder about the motivation for ePrint repositories:

1. To allow individual academics or groups to keep electronic copies of their own papers.
2. To share published research with others including academia and the public.
3. For institutions to keep track of their research activity and output and IP.
4. As an alternative to publishing in conventional journals.

Option 1 sounds reasonable but to date the practice has been limited by technology (including skills to use available technology), copyright, and lack of interest (or awareness) by individual academics.

Option 2 appears to be increasing as funding bodies encourage more open access to research output, especially for publicly funded research. But there is a tension with the commercial interests of publishers who have costs associated with services such as editorial and peer review, printing, and marketing, etc. This has generated discussion of alternative ways of funding publishing, including pay-to-publish rather than pay-to-read.

Option 3 doesn’t require an electronic copy of the research paper itself, rather as a minimum a record in a research database including details of the publication, funding source and other data.

Option 4 may be more attractive to some disciplines than to others, and there are often still costs that need to be met with alternative approaches to publishing.

This is almost certainly a simplistic analysis, but then again this is a weblog opinion piece rather than a peer reviewed publication. Browsing through a selection of the SHERPA repositories (picking out some of the big named institutions) shows a few well populated ePrint repositories but a lot more that are less well or even poorly populated. Maybe it’s still early days. I’m still curious to know who’s driving this if it’s not academics or their institutions.
Peter Burnhill says:

June 12, 2007 at 2:31 pm

Have you yet tried to ‘put it in the Depot’? That way you can test the conclusion you have come to about “shar[ing] your work with others via the [Depot] repository”.

It’s likely that when you attempt to do so you will be given the opportunity to be re-directed to your local IR, but note that you can use the Depot – see the link at the bottom. Try depositing some of your own material, you certainly can deposit; then using the RoMEO database – which as Bill points out does indicate permissions for about 80% of journals listed there – you should be able to make these Open Access and so “share your work with others via the repository”. For those journals/publishers that don’t permit such sharing, you can still expose the metadata, and who knows, someone might request a copy from you.

Here’s hoping to help increase content, as you say you would wish.

As to who are the drivers, well it would seem that you say you are.
David says:

June 12, 2007 at 4:08 pm

Funnily enough I have two publications in preparation, both of which I intend to submit to our institutional repository. And of course I’ll be sharing my experience of how I get on. But will publishing a copy in an ePrint repository, rather than relying on the usual journal route or citation indices, increase my readership? Nobody will want to read anything I publish, surely? Oh wait, that’s why you’re here 😉
Andy Powell says:

June 15, 2007 at 3:51 pm

Hi David,
I gave the keynote you refer to in your open post… in your terms, I guess that I was the one waffling on about Web 2.0 services? Naturally, I beg to differ and to be honest, I’m struggling to see why you gave such negative treatment of my talk, since we basically appear to be in complete agreement.

I only listed two Web 2.0 services – neither of them were any of the ones you mention. I chose services that I thought were directly relevant to the repositories space – Slideshare and Scribd. You end with

“Sorry mate, the world won’t wait for you and by the time you’ve figured it out it’ll have moved on.”

Well, sorry mate, what are we supposed to do, carry on building services that nobody wants to use and that are sitting around empty of content? 🙂

I’m arguing that we need a significant change of approach – I think you are saying the same thing? So I really don’t understand what you found so objectionable in my talk??
David says:

June 15, 2007 at 4:31 pm

Hi Andy. Had I found your talk objectionable then I would have said so, and would have said I disagree with Andy Powell. However that was not my point. In the end I used one of your closing lines as the crux of my piece. Had you had more time, which is what I suspect you ran out of from my recollection of the presentation, you would no doubt have gone on to set out an update to the repositories roadmap that took into account what we might be able to learn from so-called social repositories.

I purposefully didn’t mention you directly or the fact that it was your keynote because I did not intend an ad hominem criticism of you. I intended, but clearly failed judging by the initial reaction in comments here, to make a general point about how I have observed many commentators look to web 2.0 as salvation for less than successful web 1.0 learning technologies. However I don’t believe you can easily look at what makes service A or service B successful, then try to replicate it, at least not as a generalisation. What causes any one service to go beyond the tipping point and achieve what Flickr has achieved or any of the other usual 2.0 suspects has eluded most educational technologists. My post was a response to a disappointing conference (for me) hearing reports from unsuccessful development projects with little or no uptake.

Yes I do agree with you that it is pointless to continue to invest in technologies that nobody wants to use and that are sitting around empty of content. Let’s hope future funding panels agree with us. Ok mate? 🙂

Comments are closed.