As part of the initial research for the Keepers Extra project we have been speaking to archiving agencies about their use of the Keepers Registry and about the global digital preservation landscape more generally. Several common themes have emerged from these discussions. We were delighted to hear that the Keepers Registry is highly regarded among the Keeper agencies and potential Keepers, and viewed as an important service and a way to increase the visibility of work in the field of digital preservation. There is wide recognition that the Keepers Registry occupies a unique position in having established productive working relationships with many major archiving agencies, and that this is a positive position from which to facilitate communication and collaboration.
There was also broad consensus on the need for more discussion between the Keepers, particularly around the standardisation of data and the challenge of the long tail. Many of the Keeper agencies wish to use the Keepers Registry to analyse gaps and overlaps in what is being preserved. For some, this would be a way of analysing their own collections with a view to working at the title level to complete runs of particular journals. For others, it would offer a way to identify material ‘at risk of loss’ and therefore to prioritise publishers or titles for preservation. In both cases, doing such analyses quickly and efficiently depends on access to easily comparable data, so better standardisation of data would be very helpful. It would also make data easier to share, and would shape the ways in which an API could be used to integrate Keepers Registry information into other systems and processes.
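To illustrate why comparable data matters here, the sketch below shows how gap and overlap analysis becomes trivial once agencies' holdings are expressed in a shared, standardised form. This is purely hypothetical: the agency names, ISSNs, and the idea of keying holdings by ISSN with explicit volume sets are invented for the example, not a description of the Keepers Registry's actual data model.

```python
# Hypothetical sketch: if each Keeper agency's holdings were standardised
# as ISSN -> set of archived volume numbers, gap/overlap analysis reduces
# to simple set operations. All data below are invented examples.

agency_a = {
    "1234-5678": {1, 2, 3},
    "2345-6789": {1, 2},
}
agency_b = {
    "1234-5678": {5, 6, 7},
    "3456-7890": {1},
}

def overlaps_and_gaps(a, b):
    """Return titles held by both agencies, and titles held by only one."""
    both = set(a) & set(b)
    only_a = set(a) - set(b)
    only_b = set(b) - set(a)
    return both, only_a, only_b

both, only_a, only_b = overlaps_and_gaps(agency_a, agency_b)
print("preserved by both:", sorted(both))
print("single Keeper only:", sorted(only_a | only_b))

# At the title level, the same idea exposes missing volumes in a run,
# supporting the "complete runs of particular journals" use case:
for issn in both:
    combined = agency_a[issn] | agency_b[issn]
    missing = set(range(min(combined), max(combined) + 1)) - combined
    if missing:
        print(issn, "missing volumes:", sorted(missing))
```

In practice the hard part is not the comparison itself but getting every agency's metadata into a common shape, which is exactly why standardisation was a recurring theme in these discussions.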
A further common theme was the challenge of preserving the ‘long tail’ of e-journals produced by small publishers and bodies such as academic societies or university departments. The key issues here are funding, scalability (or the lack of it), and the division of labour. Reaching out to small publishers is expensive in both human and financial terms: for every publisher an agency works with, there are negotiations around a contract and costs in setting up technology and establishing protocols. If that publisher produces 300 journals, there is an economy of scale that justifies the cost; if it produces only one journal, the process becomes very expensive indeed. In such a context, having multiple agencies spend those resources on the same material seems illogical, yet there is no established way for agencies to cooperate to ensure the broadest possible coverage. There seem, then, to be two ways of approaching this challenge: on the one hand finding ways to scale up the work and, on the other, finding ways to meet or lower the costs.