Let’s talk data
Data is at the heart of all activities related to open GLAM – and that is why its quality is crucial. A good reason to start our webinar series “Open GLAM now!” with this topic! You are going to learn from our three speakers Dr. Karin Glasemann (Nationalmuseum), Bengt Nordqvist (Jamtli) and Marcus Smith (Swedish National Heritage Board) about different aspects of and perspectives on data quality.
Why does the series start with this topic? As Karin Glasemann put it in her presentation, institutions should start “thinking of data and not of applications”. Open GLAM is about making the digital and physical collections of cultural heritage institutions more accessible – and about becoming more open institutions at large. Data and its quality are at the heart of this process, as we learned in the presentations during this session: they are the basis that makes other activities and tools possible.
Recorded session with all presentations, available (only in Swedish) on YouTube (CC BY).
Perspectives on data and its quality
Marcus Smith started with an introduction about what data, metadata, and paradata actually are (presentation in Swedish). As he works with SOCH, the national aggregator to Europeana and a database for cultural heritage data in Sweden, he could share insights into what data quality is and what parameters are relevant for assessing it.
“We have to talk about data quality according to the context. There is no neutral way to measure it for all kinds of data sets. When we want to clarify what we mean by data quality, we have to measure it in relation to the aim and purpose that we want to use our data for.” Marcus Smith
Nevertheless, there are some characteristics that indicate your data quality level, such as accuracy, resolution, licensing, structure, usability, machine-readable formats and links to other data, authorities and controlled vocabularies. Marcus described one example of measuring quality, the Europeana Publishing Framework, which sets out different parameters for assessing data quality in the context of sharing data with Europeana. It demonstrates that the better your data quality is (according to this framework), the more users and organisations can do with it – which helps institutions reach their goals in publishing data in Europeana. He also presented a tool developed at the Swedish National Heritage Board that helps institutions sharing their data with SOCH improve their data quality by working on their licensing.
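The kind of parameter-based assessment Marcus describes can be illustrated with a small sketch. The checks and field names below are purely illustrative assumptions, not the SOCH or Europeana Publishing Framework implementation:

```python
# Illustrative quality checks on a metadata record, loosely inspired by
# parameters such as licensing, persistent identifiers, links to
# controlled vocabularies and basic documentation. All field names and
# thresholds here are hypothetical.

OPEN_LICENCES = {
    "https://creativecommons.org/publicdomain/zero/1.0/",
    "https://creativecommons.org/licenses/by/4.0/",
}

def quality_score(record: dict) -> int:
    """Count how many of the illustrative quality criteria a record meets."""
    checks = [
        record.get("licence") in OPEN_LICENCES,        # open, machine-readable licence
        bool(record.get("identifier")),                # persistent identifier present
        any(s.startswith("http")                       # subject terms linked to a vocabulary
            for s in record.get("subjects", [])),
        bool(record.get("description")),               # basic documentation present
    ]
    return sum(checks)

record = {
    "identifier": "https://example.org/object/123",
    "licence": "https://creativecommons.org/publicdomain/zero/1.0/",
    "subjects": ["http://vocab.example.org/term/portrait"],
    "description": "Oil on canvas, 18th century.",
}
print(quality_score(record))  # → 4
```

Real frameworks define their tiers in far more detail, but the principle is the same: quality is measured against explicit, purpose-driven criteria rather than a single universal scale.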
How to increase data quality with documentation and optimised internal processes
From what data quality is and how it is assessed, we now turn to the institutional level. Dr. Karin Glasemann shared her insights into the development of the National Museum of Fine Arts (Nationalmuseum) in Stockholm (presentation in Swedish). As Digital Coordinator, she managed the process of digitising and documenting the museum’s artworks during the museum’s move and renovation. She described the difficulties and challenges of digitising large quantities of objects in a short period. A crucial step in increasing their data and metadata quality was developing internal routines and processes to collect all information about the collection at a single access point. Another valuable lesson was that your data will never be perfect – every institution has to start digitising at some point with the resources at hand and define its “good enough” level of data quality in order to get the data out into the world.
After releasing its data under open licences, both via its own website and on Wikimedia Commons and Europeana, the Nationalmuseum found that its collections are now used in all possible contexts, from articles and apps to social media and memes. By making their data accessible, digitised collections can spread and find users outside of the institution:
“What we learned during this process was what it actually means to be accessible in a digital world. I think it’s most important that the digitised collections that museums manage are at the heart of all activities. […] But at the same time […] it’s not museums that are the main users of these resources, but the world outside. You cannot obtain the biggest effects without daring to release your material.” Karin Glasemann
In the last presentation of this session, Bengt Nordqvist talked about the museum’s activities in digitising its collections and catalogues and how those have evolved. He presented how they work with the registers and catalogues they have. Jamtli has its own digital archive on its website: it enables people to enrich the museum’s knowledge (and hence metadata) about its own collections, as users can leave comments with information, for example on people or places visible in photographs. He also shared insights into the challenges of working with the institution’s digital collections and digitising the physical collections.
Take-away lessons of this session
- There are many different parameters to measure data quality – you have to focus on those that help your institution to reach the goals you want to obtain with your data.
- Defining internal processes and routines can help your whole institution understand why making digital collection data and metadata openly available is important for reaching your goals.
- Your data will never be perfect. Find your “good enough” level to publish it, so others and your institution can use it.
- As soon as your data is published, you can benefit from your audience helping you improve your data quality, for example by enriching your metadata.
Further reading
- Albin Larsson (2019): An Actionable Approach to Data Quality for Cultural Heritage Institutions.
- Archaeology Data Service: Guides to Good Practice.
- Measuring Metadata Quality and the Europeana Use Case. 4th Linked Data Quality Workshop.