Let’s talk data
Data is at the heart of all activities related to open GLAM – and that is why its quality is crucial. We think that this is a good reason to start our webinar series “Open GLAM now!” with this topic! In this summary of our session “Let’s talk data”, you will learn from our three speakers, Dr. Karin Glasemann, Bengt Nordqvist and Marcus Smith, about different aspects of and perspectives on data quality. If you want to start with an introduction to the topic, have a look here.
Why does “Open GLAM now!” start with this topic? As Karin Glasemann puts it in her presentation, institutions should start “thinking of data and not of applications”. Open GLAM is about making the digital and physical collections of cultural heritage institutions more accessible – and about becoming more open institutions at large. Data and its quality are at the heart of this process, as we learned from our speakers’ presentations during this session: they are the foundation that makes other activities and tools possible.
Recorded session with all presentations, available (only in Swedish) on YouTube (CC BY).
Marcus Smith from the Swedish National Heritage Board started with an introduction to what data, metadata, and paradata actually are. As he works with SOCH, the national aggregator to Europeana and a database for cultural heritage data in Sweden, he could share insights into what data quality is and which parameters are relevant for assessing it.
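To make the distinction between the three terms concrete, they can be pictured as layers of a single object record. The following is a minimal sketch assuming a hypothetical record structure – none of these field names come from SOCH or any real schema:

```python
# Three layers of a hypothetical museum object record; all field
# names are illustrative, not taken from SOCH or any real schema.
record = {
    # data: the digital representation of the object itself
    "data": {
        "image_url": "https://example.org/images/object-001.jpg",
    },
    # metadata: descriptive information about the object
    "metadata": {
        "title": "Portrait of a Lady",
        "creator": "Unknown",
        "date": "ca. 1870",
        "license": "CC0",
    },
    # paradata: information about how the record itself came to be
    "paradata": {
        "digitized_by": "museum photo studio",
        "digitized_on": "2018-03-14",
    },
}

# List the fields available in each layer.
for layer, fields in record.items():
    print(f"{layer}: {', '.join(fields)}")
```

Keeping the three layers apart in this way is what lets an aggregator judge, say, metadata completeness separately from image quality or documentation history.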
“We have to talk about data quality according to the context. There is no neutral way to measure it for all kinds of data sets. When we want to clarify what we mean by data quality, we have to measure it in relation to the aim and purpose that we want to use our data for.” Marcus Smith
Nevertheless, there are some characteristics that indicate the quality level of your data, such as accuracy, resolution, licensing, structure, usability, machine-readable formats and links to other data, authorities and controlled vocabularies. As one example of measuring quality, Marcus describes the Europeana Publishing Framework, which defines parameters for assessing data quality in the context of sharing data with Europeana. It demonstrates that the better your data quality is (according to this framework), the more users and organizations can do with it – which helps institutions reach their goals in publishing data in Europeana. He also presents a tool developed at the Swedish National Heritage Board that helps institutions that share their data with SOCH to increase their data quality by improving their licensing.
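Several of these characteristics – licensing, structure, links to controlled vocabularies – lend themselves to simple automated checks. The sketch below illustrates the idea with made-up rules and field names; it is not the Europeana Publishing Framework’s actual tier logic, nor the Swedish National Heritage Board’s tool:

```python
# Illustrative rule-based quality checks over one metadata record.
# The field names, licence list, and rules are assumptions made up
# for this example, not any real framework's criteria.

OPEN_LICENSES = {"CC0", "CC BY", "CC BY-SA", "PDM"}

def quality_checks(record):
    """Return a dict mapping each quality parameter to pass/fail."""
    return {
        # licensing: is there a recognized open licence statement?
        "open_license": record.get("license") in OPEN_LICENSES,
        # structure: are the core descriptive fields filled in?
        "core_fields": all(record.get(f) for f in ("title", "creator", "date")),
        # links: does the record point into a controlled vocabulary?
        "authority_links": any(
            v.startswith("http") for v in record.get("subjects", [])
        ),
    }

record = {
    "title": "Portrait of a Lady",
    "creator": "Unknown",
    "date": "ca. 1870",
    "license": "CC0",
    "subjects": ["http://example.org/vocab/portraits"],
}

checks = quality_checks(record)
print(checks)  # all three checks pass for this record
```

The point of such checks is exactly what Marcus stresses: each rule only makes sense relative to a purpose – an aggregator cares about open licences and authority links, while an internal inventory might not.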
From what data quality is and how it is assessed, we turn to a perspective at the institutional level. Dr. Karin Glasemann shared her insights into the development of the National Museum of Fine Arts (Nationalmuseum) in Stockholm. As Digital Coordinator, she managed the process of digitizing and documenting the museum’s artworks during the museum’s move and renovation. She describes the difficulties and challenges of digitizing large quantities of objects in a short period. A crucial step in increasing the quality of their data and metadata was to develop internal routines and processes to collect all information about the collection at one access point. Another valuable lesson was that your data will never be perfect – every institution has to start digitizing at some point with the resources at hand and define its “good enough” level of data quality in order to get the data out into the world.
What Nationalmuseum experienced after releasing their data under open licenses, both via their own website and on Wikimedia Commons and Europeana, was that their collections are now used in all possible contexts, from articles and apps to social media and memes. By making their data accessible, digitized collections can spread and find users outside the institution:
“What we learned during this process was what it actually means to be accessible in a digital world. I think it’s most important that the digitized collections that museums manage are at the heart of all activities. […] But at the same time […] it’s not museums that are the main users of these resources, but it’s the world outside. You cannot obtain the biggest effects without daring to release your material.” Karin Glasemann
In the last presentation of this session, Bengt Nordqvist from Jamtli talked about the museum’s activities in digitizing their collections and catalogs and how those have evolved. He presented how they work with the registers and catalogs they have. Jamtli has its own digital archive on its website: it enables people to enrich the museum’s knowledge (and hence metadata) about its own collections, as users can leave comments with information, for example on people or places visible in photographs. He also shared insights into the challenges of working with the institution’s digital collections and of digitizing the physical ones.
Take-away lessons of this session
- There are many different parameters for measuring data quality – focus on those that help your institution reach the goals you want to achieve with your data.
- Defining internal processes and routines can help your whole institution understand why making digital collection data and metadata openly available is important for reaching your goals.
- Your data will never be perfect. Find your “good enough” level to publish it, so others and your institution can use it.
- As soon as your data is published, you can profit from your audience supporting you in increasing your data quality, for example by enriching your metadata.
In the next session, we will explore how to work together with other institutions and networks, such as SOCH, Europeana and Wikimedia – and how this can support you in reaching your institution’s goals.
Marcus’ recommendations for further reading
- Publishing Guide: https://pro.europeana.eu/post/publication-policy
- Publishing Framework: https://pro.europeana.eu/post/publishing-framework
- Archaeology Data Service
- Guides to Good Practice: http://guides.archaeologydataservice.ac.uk/