Link your data
Linking cultural heritage data and publishing it openly allows for diverse types of re-use and research. Linked open data means interconnecting data and datasets in a way that is readable not only by humans but also by computers. If metadata and data are published openly, others can connect and enrich them. One of the major advantages of linked open data is that it is searchable and can be queried.
In this session, we first got an introduction to linked open data (also called structured open data) from David Haskiya (formerly Swedish National Heritage Board). Alicia Fagerving (Wikimedia Sweden) then presented Wikidata, a free and open knowledge base that serves as central storage for linked data for the Wikimedia sister projects.
David Haskiya (slides/presentation in Swedish) started his introduction with a visualisation to explain what linked data is all about. He used the example of August Strindberg, a Swedish author, and his relations to places, other people and topics. Linked data comes in triples: a subject, an object, and a predicate describing the relation between the two. Following the example, a triple could be August Strindberg (subject) studied at (predicate) Uppsala University (object). Each part of the triple is unambiguously identified by a uniform resource identifier (URI), which links, for instance, the concept of “August Strindberg” to a clarifying authority file.
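To make the triple structure concrete, here is a minimal Python sketch of the Strindberg example. It is not taken from the presentation: the `example.org` URIs, the `Triple` type and both helper functions are assumptions made up for illustration, and the second triple (born in Stockholm) is added only to show a second relation. The `to_ntriples` helper writes a triple in the line-based N-Triples notation.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject --predicate--> object, each identified by a URI."""
    subject: str
    predicate: str
    obj: str


# Hypothetical URIs for illustration only; real data would use URIs
# from an authority file or a knowledge base.
STRINDBERG = "http://example.org/person/august-strindberg"
STUDIED_AT = "http://example.org/relation/studied-at"
BORN_IN = "http://example.org/relation/born-in"
UPPSALA_UNIVERSITY = "http://example.org/organisation/uppsala-university"
STOCKHOLM = "http://example.org/place/stockholm"

TRIPLES = [
    Triple(STRINDBERG, STUDIED_AT, UPPSALA_UNIVERSITY),
    Triple(STRINDBERG, BORN_IN, STOCKHOLM),
]


def objects_of(triples, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


def to_ntriples(triple):
    """Serialise one triple as a single N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."
```

Because every part of the statement is a URI rather than free text, a second dataset that uses the same URI for Strindberg can be merged with this one simply by combining the two lists of triples.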
David also explained the four principles for linked data and how linked data becomes linked open data. That is the case when the author or owner of the data releases it under an open licence, such as CC0 for metadata. An open licence is a precondition for allowing others to create links, edit content and re-use the information. David also presented examples of linked open data and how it is used in cultural heritage contexts. Europeana, for example, uses linked open data for its search suggestions, for collecting data from and about specific artists, authors or others on dedicated pages, and for making the platform multilingual. Other examples were the Open Library, SOCH and Kringla, as well as generous user interfaces and the possibilities for data visualisation. Some of these, such as SOCH, also allow users to create their own links and collaborate on linked data:
“We have a certain amount of knowledge about the content of our collections. Still, it’s often people outside of our institutions who know even more about a specific object or person. It can be helpful to open up and give them the opportunity to enrich your data.” David Haskiya
In the last part of his presentation, David turned to the first steps cultural heritage institutions can take if they want to dive deeper into linked open data. The most important step is to make sure that working with linked open data is actually the right means to connect to your audience and fulfil your own mission. If so, institutions should take a look around, talk to colleagues and learn from the experiences of others. David recommended taking part in existing collaborations and platforms to learn more about linked open data in practice. As data is at the heart of this activity, investing in data quality and cataloguing should become a priority; working with vocabularies and authorities is a good way to start (David provided a list here; data quality was also the focus of our session “Let’s talk data”).
Wikidata as a linked open data opportunity for GLAM institutions
From this general perspective, we turned to a specific example of linked open data: Alicia Fagerving (presentation in Swedish) presented Wikidata, a free and open platform for structured data. Created in 2012, Wikidata is a sister project of Wikipedia.
Alicia explained how to get involved with Wikidata: anyone can edit its more than 70 million items. The data is available under CC0 and hence re-usable in all kinds of contexts:
“You can re-use data from Wikidata without attribution. This is hugely relevant in practice: if you combine data from a range of different sources, it is not easy to do so when you have to state the licence for every source and every data object.” Alicia Fagerving
As a central database for the information displayed on Wikipedia, Wikidata also makes it possible to update information that does not depend on language on different Wikipedia pages at the same time. As an example, Alicia presented the infoboxes on Wikipedia, which contain information such as birth dates and population figures.
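The idea of one central store feeding several language editions can be sketched in a few lines of Python. This is a simplified illustration, not how Wikipedia or Wikidata are actually implemented: the item identifier, the population value, the label table and the `render_infobox` function are all hypothetical.

```python
# Central store: values that do not depend on language, kept in one place.
# "hypothetical-town" and its population are made-up illustration data.
central_store = {
    "hypothetical-town": {"population": 12000},
}

# Each language edition only supplies its own field labels.
FIELD_LABELS = {
    "en": {"population": "Population"},
    "sv": {"population": "Folkmängd"},
}


def render_infobox(item_id, language):
    """Build infobox lines for one language from the shared central values."""
    labels = FIELD_LABELS[language]
    values = central_store[item_id]
    return [f"{labels[field]}: {value}" for field, value in values.items()]


def update_value(item_id, field, value):
    """Change a value once; every language edition reads the new value."""
    central_store[item_id][field] = value
```

After a single call such as `update_value("hypothetical-town", "population", 12500)`, both `render_infobox("hypothetical-town", "en")` and `render_infobox("hypothetical-town", "sv")` show the new figure, because neither edition keeps its own copy of the number.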
On Wikidata, every item has its own unique Q number. Alicia showed how triples are used to assemble facts about an item. To show that an item is important enough for Wikidata and can be identified, it is linked to external sources, for example authority files such as Libris or the inventory numbers of cultural heritage institutions. With the help of the semantic query language SPARQL, users can ask questions about all the data in Wikidata, and well-modelled linked data allows them to find exactly the answers they need.
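As a rough sketch of what such a question can look like, the Python snippet below builds a SPARQL query asking which items have an "educated at" statement (Wikidata property P69) pointing to a given institution, and turns it into a request URL for the public Wikidata Query Service. The function names, the Q number check and the default limit are assumptions made for this example; the institution's Q number has to be looked up on its Wikidata page, and the code only prepares the request without sending it.

```python
import re
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# A Q number is the letter Q followed by a positive integer.
_QID_PATTERN = re.compile(r"Q[1-9]\d*")


def build_educated_at_query(institution_qid, limit=10):
    """Return a SPARQL query for items educated at the given institution."""
    if not _QID_PATTERN.fullmatch(institution_qid):
        raise ValueError(f"not a valid Q number: {institution_qid!r}")
    # Triple pattern: ?person (subject) educated at (P69) institution (object).
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P69 wd:{institution_qid} .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
LIMIT {int(limit)}
""".strip()


def build_request_url(query):
    """Encode the query as a GET request URL that asks for JSON results."""
    params = urlencode({"query": query, "format": "json"})
    return f"{WDQS_ENDPOINT}?{params}"
```

The query text returned by `build_educated_at_query` can also be pasted directly into the Wikidata Query Service web interface, which is a convenient way to try it out before automating anything.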
As Alicia explained, Wikidata is a resource for Wikipedia, but it can be used in other contexts, too, as it offers open information about all kinds of topics. Cultural heritage institutions can use Wikidata to spread knowledge about historical people and objects in their area of expertise. A good start could be to add details about an artist or an artwork in your collection, or to update the information about your institution in Wikidata (this will also influence what search engines such as Google display in their search results about your institution). Alicia also highlighted the fact that a large share of the data in Wikidata actually comes from imports from other databases: cultural heritage institutions can work with Wikimedia and do not have to upload their data to Wikidata manually.
- Linked open data is information stored in triples, with a subject and object linked by a predicate describing their specific relationship, released under an open licence. External authority files allow unique identification of items.
- It enables users to query the data and ask specific questions about it.
- Cultural heritage institutions can get involved by joining existing initiatives and organisations, such as Europeana, Wikidata or others, sharing their data openly or contributing references and knowledge from their collections to Wikidata.
- Improving data quality and enriching metadata are especially important activities for institutions to begin with, as they help not only in linked open data projects but in almost all digital endeavours.
Association of Research Libraries (2019): White Paper on Wikidata. Opportunities and Recommendations.
Europeana: Linked Open Data.
Manu Sporny: What is Linked Data? on YouTube.
Mariana Ziku (2020): Digital Cultural Heritage and Linked Data: Semantically-informed conceptualisations and practices with a focus on intangible cultural heritage. LIBER Quarterly, 30(1), pp.1–16. DOI: http://doi.org/10.18352/lq.10315
Urban Complexity Lab, University of Applied Sciences Potsdam: Examples of generous user interfaces and visualisation.
Wikidata: Introduction, including tours for new users.
Wikidata – SPARQL:
- the Wikidata Query Service
- a SPARQL tutorial, provided by Wikidata
- Examples of SPARQL queries
Wikipedia: Linked data.