Skip to contents

Introduction

This vignette provides an overview of the R package europeanaR. The package can be used to query Europeana’s Search and Records API endpoints.

This version of the Europeana API gives users access to massive amounts of cultural heritage data. See the Europeana Search API documentation for more information on the Europeana API, including how to apply for access to the Europeana endpoints.

The Europeana search API

The Europeana search API permits the user to access large volumes of data, and provides some control on the richness of metadata returned.

“The Search API provides a way to search for metadata records and media on the Europeana repository, for instance give me all results for the word”Vermeer“. Besides the ability to directly search on Europeana, this API also provides an auxiliary method for translating queries and support for the OpenSearch.RSS protocol.”.

The new Europeana API refers to the version 2 API and is fully backwards compatible with the version 1 API. The v2 API is available with the request signature shown below.

api.europeana.eu/record/v2/search.json

In summary the Europeana Search API allows authorized user:

  1. Interact with the full data model of millions of cultural heritage items
  2. Refine your search with more advanced queries like Boolean Searches
  3. Filter out parts of the results advanced filtering
  4. Choose to only return objects which have certain copyright statements
  5. Return the results in a language of your choice
  6. Use auxiliary method for translating queries and support for the OpenSearch.RSS protocol

The Europeana Data Model (EDM) design principles are based on the core principles and best practices of the Semantic Web and Linked Data efforts to which Europeana contributes. The user can interact with the EDM via a variety of API endpoints. Europeana is not so much a portal defined by sheer volume as it is by its ability to make rich data and functionality available via API.

The search API, which is similar to the search on the website, is the most intuitive and simple way to interact with EDM. The search API returns results in JSON format and has a maximum item limit of 100. It does, however, support a cursored search that allows for bulk metadata downloads.

Setting up the API access

In order to get access to the Europeana API endpoints you must register in order to receive your free key. You should set it as environmental variable named as EUROPEANA_KEY in your R session. In the package there is simple helper function.

set_key(api_key = "ENTER_YOUR_PRIVATE_KEY")

Alternatively you can set the environmental variable EUROPEANA_KEY equal to your private key by typing usethis::edit_r_environ() and then editing the .Renviron file.

Because Europeana uses the Apache Solr platform to index its data, the Search API supports the Apache Lucene Query Syntax by default. To get the most out of the Europeana repository, advanced users should follow the Lucene and Apache SOLR guides. For others, basic querying guide can be found in the Europeana portal, and in the documentation of this package.

first_five_results <- query_search_api(query = "Mona Lisa", rows = 5)
item_metadata <- first_five_results$content

In the query parameter the user have access to rich search functionalities. Some examples are,

  • Simple search: query="Mona Lisa"
  • Boolean Search: query="mona AND lisa"
  • Range Search" query="[a TO Z]"
  • Timestamp Search: query="timestamp_created:"2013-03-16T20:26:27.168Z"

So far, we’ve only looked at cases where there was a single query term. It is sometimes useful to divide a query into a variable and a constant part. For example, for an application that only accesses objects in London, the constant part of the query can pre-select London-based objects and the variable part can select objects within this pre-selection.

resp <- query_search_api(query = "Westminster", qf = "where:London")

Query Translation

The Search API provides an auxiliary method for translating search terms into various languages so that they can be used in conjunction with the main Search API method. The Wikipedia API is used to perform the actual translation in this method. The method’s signature is as follows:

“europeana.eu/api/v2/translateQuery.json?wskey=YOUR_KEY&term=TERM&languageCodes=LANGUAGE_CODES”

*Example:" Get translations for “Notre Dame” in Dutch, English and Hungarian

resp <- query_search_api(languageCodes = "nl,en,hu",
                         term = "notre dame",
                         path = "/api/v2/translateQuery.json")
translations <- resp$content$translations

Notice: The APIs are served under a unified and dedicated address “https://api.europeana.eu/”. The path parameter can be used to access the required services.

Bulk Download

To get bulk downloads of specific subsets of items, including both their metadata and their associated media europeanaR provides the two functions tidy_cursored_search() and download_media() respectively.

Example: Download up to 500 items and their associated media using the keyword “animal”, created in the 17th century, and tagged as paintings.

res_bulk <- tidy_cursored_search(query = "animal",
                                 max_items = 500,
                                 qf = "when:17 AND what:painting")
path_to_download_folder <- download_media(res_bulk)

The ellipsis ... in the tidy_cursored_search() arguments passed given parameters to the underlying GET requests.

Records API

The Record API gives users direct access to Europeana data, which is modeled using EDM. While EDM is an open flexible data model with many different types of resources and relationships, the Record API allows for the retrieval of a segment of EDM for practical purposes (a subgraph, to use strict terminology). These “atomic” EDM segments typically contain one Cultural Heritage Object (CHO), the aggregation information that connects the metadata and digital representations, and a number of contextual resources pertaining to the CHO, such as agents, places, concepts, and time.

The id of the object is required by the Records API.

Example: Return three CHO for the query “animal” and augment their associated metadata from the Records API.

res <- query_search_api("animal", rows = 3)
#tidy data
dat <- tidy_search_items(res)
#get records for each item
cho_record_meta <- lapply(dat$id, query_record_api)

There is also an online console interface here: https://pro.europeana.eu/page/api-rest-console.