EOSC Research Software APIs and Connectors (RSAC)

Overview

The Research Software APIs and Connectors (RSAC) component is intended to ensure the long-term preservation of research software across disciplines. Its APIs and connectors will interconnect research output infrastructures with the Software Heritage universal source code archive, using the CodeMeta standard and Software Heritage intrinsic identifiers (SWHIDs). Development of the sub-components is divided by type of infrastructure: scholarly repositories, publishers, and aggregators.

In 2021, the Scholarly Infrastructures for Research Software (SIRS) report[1] was published. It includes a set of recommendations for bringing software into the EOSC scholarly ecosystem alongside publications and data. The SIRS report was built on a survey and documentation of a representative panel of notable operational infrastructures across Europe, comparing their scopes and approaches. A subset of these research software infrastructures joined forces to turn the SIRS recommendations into reality through the FAIRCORE4EOSC project.

One of the WP6 objectives is to develop tools and services for the archival, reference, description, and citation of research software artifacts. It does so by implementing the key recommendations of the EOSC SIRS report: interconnecting scholarly repositories, publishers, and aggregators with the Software Heritage universal source code archive, using the CodeMeta standard and Software Heritage intrinsic identifiers (SWHIDs).

APIs and connectors between scholarly repositories, publishers, aggregators, and Software Heritage will be integrated into the following operational infrastructures, which are used by research communities:

Scholarly repositories:

  • InvenioRDM - SWH (CERN)
  • Dataverse - SWH (KNAW-DANS)

Publishers:

  • Dagstuhl - SWH (LZI)
  • Episciences - SWH (INRIA)

Aggregators:

  • swMATH - SWH (FIZ)
  • OpenAIRE - SWH (OPENAIRE)

[1] European Commission, Directorate-General for Research and Innovation, Scholarly infrastructures for research software: report from the EOSC Executive Board Working Group (WG) Architecture Task Force (TF) SIRS, Publications Office, 2020, https://data.europa.eu/doi/10.2777/28598

  • Develop APIs and connectors between scholarly repositories and Software Heritage to support archival, reference, description, and citation as follows:
    • deposit research software artifacts uploaded to scholarly repositories in Software Heritage;
    • obtain the corresponding SWHID;
    • expose the SWHID in the artifact record maintained by the scholarly repository;
    • enable scholarly repositories to deposit and/or retrieve (curated) metadata of software artifacts from Software Heritage;
    • export citation information in one or more of the common open citation formats (BibLaTeX, CSL, codemeta.json).
  • Develop APIs and connectors between Open Access publishers and Software Heritage to support archival, reference, description, and citation as follows:
    • automate the archival in Software Heritage of the source code of artifacts associated with research articles;
    • expose the corresponding SWHID on the journal’s publication record;
    • enable the deposit and retrieval in Software Heritage of curated metadata for software associated with publications;
    • deposit and retrieve the preferred citation information for software associated with publications;
    • export citation information in one or more of the common open citation formats (BibLaTeX, CSL, codemeta.json).
  • Develop APIs and connectors between aggregators and Software Heritage to support archival, reference, description, and citation as follows:
    • extract references to software from research articles and trigger archival of source code known to aggregators and missing in Software Heritage;
    • expose the corresponding SWHID in the software artifact record maintained by the aggregator;
    • deposit and retrieve metadata for the software artifacts in Software Heritage;
    • deposit and/or retrieve preferred citation information for the software artifacts in Software Heritage;
    • export citation information in one or more of the common open citation formats (BibLaTeX, CSL, codemeta.json).
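
The repository-side steps listed above (obtain a SWHID, expose it in the record, export metadata as codemeta.json) can be illustrated with a minimal sketch. It is not the RSAC implementation: the record layout and the helper names `content_swhid` and `to_codemeta` are hypothetical, and in a real connector the SWHID would normally be returned by Software Heritage after a deposit rather than computed locally. The sketch relies only on the fact that a content SWHID (`swh:1:cnt:...`) uses the same SHA-1 as a Git blob.

```python
import hashlib
import json


def content_swhid(data: bytes) -> str:
    """Return the SWHID of a single file's content.

    A content SWHID reuses the Git blob hash: SHA-1 over the header
    "blob <length>\\0" followed by the raw bytes.
    """
    header = b"blob %d\0" % len(data)
    return "swh:1:cnt:" + hashlib.sha1(header + data).hexdigest()


def to_codemeta(record: dict, swhid: str) -> dict:
    """Map a (hypothetical) repository record to a small CodeMeta document."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": record["title"],
        "version": record["version"],
        "author": [{"@type": "Person", "name": n} for n in record["authors"]],
        "identifier": swhid,
    }


# Hypothetical record as a scholarly repository might hold it.
record = {"title": "example-tool", "version": "1.0.0", "authors": ["A. Researcher"]}

# 1. Obtain the SWHID (here: computed locally for an empty file).
swhid = content_swhid(b"")
# 2. Expose the SWHID in the artifact record.
record["swhid"] = swhid
# 3. Export the description as codemeta.json text.
codemeta_json = json.dumps(to_codemeta(record, swhid), indent=2)
print(codemeta_json)
```

Software artifacts are usually whole directories or releases rather than single files; their SWHIDs (`swh:1:dir:...`, `swh:1:rel:...`) are computed over the full tree, which is one reason a connector would take the identifier from the archive instead of recomputing it.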

The RSAC (Research Software APIs and Connectors) component will improve interoperability between the various infrastructures that serve research software. It contributes to each of the four pillars of the SIRS report: Archive, Reference, Describe, and Cite.

  • Archive: Create interoperability between infrastructures so that research software artifacts and their metadata are preserved in the universal source code archive, Software Heritage.
  • Reference: Adopt the SWHID (Software Heritage Identifier) to identify software artifacts accurately and to reference specific versions of the software.
  • Describe: Exchange metadata about research software using the CodeMeta vocabulary, thereby enabling an interoperable ecosystem.
  • Cite: Align the citation export formats for research software with the specialised software entry types defined by biblatex-software.
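
To make the Reference and Cite pillars concrete, the following sketch validates a SWHID and renders a `@software` entry in the style of biblatex-software. The function names, the metadata dictionary, and the example origin URL are assumptions made for illustration; the regular expression covers only the core identifier and simple `key=value` qualifiers, not every rule of the SWHID specification.

```python
import re

# Core SWHID: scheme, version 1, object type, 40 hex digits, optional qualifiers.
SWHID_RE = re.compile(
    r"^swh:1:(cnt|dir|rev|rel|snp):([0-9a-f]{40})((?:;[a-z]+=[^;]+)*)$"
)


def parse_swhid(swhid: str) -> dict:
    """Split a SWHID into object type, hash, and qualifiers; raise if malformed."""
    match = SWHID_RE.match(swhid)
    if match is None:
        raise ValueError("not a valid SWHID: %r" % swhid)
    object_type, object_id, raw_qualifiers = match.groups()
    qualifiers = dict(q.split("=", 1) for q in raw_qualifiers.split(";") if q)
    return {"type": object_type, "id": object_id, "qualifiers": qualifiers}


def biblatex_software(key: str, meta: dict, swhid: str) -> str:
    """Render a @software entry (biblatex-software style) carrying the SWHID."""
    parse_swhid(swhid)  # refuse to cite a malformed identifier
    fields = {
        "author": " and ".join(meta["authors"]),
        "title": meta["title"],
        "year": str(meta["year"]),
        "version": meta["version"],
        "swhid": swhid,
    }
    body = ",\n".join("  %s = {%s}" % (name, value) for name, value in fields.items())
    return "@software{%s,\n%s\n}" % (key, body)


# Illustrative values only; the origin URL is a placeholder.
example = (
    "swh:1:cnt:e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
    ";origin=https://example.org/repo.git"
)
meta = {"authors": ["A. Researcher"], "title": "example-tool", "year": 2023, "version": "1.0.0"}
parsed = parse_swhid(example)
entry = biblatex_software("example-tool", meta, example)
print(entry)
```

Keeping validation separate from rendering means the same `parse_swhid` check could be reused by other export formats, such as CSL or codemeta.json.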

We believe that the development of the RSAC component is a significant step towards giving research software appropriate, tailored support in the scholarly ecosystem.