Commit 0954852

Merge branch 'master' into add-spdx3
2 parents 8efe196 + 452c526 commit 0954852

17 files changed

Lines changed: 89 additions & 60 deletions

codemeta.json

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,7 @@
22
"@context": ["http://purl.org/codemeta/2.0", "http://schema.org"],
33
"@type": "SoftwareSourceCode",
44
"identifier": "codemetar",
5-
"description": "The 'Codemeta' Project defines a 'JSON-LD' format for describing\n software metadata, as detailed at <https://codemeta.github.io>. This package\n provides utilities to generate, parse, and modify 'codemeta.jsonld' files \n automatically for R packages, as well as tools and examples for working with\n 'codemeta' 'JSON-LD' more generally.",
5+
"description": "The 'CodeMeta' Project defines a 'JSON-LD' format for describing\n software metadata, as detailed at <https://codemeta.github.io>. This package\n provides utilities to generate, parse, and modify 'codemeta.jsonld' files \n automatically for R packages, as well as tools and examples for working with\n 'codemeta' 'JSON-LD' more generally.",
66
"name": "codemetar: Generate CodeMeta Metadata for R Packages",
77
"issueTracker": "https://github.com/ropensci/codemetar/issues",
88
"license": "https://spdx.org/licenses/MIT",

content/_index.md

Lines changed: 1 addition & 4 deletions
@@ -2,22 +2,19 @@
22
title: "The CodeMeta Project"
33
---
44

5-
65
## Motivation
76

87
Research relies heavily on scientific software, and a large and growing fraction of researchers are engaged in developing software as part of their own research ([Hannay et al 2009](https://doi.org/10.1109/SECSE.2009.5069155 "How do scientists develop and use scientific software?")). Despite this, _infrastructure to support the preservation, discovery, reuse, and attribution of software_ lags substantially behind that of other research products such as journal articles and research data. This lag is driven not so much by a lack of technology as it is by a lack of unity: existing mechanisms to archive, document, index, share, discover, and cite software contributions are heterogeneous among both disciplines and archives and rarely meet best practices ([Howison 2015](https://doi.org/10.1002/asi.23538 "Software in the scientific literature: Problems with seeing, finding, and using software mentioned in the biology literature")). Fortunately, a rapidly growing movement to improve preservation, discovery, reuse and attribution of academic software is now underway: a recent [NIH report](http://softwarediscoveryindex.org), conferences and working groups of [FORCE11](https://www.force11.org/), [WSSSPE](http://wssspe.researchcomputing.org.uk/) & [Software Sustainability Institute](http://www.software.ac.uk/), and the rising adoption of repositories like [GitHub](https://github.com), [Zenodo](https://zenodo.org), [figshare](https://figshare.com) & [DataONE](https://www.dataone.org) by academic software developers. Now is the time to improve how these resources can talk to each other.
98

10-
119
## What can software metadata do for you?
1210

1311
What metadata you want from software is determined by your use case. If your primary concerns are credit for academic software, then you're most interested in _citation_ metadata. If you're trying to replicate some analysis, you worry more about versions and dependencies than about authors and titles. And if you seek to discover software you don't already know about that is suitable for a particular task, well then you are interested more in keywords and descriptions. Frequently, developers of scientific software, repositories that host that software, and users themselves are interested in more than one of these objectives, and others besides.
1412

1513
Different software repositories, software languages and scientific domains denote this information in different ways, which makes it difficult or impossible for tools to work across these different sources without losing valuable information along the way. For instance, a fantastic collaboration between GitHub and figshare provides researchers a way to import software on the former into the persistent archive of the latter, getting a permanent identifier, a DOI in the process. To assign a DOI, figshare must then pass metadata about the object to DataCite, the central DOI provider for all repositories. While this makes DataCite a powerful aggregator, the lack of a crosswalk table means that much valuable metadata is currently lost along the way, such as the original software license, platform, and so forth. Any tool or approach working across software repositories faces similar challenges without a crosswalk table to translate between these.
1614

17-
1815
For more detail, [visit the project on GitHub](https://github.com/codemeta/codemeta) or check back here soon.
1916

20-
#### Special thanks to our supporters
17+
## Special thanks to our supporters
2118

2219
<img width="50px" src="/img/nsf.jpg"/>
2320
<img width="50px" src="/img/datacite.png"/>

content/crosswalk/cargo.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
1+
---
2+
title: "Crosswalk for Rust Package Manager"
3+
image: "/img/cargo.png"
4+
date: "2020-03-05"
5+
---
6+
7+
[Cargo](https://doc.rust-lang.org/cargo/) is the Rust package manager.
8+
9+
The `Cargo.toml` file for each package is called its manifest. Cargo uses the metadata in the manifest to download a Rust package’s dependencies, compile packages, make distributable packages, and upload them to crates.io, the Rust community’s package registry.
10+
11+
The manifest format is described in [the Cargo Book](https://doc.rust-lang.org/cargo/reference/manifest.html).
12+
13+
{{< crosswalk name="Rust Package Manager" >}}

content/crosswalk/dcat-2.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
---
2+
title: "Crosswalk for DCAT 2 vocabulary"
3+
image: "/img/w3c.png"
4+
date: "2019-04-23"
5+
---
6+
7+
Data Catalog Vocabulary (DCAT) is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web.
8+
9+
[DCAT 2](https://www.w3.org/TR/vocab-dcat-2/) is superseded by DCAT 3,
10+
but DCAT 3 does not make DCAT 2 obsolete.
11+
DCAT 3 terms preserve backward compatibility with DCAT 2.
12+
13+
{{< crosswalk name="DCAT-2" >}}
14+
15+
See crosswalk for [DCAT 3](dcat-3).

content/crosswalk/dcat-3.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
1+
---
2+
title: "Crosswalk for DCAT 3 vocabulary"
3+
image: "/img/w3c.png"
4+
date: "2025-04-24"
5+
---
6+
7+
Data Catalog Vocabulary (DCAT) is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web.
8+
9+
[DCAT 3](https://www.w3.org/TR/vocab-dcat-3/) is a W3C Recommendation.
10+
DCAT 3 supersedes DCAT 2, but DCAT 3 does not make DCAT 2 obsolete.
11+
DCAT 3 terms preserve backward compatibility with DCAT 2.
12+
13+
{{< crosswalk name="DCAT-3" >}}
14+
15+
See crosswalk for [DCAT 2](dcat-2).

content/crosswalk/trove.md

Lines changed: 1 addition & 2 deletions
@@ -3,7 +3,6 @@ title: "Crosswalk for Trove Software Map"
33
image: "/img/sourceforge.png"
44
---
55

6-
[Trove Software Map](https://sourceforge.net/p/easyhtml5/tracinst/Software%20Map%20and%20Trove/#what-is-trove). Trove is the system used by SourceForge.net to classify software projects.
7-
6+
[Trove Software Map](https://sourceforge.net/p/easyhtml5/tracinst/Software%20Map%20and%20Trove/#what-is-trove). Trove is the system used by SourceForge.net to classify software projects.
87

98
{{< crosswalk name="Trove Software Map" >}}

content/developer-guide.md

Lines changed: 18 additions & 16 deletions
@@ -7,7 +7,7 @@ methodology for creating software package descriptions. For example, this guide
77
designing software to generate or read CodeMeta JSON software descriptions.
88

99
Users that only require instructions for manually creating CodeMeta software descriptions may wish to
10-
review the upcoming [User Guide](/user-guide/).
10+
review the upcoming [User Guide](/user-guide/).
1111

1212
## CodeMeta Overview
1313

@@ -47,43 +47,46 @@ An example usage of the CodeMeta document is for the author of research software
4747

4848
The producer of a CodeMeta Document, i.e. the creators of the software, must use the JSON names from the CodeMeta context file. The consumer of the CodeMeta Document can use these same JSON names from the CodeMeta Document for any necessary processing tasks.
4949

50-
As an alternative to using the producer-supplied JSON names, the consumer can use the [JSON-LD API](https://www.w3.org/TR/json-ld-api/) to translate the JSON names to their own local JSON names that may be in use by their local processing scripts. This is done by first using the JSON-LD *expand* function that replaces each JSON name in the CodeMeta Document with its corresponding IRI from the CodeMeta context file. For example, the producer's CodeMeta Document may contain the following line:
50+
As an alternative to using the producer-supplied JSON names, the consumer can use the [JSON-LD API](https://www.w3.org/TR/json-ld-api/) to translate the JSON names to their own local JSON names that may be in use by their local processing scripts. This is done by first using the JSON-LD *expand* function that replaces each JSON name in the CodeMeta Document with its corresponding IRI from the CodeMeta context file. For example, the producer's CodeMeta Document may contain the following line:
5151

52+
```json
5253
"codeRepository": "https://github.com/DataONEorg/rdataone"
54+
```
5355

5456
Using the JSON-LD API *expand* function, this is converted to:
5557

56-
"http://schema.org/codeRepository": "https://github.com/DataONEorg/rdataone
58+
```json
59+
"http://schema.org/codeRepository": "https://github.com/DataONEorg/rdataone"
60+
```
5761

5862
Next, the consumer can use their own context file that maps from each IRI to their own local JSON names. For example, the consumer may have a context that maps the local JSON name 'repository' (as in `package.json` documents used by NPM, see [/crosswalk/node/]) to "http://schema.org/codeRepository", so using the JSON-LD API *compact* function would result in a new CodeMeta Document with the entry:
5963

64+
```json
6065
"repository": "https://github.com/DataONEorg/rdataone"
66+
```
6167

6268
When the CodeMeta Document has been compacted, it can then be used by the consumer for any necessary processing, using the local JSON names.
6369

6470
Note that this expansion and compaction process assumes that both the producer and consumer JSON-LD context files share overlapping sets of IRIs.
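
The expand-then-compact round trip described above can be sketched in a few lines of Python. This is only a toy stand-in for the JSON-LD API, assuming a flat document and contexts that map each JSON name directly to an IRI; the helper names `expand` and `compact` and both context dictionaries are hypothetical, and a real JSON-LD processor also handles nested objects, arrays, type coercion, and remote contexts.

```python
# Toy illustration of JSON-LD expand/compact for flat documents only.
# It just renames keys; it is not the JSON-LD API algorithm.

def expand(doc, context):
    """Replace each JSON name with its IRI from the context.

    Names that the context does not define are dropped.
    """
    return {context[name]: value for name, value in doc.items() if name in context}


def compact(expanded, context):
    """Replace each IRI with the local JSON name defined in the context.

    IRIs that the context does not define are kept as-is.
    """
    iri_to_name = {iri: name for name, iri in context.items()}
    return {iri_to_name.get(iri, iri): value for iri, value in expanded.items()}


# Hypothetical producer and consumer contexts (JSON name -> IRI).
producer_context = {"codeRepository": "http://schema.org/codeRepository"}
consumer_context = {"repository": "http://schema.org/codeRepository"}

producer_doc = {"codeRepository": "https://github.com/DataONEorg/rdataone"}

expanded = expand(producer_doc, producer_context)
consumer_doc = compact(expanded, consumer_context)

print(expanded)
print(consumer_doc)
```

The round trip only works because both contexts define the IRI `http://schema.org/codeRepository`; a name whose IRI is missing from the consumer context would stay in its expanded form.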
6571

66-
6772
## Crosswalk Table
6873

6974
## Tools and Integrations
7075

71-
7276
To facilitate automated ingest of research software into repositories such as [figshare](https://figshare.com/), [Zenodo](https://zenodo.org/), the [Knowledge Network for Biocomplexity](https://knb.ecoinformatics.org/) and others, these repositories will update
7377
their submission processes to use the CodeMeta Document, which will provide the metadata necessary for the submission and indexing of the software.
7478

7579
Tools will be created that assist in the generation of CodeMeta documents. For example, a tool written in the R language would generate a CodeMeta document from an R package that was authored to support a research project, automatically collecting available metadata and possibly prompting the user for any additional required metadata. The CodeMeta document would then be used to assist in publishing the software to a repository. An example CodeMeta document is shown in Appendix C.
7680

7781
## Generating Citations from CodeMeta documents
7882

79-
[ TBD ]
80-
83+
[ TBD ]
8184

8285
## Extending the CodeMeta Context
8386

84-
CodeMeta explicitly defines the terms it uses from <http://schema.org>, rather than merely extending <http://schema.org> with a few additional terms. To use additional terms from <http://schema.org> not listed on the [terms page](/terms/) (or terms from any other context), you must extend your context appropriately. For instance, to combine codemeta (v2.0-rc) with all terms available in schema.org, you would do:
87+
CodeMeta explicitly defines the terms it uses from <http://schema.org>, rather than merely extending <http://schema.org> with a few additional terms. To use additional terms from <http://schema.org> not listed on the [terms page](/terms/) (or terms from any other context), you must extend your context appropriately. For instance, to combine codemeta (v2.0-rc) with all terms available in schema.org, you would do:
8588

86-
```
89+
```json
8790
"@context": ["https://raw.githubusercontent.com/codemeta/codemeta/2.0-rc/codemeta.jsonld", "http://schema.org/"]
8891
```
8992

@@ -96,7 +99,7 @@ The intent of JSON-LD is to provide a mechanism to represent linked data using s
9699
For example, in the CodeMeta document, the JSON-LD "@id" keyword is used to associate an IRI with a JSON object. When the JSON-LD CodeMeta document is serialized to RDF, this becomes the graph node identifier, or the subject of the resulting RDF triple. If an @id is not specified for a JSON object, then a blank node identifier is assigned to the resulting graph node for the output RDF graph. The JSON object `role` from the example
97100
CodeMeta document:
98101

99-
```
102+
```json
100103
"roleCode":[
101104
"originator",
102105
...
@@ -105,19 +108,19 @@ CodeMeta document:
105108

106109
is serialized to RDF as:
107110

108-
```
111+
```n3
109112
_:b1 <https://codemeta.github.io/terms/roleCode> "originator" .
110113
```
111114

112115
When the JSON-LD "@type" keyword is applied to a simple JSON type, the serialized RDF will have that type appended to the object, for example, the following entry from the example CodeMeta document:
113116

114-
```
117+
```json
115118
"dateCreated":"2016-05-27"
116119
```
117120

118121
is serialized to the following RDF ([N-Triples format](https://www.w3.org/TR/n-triples/)):
119122

120-
```
123+
```n3
121124
_:b0 <http://schema.org/dateCreated> "2016-05-27"^^<http://www.w3.org/2001/XMLSchema#dateTime> .
122125
```
123126

@@ -126,7 +129,7 @@ In this case, the "@type" was specified in the context file.
126129
When the JSON-LD "@type" is applied to a JSON object, the type information is serialized to RDF with
127130
an RDF type statement, for example, this JSON object from the sample CodeMeta document:
128131

129-
```
132+
```json
130133
"author":[
131134
{
132135
"@id":"http://orcid.org/0000-0002-3957-2474",
@@ -139,10 +142,9 @@ an RDF type statement, for example, this JSON object from the sample CodeMeta do
139142

140143
is serialized to RDF as:
141144

142-
```
145+
```n3
143146
<http://orcid.org/0000-0002-3957-2474> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Person> .
144147
145148
```
146149

147150
This example shows the "@type" keyword being used in the CodeMeta document.
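
The subject and type rules described above can be sketched as a small Python function. This is a simplified illustration, not a JSON-LD-to-RDF serializer: the function name `type_triples`, the `type_iris` lookup, and the assumption that the author object carries `"@type": "Person"` are all hypothetical. It emits only `rdf:type` statements, using the `@id` as the subject when present and a blank node identifier otherwise.

```python
import itertools

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def type_triples(nodes, type_iris):
    """Return one N-Triples rdf:type statement per typed node.

    nodes: list of dicts that may carry "@id" and "@type".
    type_iris: hypothetical lookup from a short type name to its IRI.
    """
    blank_ids = itertools.count()
    lines = []
    for node in nodes:
        if "@id" in node:
            # An explicit @id becomes the subject IRI.
            subject = f"<{node['@id']}>"
        else:
            # Without an @id, the node gets a blank node identifier.
            subject = f"_:b{next(blank_ids)}"
        if "@type" in node:
            type_iri = type_iris[node["@type"]]
            lines.append(f"{subject} <{RDF_TYPE}> <{type_iri}> .")
    return lines


# Assumed input: one author with an ORCID @id, one without any @id.
authors = [
    {"@id": "http://orcid.org/0000-0002-3957-2474", "@type": "Person"},
    {"@type": "Person"},
]
triples = type_triples(authors, {"Person": "http://schema.org/Person"})
print("\n".join(triples))
```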
148-

content/jsonld.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
11
---
2-
title: "The Codemeta JSON-LD Representation"
2+
title: "The CodeMeta JSON-LD Representation"
33
---
44

55

content/terms.md

Lines changed: 2 additions & 3 deletions
@@ -1,6 +1,5 @@
11
---
2-
title: Codemeta Terms
3-
2+
title: CodeMeta Terms
43
---
54

65
## Terms from Schema.org
@@ -13,7 +12,7 @@ These terms are all recognized properties of <https://schema.org/SoftwareSourceC
1312

1413
{{< properties-description matchParentType="schema:(Person|Thing|Review|Role)">}}
1514

16-
## Codemeta terms
15+
## CodeMeta terms
1716

1817
The CodeMeta project also introduces the following additional properties, which lack clear equivalents in <https://schema.org> but can play an important role in software metadata records covered by the CodeMeta crosswalk.
1918

content/tools.md

Lines changed: 5 additions & 10 deletions
@@ -1,10 +1,10 @@
1-
### Tools
1+
# Tools
22

33
This page lists some existing tools to help with CodeMeta files.
44

5-
#### File Generation
5+
## File Generation
66

7-
Some of the early tools still need a little updating to use the latest version of the codemeta context.
7+
Some of the early tools still need a little updating to use the latest version of the codemeta context.
88

99
{.table .table-striped}
1010

@@ -23,26 +23,21 @@ tool | language | codemeta version | maintainer | notes
2323
[AutoCodemeta Generator](https://w3id.org/autocodemeta) | Javascript | 3.0.0 | [dgarijo](http://github.com/dgarijo) | Optimized version of CodeMeta Generator that automatically creates a codemeta file from a given repository
2424
[Somef](https://github.com/KnowledgeCaptureAndDiscovery/somef) | Python | OEG developers | [dgarijo](http://github.com/dgarijo) | Tool that automatically extracts software metadata from repositories and scientific publications.
2525

26-
#### Integrations
27-
26+
## Integrations
2827

2928
Integrations indicate existing platforms & services which understand CodeMeta descriptions. These do not provide a user-facing software tool for generating codemeta.json, but can ingest
3029
existing codemeta.json files automatically.
3130

3231
{.table .table-striped}
3332

34-
Name | Description | Authors | Language | Codemeta Version
33+
Name | Description | Authors | Language | CodeMeta Version
3534
-----|-------------|----------|----------|--------------------
3635
[Fidgit](https://github.com/arfon/fidgit): | An ungodly union of GitHub and Figshare | Arfon Smith, Kaitlin Thaney, Mark Hahnel | Ruby | 0.1.0
3736
[Software Heritage](https://docs.softwareheritage.org/devel/swh-indexer/metadata-workflow.html#adding-support-for-additional-ecosystem-specific-metadata)|The metadata indexers | SWH team | Python | 2.0
3837

39-
4038
Pending:
4139

42-
4340
- JOSS
4441
- Zenodo
4542
- DataCite
4643
- Figshare
47-
48-
