Add sizes of software artifact #1259

Open
bact wants to merge 2 commits into spdx:develop from bact:software-artifact-size


@bact bact commented Apr 22, 2026

The size of a software artifact can be measured in different ways.

Apart from conventional bytes, audiovisual content is more meaningfully measured in duration (time), and textual AI training data in number of tokens.

These additions align with regulatory requirements, for example the EU AI Act.

Property names are closely aligned with other recognized standards/vocabs.

Note that renaming /Software/artifactSize to /Software/byteSize is not a breaking change, because the property was introduced in 3.1 and 3.1 has not been released yet.

Resolves #1258 (see more background and rationale there).

Signed-off-by: Arthit Suriyawongkul <arthit@gmail.com>
@bact bact added Profile:Software Software profile and related matters Profile:Dataset Dataset profile and related matters labels Apr 22, 2026
bact commented Apr 22, 2026

@JPEWdev this PR will rename /Software/artifactSize (introduced by your PR #966) to /Software/byteSize (closely resembling sizeInBytes, the name used in that PR's first commit), since the size of an artifact can be measured in different ways.

Please see if the new name makes sense to you. Thank you.


JPEWdev commented Apr 22, 2026

How would you know what the units are?


bact commented Apr 22, 2026

> How would you know what the units are?

Apart from renaming /Software/artifactSize to /Software/byteSize, which puts the unit in the property name (since byte is no longer the only unit we deal with), this PR also introduces three new properties.

  • tokenCount -- unit is token count
  • itemCount -- unit is item count
  • contentDuration -- unit is by definition that of xsd:duration

See details in #1258
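
As an illustration of how the four size properties could sit side by side, here is a minimal Python sketch. Only the property names (byteSize, tokenCount, itemCount, contentDuration) come from this PR. The `ArtifactSizes` class, the flat dictionary output, the non-negative check on the counts, and the simplified `xsd:duration` pattern are all assumptions made for the example, not the SPDX model or its serialization.

```python
import re
from dataclasses import dataclass, asdict
from typing import Optional

# Simplified check for the xsd:duration lexical form (e.g. "PT1H30M", "P2DT3H").
# Assumption: good enough for illustration; it is not a full validator.
_XSD_DURATION = re.compile(
    r"^-?P(?=\d|T\d)(\d+Y)?(\d+M)?(\d+D)?"
    r"(T(?=\d)(\d+H)?(\d+M)?(\d+(\.\d+)?S)?)?$"
)

# Maps the hypothetical Python field names to the property names in this PR.
_PROPERTY_NAMES = {
    "byte_size": "byteSize",
    "token_count": "tokenCount",
    "item_count": "itemCount",
    "content_duration": "contentDuration",
}


@dataclass
class ArtifactSizes:
    """Hypothetical holder for the size properties of one artifact."""

    byte_size: Optional[int] = None         # unit: bytes
    token_count: Optional[int] = None       # unit: tokens
    item_count: Optional[int] = None        # unit: items (dimensionless count)
    content_duration: Optional[str] = None  # xsd:duration string

    def __post_init__(self) -> None:
        # Assumption: counts and byte sizes are never negative.
        for name in ("byte_size", "token_count", "item_count"):
            value = getattr(self, name)
            if value is not None and value < 0:
                raise ValueError(f"{name} must not be negative")
        if self.content_duration is not None and not _XSD_DURATION.match(
            self.content_duration
        ):
            raise ValueError("content_duration is not an xsd:duration string")

    def to_properties(self) -> dict:
        """Return only the properties that are set, keyed by property name."""
        return {
            _PROPERTY_NAMES[key]: value
            for key, value in asdict(self).items()
            if value is not None
        }
```

Because each property carries its unit in its name, a single artifact can report more than one measure at once, for example `ArtifactSizes(byte_size=2048, token_count=512).to_properties()`.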


JPEWdev commented Apr 22, 2026

Oh, I'm sorry I read this completely backwards. My apologies. This LGTM

@bact bact requested review from kestewart and stevenc-stb April 23, 2026 00:45
Signed-off-by: Arthit Suriyawongkul <arthit@gmail.com>

@stevenc-stb stevenc-stb left a comment


Some text in the description of itemCount is unclear.

Constituent items can be stored within a database, embedded in a container
format, or represented as encoded binaries within a single file.

The unit of count is not encoded within this property.

@stevenc-stb stevenc-stb Apr 23, 2026


itemCount is inherently a dimensionless count. Your description already states:
"property records the total number of discrete constituent items contained within a software artifact."
That already defines what is being counted.
Do you mean the "discrete constituent item type" as the unit of count? Can we add an example that shows this, e.g. "itemCount type: files" or "itemCount type: records"?

@bact bact (Collaborator, Author) replied:

Thanks. I put an example for "files" here: #1258 (comment)

When the property is used with a `/Software/File` element where the `fileKind`
is "directory", and the unit of count is not otherwise specified in the
`description` property, the unit of item shall be "file" and the property shall
record the number of immediate child files (including regular files,
symbolic links, and subdirectories) contained within that directory.
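
For a directory on a local filesystem, the rule above could be sketched in Python as follows. The function name `directory_item_count` is hypothetical, and this is only one reading of "immediate child files": every directory entry is counted once, with no recursion into subdirectories and no following of symbolic links.

```python
import os


def directory_item_count(path: str) -> int:
    """Count the immediate children of a directory.

    Each entry (regular file, symbolic link, subdirectory, or other
    file type) counts as one item. Contents of subdirectories are not
    counted, and the special entries "." and ".." are never included.
    """
    with os.scandir(path) as entries:
        return sum(1 for _ in entries)
```

With this reading, a directory holding one regular file and one subdirectory has an item count of 2, regardless of how many files the subdirectory itself contains.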


Development

Successfully merging this pull request may close these issues.

Add non-byte size for software artifact
