software_normalized retains case variants (e.g. MATLAB / Matlab) #45

The model extraction appears to produce multiple distinct software_normalized values that all refer to the same software. For example, in the 5% dataset, I found several case variants of MATLAB:

<img width="199" height="259" alt="Image" src="https://github.com/user-attachments/assets/0bac5643-6b1c-4de4-895c-0f380fc85bc0" />

These all clearly refer to the same software. A quick fix could be lower-casing the software_normalized values during extraction, which would collapse many of these variants into a single name and make downstream grouping/filtering more reliable.
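As a rough illustration of the proposed fix, here is a minimal sketch of case-insensitive normalization. The function names `normalize_software_name` and `group_case_variants` are hypothetical and not part of the existing extraction code, and the sample values are illustrative only (just MATLAB / Matlab come from this issue). It uses `str.casefold()` rather than `str.lower()`, and also strips surrounding whitespace, which is an extra assumption beyond what is proposed above.

```python
from collections import defaultdict


def normalize_software_name(name: str) -> str:
    """Return a case-insensitive key for a software_normalized value.

    Hypothetical helper: casefold() is a more aggressive variant of
    lower() intended for caseless matching. Stripping whitespace is an
    additional assumption, not something requested in the issue.
    """
    return name.strip().casefold()


def group_case_variants(names):
    """Map each normalized key to the sorted list of raw variants seen."""
    groups = defaultdict(set)
    for name in names:
        groups[normalize_software_name(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items()}


# Illustrative values only; the real variants are in the screenshot above.
observed = ["MATLAB", "Matlab", "matlab", "R"]
groups = group_case_variants(observed)
```

Grouping before overwriting makes it possible to review which raw variants would be merged. Note that lower-casing discards the canonical display form (e.g. "MATLAB"), so it may be worth keeping the original value in a separate column if the display name matters downstream.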


