Newlines and whitespace in XML literals are removed since v1.0.0 #619

@spacid

In versions 1.0.0 and 1.1.0, the XML triplifier removes newlines and the surrounding whitespace inside element text nodes, even when the option trim-strings=false is set. Versions 0.8.2 and 0.9.0 preserved this whitespace as expected.

To reproduce

Using the following XML file:

test.xml

<root>
    <description>Line 1
    Line 2</description>
</root>

With the following SPARQL query:

test.sparql

PREFIX fx: <http://sparql.xyz/facade-x/ns/>
CONSTRUCT {
  ?s ?p ?o
} WHERE {
  SERVICE <x-sparql-anything:location=test.xml,trim-strings=false> {
    ?s ?p ?o .
    FILTER(isLiteral(?o))
  }
}

A small Python script to test the behavior:

import subprocess

JARS = {
    "v0.8.2": "sparql-anything-0.8.2.jar",
    "v0.9.0": "sparql-anything-0.9.0.jar",
    "v1.0.0": "sparql-anything-v1.0.0.jar",
    "v1.1.0": "sparql-anything-v1.1.0.jar",
}


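def run_sparql_anything(version: str, jar_path: str) -> None:
    """Run test.sparql with the given SPARQL Anything jar and print its output."""
    print(f"{version}\n-------")
    # -q: query file, -f: output format (Turtle)
    cmd = ["java", "-jar", jar_path, "-q", "test.sparql", "-f", "TTL"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        print(result.stdout)
    except subprocess.CalledProcessError as e:
        # Non-zero exit code: include stderr so the cause of the failure is visible
        print(f"Error in {version}: {e}\n{e.stderr}")
    except OSError as e:
        # For example, java is not on PATH
        print(f"Error in {version}: {e}")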


if __name__ == "__main__":
    for version, jar_path in JARS.items():
        run_sparql_anything(version, jar_path)

The output:

v0.8.2
-------
@prefix fx: <http://sparql.xyz/facade-x/ns/> .

[ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
          "Line 1\n    Line 2" ] .

v0.9.0
-------
@prefix fx: <http://sparql.xyz/facade-x/ns/> .

[ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
          "Line 1\n    Line 2" ] .

v1.0.0
-------
PREFIX fx: <http://sparql.xyz/facade-x/ns/>

[ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
          "Line 1Line 2" ] .

v1.1.0
-------
PREFIX fx: <http://sparql.xyz/facade-x/ns/>

[ <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1>
          "Line 1Line 2" ] .
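
For reference, the expected literal can be cross-checked without SPARQL Anything. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`, which keeps whitespace in text nodes as-is. The helper name `description_text` and the inlined `XML` constant are introduced here for illustration only. The result matches the literal produced by v0.8.2 and v0.9.0.

```python
import xml.etree.ElementTree as ET

# Same content as test.xml above, inlined so the check is self-contained.
XML = """<root>
    <description>Line 1
    Line 2</description>
</root>
"""


def description_text(xml_source: str) -> str:
    """Return the raw text of the <description> element, whitespace included."""
    root = ET.fromstring(xml_source)
    return root.find("description").text


if __name__ == "__main__":
    # repr() makes the newline and the indentation visible
    print(repr(description_text(XML)))
```

The text node contains the newline followed by the four-space indentation of the second line, which is the part that is missing from the v1.0.0 and v1.1.0 output.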

Labels: Bug (Something isn't working)