XSD2SHACL: Capturing RDF Constraints from XML Schema

SHACL shapes describe the constraints of RDF subgraphs, which are often constructed from heterogeneous data sources such as relational databases, JSON, and XML. These sources often already have constraints defined in their schemas, e.g., JSON Schema for JSON or XSD for XML, but this information is ignored when the RDF graph is constructed, as few works currently translate such schemas into SHACL. In this paper, we focus on incorporating the XSD constraints of XML data sources into SHACL shapes. We define a translation from XSD to SHACL and provide a corresponding system. We compare our solution with XMLSchema2ShEx, which translates XSD constraints to ShEx, and validate it against two use cases. Our solution produces the desired SHACL shapes in a reasonable time. This allows us to derive SHACL shapes for the original raw data automatically, without any manual effort.

Keywords: SHACL, XML Schema, Validation, RDF Shapes