fix(json-schema): fix bugs, add missing conversions, and improve parity #3109
AbdelrahmanHafez wants to merge 39 commits into
…ies, and preferences re hapijs#3108
…d preferences
- Fix alternatives.match('all') producing anyOf instead of allOf
- Skip rule jsonSchema handlers when args contain Refs
- Handle _invalids as not: { enum: [...] }
- Exclude forbidden keys and stripped keys (output mode) from properties
- Add all object dependency handlers (with, without, and, nand, or, xor, oxor)
- Add string.alphanum pattern and string.domain pattern with options
- Add number.precision as multipleOf
- Add date.timestamp with ECMA-262 range limits
- Register id'd child schemas in $defs for Joi.link() without .shared()
- Respect preferences: allowUnknown, stripUnknown, presence, noDefaults
re hapijs#3108
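
The `date.timestamp` range limits mentioned in the list above come from ECMA-262, which restricts time values to ±100,000,000 days around the epoch. Below is a minimal sketch of how those bounds work out. It assumes they are emitted as `minimum`/`maximum`; the helper name `timestampSchema` and the exact keywords are illustrative assumptions, not taken from the PR.

```javascript
// ECMA-262 allows time values up to 100,000,000 days on either side of the epoch.
const MS_PER_DAY = 24 * 60 * 60 * 1000;   // 86,400,000 ms
const MAX_TIME_MS = 1e8 * MS_PER_DAY;     // 8.64e15 ms

// Hypothetical helper: numeric schema for a Joi timestamp format.
// 'javascript' timestamps are in milliseconds, 'unix' timestamps in seconds.
function timestampSchema(format) {
    const max = format === 'unix' ? MAX_TIME_MS / 1000 : MAX_TIME_MS;
    return { type: 'number', minimum: -max, maximum: max };
}

console.log(timestampSchema('javascript'));
console.log(timestampSchema('unix'));

// The bound is the largest value a JS Date accepts; one millisecond more is invalid.
console.log(Number.isNaN(new Date(MAX_TIME_MS).getTime()));      // false
console.log(Number.isNaN(new Date(MAX_TIME_MS + 1).getTime()));  // true
```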
…tripUnknown edge case re hapijs#3108
…own object check re hapijs#3108
…ng, and domain TLD handling re hapijs#3108
I'm stress testing this still to find any potential edge cases, will work on it in the upcoming few days. Marking as draft for now, will convert back to ready-for-review when done.

Just judging by the size of your PR, is your AI leading you astray? That's really a lot of additional code. I'm grateful, but I'm starting to wonder if all of it is vital for a proper json-schema.

Hi @Marsup, agreed the scope grew past what #3108 strictly asked for. I used Codex GPT 5.4 on this, mostly for filling in parity coverage once I started chasing edge cases. Happy to split this into smaller PRs if that helps review, just let me know which subset you'd want to tackle first (the original #3108 fixes, the refactor into …

For context on the size: I tried to make the JSON Schema input/output behave as closely to Joi's runtime as possible, and documented the per-decision rationales in the PR body. The conversion code is isolated under …
re #3108
This PR addresses the bugs and missing conversions reported in #3108, plus several additional gaps I found while working through the codebase.
- `alternatives.match('all')` was mapped to `anyOf` instead of `allOf`
- `jsonSchema` handlers were called even when rule args contained refs, which could produce invalid schemas (e.g. `string.min(Joi.ref('x'))` throwing instead of being gracefully skipped)

Missing conversions added:
- `number.precision()` -> `multipleOf`
- `string.alphanum()` -> `pattern: '^[a-zA-Z0-9]+$'`
- `string.token()`, `string.hex()`, `string.base64()`, and `string.dataUri()` -> validating `pattern` output instead of non-standard `format` values, with option-aware parity for `hex()`/`base64()`/`dataUri()`
- `string.ip()` -> full regex `pattern` matching `@hapi/address` behavior, including CIDR/version options and IPvFuture literals
- `string.hostname()` -> hostname-or-IP `pattern` matching Joi runtime instead of bare `format: 'hostname'`
- `string.domain()` -> full regex pattern matching `@hapi/address` behavior, with support for all domain options (`minDomainSegments`, `maxDomainSegments`, `allowUnderscore`, `allowFullyQualified`, `allowUnicode`, `tlds` allow/deny). Uses pattern instead of `format: 'hostname'` since hostname accepts single-label names while Joi's `domain()` requires at least 2 segments. This now also covers astral Unicode labels and punycode-backed Unicode TLD display variants that round-trip to the same canonical ASCII TLD.
- `date.timestamp('javascript')` and `date.timestamp('unix')` -> `type: 'number'` with ECMA-262 §21.4.1.1 range bounds (±100 million days)
- `binary` encoding flag -> `contentEncoding`, with standard transfer-encoding mapping (`hex` -> `base16`, `base64`/`base64url` preserved, charset-style Node encodings omitted)
- `invalid()` values -> `not: { enum: [...] }`, composable via `allOf` when combined with other `not`-based constraints, including `invalid(null)` when the schema could otherwise accept `null`
- `.example()` values -> `examples` array
- `.meta()` -> supported JSON Schema annotation keywords (`title`, `format`, `readOnly`, `writeOnly`, `deprecated`, `examples`, `$comment`, `contentEncoding`, `contentMediaType`, `contentSchema`), with deduplication when meta examples overlap with `.example()` values

Object dependencies, all 7 dependency types are now converted:
- `with()` -> `dependentRequired`
- `without()` -> `dependentSchemas` with `properties: { peer: false }`
- `and()` -> bidirectional `dependentRequired` (each peer requires all others)
- `nand()` -> `not: { properties: { ...peers: true }, required: [...peers] }`
- `or()` -> `anyOf` with `required` per peer
- `xor()` -> `oneOf` with `required` per peer
- `oxor()` -> `oneOf` with a "none present" branch + one branch per peer

Multiple dependencies of the same type compose via `allOf`. All representations are AJV strict mode compatible (`strictRequired`, `strictTypes`).

Preferences support, schema-level preferences (via `.prefs()`) now propagate to JSON Schema output:

- `presence: 'required'` -> marks all properties as required
- `presence: 'forbidden'` -> root schema emits `false`; nested property presence uses the child schema's effective prefs so nested forbidden-presence is preserved correctly
- `allowUnknown: true` -> omits `additionalProperties: false`
- `stripUnknown: true` -> emits `additionalProperties: false` in output mode only (input mode accepts unknowns, output mode has them stripped). Correctly handles `stripUnknown: { arrays: true }` without affecting object properties.
- `noDefaults: true` -> suppresses required marking for properties with defaults in output mode

Other fixes:
- … `false` property schemas, so they stay forbidden even when unknown keys are otherwise allowed
- `allow()`/`valid()` exceptions now emit enum branches whenever the base schema would otherwise reject an explicitly allowed value, including same-type conflicts like `string().min(5).allow('abc')` and `object().min(1).allow({})`
- … `allOf` instead of overwriting each other, so combinations like `.pattern(/foo/).hostname()` preserve every active constraint
- `strip` result flag -> output schema uses `false` property schemas for stripped declared keys, while still keeping `$defs` intact for linked child schemas
- `Joi.link('#id')` without `.shared()` now correctly registers the linked schema in `$defs` (previously produced broken `$ref` pointing to nonexistent `$defs` entry)
- `when()` conditionals inside `Joi.object({...})` now hoist to object-level `if`/`then`/`else` (or `allOf` for multiple conditionals), which preserves cross-field linkage for literal sibling refs, simple object paths like `settings.mode`, and literal `switch` branches on hoistable ref paths instead of widening them to child-local `anyOf`; more complex shapes such as schema conditions, non-literal `switch` predicates, adjusted/mapped refs, and fixed array-index refs still intentionally fall back to the lossy `anyOf` approximation for now
- … `Joi.extend()` now inherit the base JSON Schema `type`, so renamed built-in types like `base: Joi.string()`/`number()`/`array()`/`object()` still emit the correct standard JSON Schema type, including when nested inside other schemas or combined with prefs
- `valid()`/`only()` output now preserves Joi semantics more closely: `null` stays in exclusive enums, conflicting base validators are only retained when every allowed value still satisfies them, and mixed enums that include objects fall back to enum-only output instead of emitting unsound `type` constraints
- `Joi.date()` `valid()`/`allow()`/`invalid()` values now emit canonical JSON-native enum values (`date-time` strings, millisecond timestamps, or unix timestamps depending on format), which keeps the emitted schema portable and aligned with Joi's accepted date inputs
- `array().ordered(...)` now emits `minItems` from the last explicitly required ordered position instead of the total ordered length, preserving Joi's optional ordered-slot semantics while still capping tuple-only arrays with `maxItems`

All JSON Schema output is covered by tests against AJV with Draft 2020-12. Standard JSON Schema formats are exercised through `ajv-formats`, while explicit custom test-only formats still go through the helper's custom format allowlist. Most schemas are validated with strict mode enabled; optional `ordered()` tuple positions intentionally use `strictTuples: false`, since they are valid JSON Schema but AJV's strict tuple lint expects fully required tuples. 100% code coverage maintained.

I realize this might be a lot of changes, but I think the conversions make sense and add support for more complex schemas. Feedback welcome.
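
To make a few of the object dependency mappings above concrete, here is a minimal sketch of the JSON Schema fragments they describe. The helper names (`andDeps`, `orDeps`, `xorDeps`, `nandDeps`) are hypothetical and are not the PR's actual code; only the emitted shapes follow the description above.

```javascript
// Hypothetical helpers (not taken from the PR) that build the JSON Schema
// fragments described in the dependency list.

// and(): each peer requires all the others (bidirectional dependentRequired).
function andDeps(peers) {
    const dependentRequired = {};
    for (const peer of peers) {
        dependentRequired[peer] = peers.filter((other) => other !== peer);
    }
    return { dependentRequired };
}

// or(): at least one peer must be present.
function orDeps(peers) {
    return { anyOf: peers.map((peer) => ({ required: [peer] })) };
}

// xor(): exactly one peer must be present, since oneOf rejects multiple matches.
function xorDeps(peers) {
    return { oneOf: peers.map((peer) => ({ required: [peer] })) };
}

// nand(): the peers must not all be present at the same time.
function nandDeps(peers) {
    const properties = Object.fromEntries(peers.map((peer) => [peer, true]));
    return { not: { properties, required: [...peers] } };
}

console.log(JSON.stringify(andDeps(['a', 'b'])));
// {"dependentRequired":{"a":["b"],"b":["a"]}}
console.log(JSON.stringify(nandDeps(['a', 'b'])));
// {"not":{"properties":{"a":true,"b":true},"required":["a","b"]}}
```

Declaring each peer under `properties` in the `nand()` shape is presumably what keeps it compatible with AJV's `strictRequired` check, which expects every key listed in `required` to be defined in `properties`.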
One note on philosophy: I tried to keep the emitted JSON Schema as standard as possible while still matching Joi runtime behavior as closely as possible. There are a few places where JSON Schema is inherently a lossy target compared to Joi, so in those cases I preferred the smallest honest approximation over something overly clever. One concrete example is raw `Joi.date()`, which is more permissive than `format: 'date-time'` because its string acceptance follows JS date parsing semantics.

I also tried to keep the built-in output standard for the selected target. The one built-in OpenAPI-ism I found while working through this was `format: 'binary'` on `Joi.binary()`, which is now gone for the `draft-2020-12` target. I did keep `x-constraint` for date comparisons. It's not standard vocabulary, but it is still valid JSON Schema as an extension keyword, and I think it's worth keeping because it preserves useful Joi semantics that the standard vocab can't express cleanly.

Edit: more follow-up parity work landed after the original description.
- `Joi.date()` now emits `type: ['string', 'number']` with ECMA-262 timestamp bounds, which matches Joi's default acceptance of both ISO-ish strings and JS millisecond timestamps more closely
- `Joi.date().iso()` now emits a Joi-derived `pattern` instead of `format: 'date-time'`, since Joi's ISO acceptance is not identical to RFC 3339 / JSON Schema `date-time`
- `default()`/`example()`/`meta({ examples / contentSchema / ... })` annotations now canonicalize `Date` instances into JSON-native values instead of leaking live JS `Date` objects into the emitted schema
- `Joi.date().default('now')` and date function defaults are omitted from JSON Schema output, because they do not have a faithful portable JSON Schema representation
- … `Date` instances through `valid()`/`invalid()` JSON Schema output either; those values are either canonicalized under `Joi.date()` or dropped when there is no sound JSON representation
- `Joi.binary()` no longer emits the OpenAPI `format: 'binary'` for the `draft-2020-12` target, it stays `type: 'string'`, maps `hex` to RFC 4648 `base16`, preserves `base64`/`base64url`, and omits charset-style Node encodings that do not have an honest JSON Schema `contentEncoding` value
- … `ajv-formats` for standard format keywords, so emitted `email`, `uuid`, `uri`, `date-time`, and `duration` formats are exercised as real format validators rather than compile-only passthroughs

Open question:

`@standard-schema/spec` defines `options.target` as required on both `jsonSchema.input` and `jsonSchema.output`, but the current runtime silently defaults to `'draft-2020-12'` when it is omitted. That splits the contract between TS callers (compile error) and JS callers (silent default). Strict is spec-faithful and guards against silent version pinning if more targets land later; permissive is ergonomic and preserves current behavior. Worth a conscious call either way, happy to tighten it if preferred.

Edit: the stacked cleanup/refactor from #3110 is now folded into this branch as well.
lib/json-schema/