Commit 86337dc

Improved external entity resolution
1 parent d61e95d commit 86337dc

9 files changed

Lines changed: 3330 additions & 77 deletions

tests/README.md

Lines changed: 190 additions & 0 deletions
@@ -0,0 +1,190 @@
# TypesXML W3C Test Suite Integration

This directory contains the comprehensive test suite for validating the TypesXML parser against the W3C XML Test Suite.

## Test Files

- **`comprehensive-test-suite.js`** - 🆕 **Main comprehensive test runner** for all W3C XML test cases (includes canonicalizer validation)
- **`setup-test-suite.js`** - Helper script to download and validate the W3C test suite
- **`README.md`** - This documentation file

## Prerequisites

1. **W3C XML Test Suite**: Download and extract the test suite to `../xmltest` relative to the TypesXML project root
   - Available at: <https://dev.w3.org/XInclude-Test-Suite/2001-cpy/XML-Test-Suite/xmlconf/xmltest/>
   - The test suite contains 531 XML test files organized into three categories:
     - `valid/` - Valid XML documents (should parse successfully and match canonical output)
     - `invalid/` - Well-formed but invalid XML documents (should fail validation)
     - `not-wf/` - Not well-formed XML documents (should fail parsing)
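
The three categories translate into three different pass conditions. A minimal sketch of that mapping follows; the `EXPECTATIONS` table, the `isExpectedOutcome` helper, and the `{ parsed, validated }` outcome shape are illustrative assumptions, not part of the TypesXML test runner. The canonical-output comparison for `valid/` is left out here.

```javascript
// Expected parser behaviour per W3C category, as described in the list above.
// The names and the outcome shape are hypothetical, chosen for illustration.
const EXPECTATIONS = {
  'valid': { shouldParse: true, shouldValidate: true },
  'invalid': { shouldParse: true, shouldValidate: false },
  'not-wf': { shouldParse: false, shouldValidate: false },
};

// `outcome` is assumed to look like { parsed: boolean, validated: boolean }.
function isExpectedOutcome(category, outcome) {
  const expected = EXPECTATIONS[category];
  if (!expected) {
    throw new Error(`Unknown category: ${category}`);
  }
  if (!expected.shouldParse) {
    // Not-well-formed input passes the test only when parsing fails.
    return outcome.parsed === false;
  }
  return outcome.parsed === true && outcome.validated === expected.shouldValidate;
}

console.log(isExpectedOutcome('invalid', { parsed: true, validated: false }));
```

Note that a test in `invalid/` or `not-wf/` counts as passed when the parser rejects the document, so a "pass" does not always mean a successful parse.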

## Running Tests

### Quick Setup

```bash
# Set up the test suite (first time only)
npm run test:setup

# Run the comprehensive test suite (recommended)
npm test
```

### Manual Execution

```bash
# Build the project first
npm run build

# Navigate to tests directory
cd tests

# Run the comprehensive test suite (processes ALL test files)
node comprehensive-test-suite.js

# Set up/validate the test suite
node setup-test-suite.js
```

## Comprehensive Test Suite Features

The new `comprehensive-test-suite.js` provides:

### 🎯 **Complete Coverage**

- Tests **all** 531 W3C XML test files
- Validates parsed documents against the expected canonical XML output
- Tests all three categories: valid, invalid, not-well-formed

### 📊 **Advanced Progress Indicators**

- Real-time progress bars during execution
- ETA (estimated time remaining) for long-running tests
- Performance metrics and timing analysis
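
As an illustration of how a progress bar and ETA of this kind can be computed, here is a minimal sketch. The function names `renderProgressBar` and `estimateRemainingMs` are assumptions made for this example; the actual runner may format its output differently.

```javascript
// Render a fixed-width text progress bar such as "[████░░░░] 50% (5/10)".
function renderProgressBar(done, total, width = 40) {
  const ratio = total === 0 ? 1 : Math.min(done / total, 1);
  const filled = Math.round(ratio * width);
  const bar = '█'.repeat(filled) + '░'.repeat(width - filled);
  return `[${bar}] ${Math.round(ratio * 100)}% (${done}/${total})`;
}

// Estimate the remaining milliseconds by assuming the remaining tests take
// the same average time as the tests that have already finished.
function estimateRemainingMs(done, total, elapsedMs) {
  if (done === 0) {
    return null; // no timing data yet, so no estimate
  }
  return Math.round((elapsedMs / done) * (total - done));
}

console.log(renderProgressBar(5, 10, 8));
console.log(estimateRemainingMs(50, 200, 1000));
```

The estimate is a plain average, so it will drift when some test files are much slower than others.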

### 📈 **Comprehensive Reporting**

- Detailed statistics by category
- Error analysis and categorization
- Performance benchmarks
- XML compliance summary
- Saves detailed JSON report (`test-report.json`)
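
The per-category statistics and the overall success rate can be derived with a small aggregation step. The sketch below assumes a hypothetical result shape (`{ passed, failed }` per category) and uses made-up counts; it does not reflect the real layout of `test-report.json`.

```javascript
// Summarise per-category results into totals and a success rate.
// `results` is a hypothetical shape: { [category]: { passed, failed } }.
function summarize(results) {
  let passed = 0;
  let failed = 0;
  for (const category of Object.values(results)) {
    passed += category.passed;
    failed += category.failed;
  }
  const total = passed + failed;
  const rate = total === 0 ? 0 : (passed / total) * 100;
  return {
    total,
    passed,
    failed,
    successRate: Number(rate.toFixed(2)), // percentage, two decimals
    categories: results,
  };
}

// Illustrative counts only, not real TypesXML results.
const report = summarize({
  'valid': { passed: 3, failed: 1 },
  'invalid': { passed: 2, failed: 0 },
  'not-wf': { passed: 1, failed: 1 },
});
console.log(JSON.stringify(report, null, 2));
```

A summary object like this could then be written to disk with Node's `fs.writeFileSync('test-report.json', JSON.stringify(report, null, 2))`.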

### 🚀 **Smart Execution**

- Automatic test suite validation
- Handles large test sets efficiently
- Memory-efficient batch processing
- Graceful error handling
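
One way to combine batch processing with graceful error handling is sketched below. `toBatches`, `runInBatches`, and the `runTest` callback are hypothetical names for this example; the real runner's batching strategy is not documented here. Batching helps mainly when each batch holds file contents or parsed documents that can be released before the next batch starts.

```javascript
// Split a list into consecutive chunks of at most `size` items.
function toBatches(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `runTest` over every file, one batch at a time. A thrown error is
// recorded as a failure for that file instead of aborting the whole run.
function runInBatches(files, batchSize, runTest) {
  const results = [];
  for (const batch of toBatches(files, batchSize)) {
    for (const file of batch) {
      try {
        results.push({ file, passed: Boolean(runTest(file)) });
      } catch (error) {
        results.push({ file, passed: false, error: String(error.message || error) });
      }
    }
  }
  return results;
}

// Usage with a stand-in test function that fails for one file.
const demo = runInBatches(['a.xml', 'b.xml', 'c.xml'], 2, (file) => {
  if (file === 'b.xml') {
    throw new Error('simulated parser crash');
  }
  return true;
});
console.log(demo);
```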

## Sample Output

```
╔══════════════════════════════════════════════════════════════╗
║ TypesXML W3C Comprehensive Test Suite ║
║ ║
║ Testing against the complete W3C XML Test Collection ║
║ This may take several minutes to complete... ║
╚══════════════════════════════════════════════════════════════╝

🔍 Validating test suite availability...
✓ Valid documents: 185 tests found
✓ Invalid documents: 209 tests found
✓ Not-well-formed documents: 137 tests found
📊 Total test files: 531

📋 Testing Valid Documents
Expected: Parse successfully and match canonical output
─────────────────────────────────────────────────────────
Processing 185 valid documents...
Progress: [████████████████████████████████████████] 100% (185/185) - 2341ms
✅ Results: 178/185 passed (96.2%)

📊 OVERALL STATISTICS
════════════════════
Total Test Files: 531
Tests Passed: 492
Tests Failed: 39
Success Rate: 92.65%
Execution Time: 8.45 seconds
```

## Current Status

Based on initial testing with the quick test suite:

### ✅ Working Well

- Basic XML parsing and DOM building
- XML canonicalization (passes the W3C canonical XML specification tests)
- Most simple valid XML documents
- Detection of not-well-formed documents

### 🔧 Areas for Improvement

1. **Attribute Value Escaping**: Some issues with quote handling in attribute values
2. **Processing Instruction Formatting**: Spacing in PI content needs refinement
3. **Entity Handling**: Some complex entity scenarios need work
4. **DTD Support**: External DTD processing could be enhanced
5. **Error Reporting**: Some malformed documents that should be rejected currently parse without error

### 📊 Test Results Summary

- **Canonicalizer**: 4/4 tests passing (100%)
- **Quick Validation**: ~15/20 valid documents passing (~75%)
- **Not-Well-Formed Detection**: Partial success (some edge cases missed)

## Integration Strategy

### Phase 1: Fix Critical Issues ✅

- [x] Implement XML canonicalizer
- [x] Set up test infrastructure
- [ ] Fix attribute value escaping
- [ ] Fix processing instruction formatting

### Phase 2: Enhanced Compliance

- [ ] Improve entity handling
- [ ] Enhance DTD processing
- [ ] Strengthen not-well-formed detection
- [ ] Run full test suite validation

### Phase 3: Repository Integration

- [ ] Achieve >90% success rate on representative tests
- [ ] Add tests to CI/CD pipeline
- [ ] Document test coverage and limitations

## Usage Example

```javascript
const { XMLCanonicalizer } = require('../dist/XMLCanonicalizer.js');
const { DOMBuilder } = require('../dist/DOMBuilder.js');
const { SAXParser } = require('../dist/SAXParser.js');

// Parse an XML document
const parser = new SAXParser();
const builder = new DOMBuilder();
parser.setContentHandler(builder);
parser.parseString('<doc>Hello World</doc>');

// Get canonical form
const document = builder.getDocument();
const canonical = XMLCanonicalizer.canonicalize(document);
console.log(canonical); // Output: <doc>Hello World</doc>
```
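
For tests in `valid/`, a canonical string like the one produced above has to match the expected canonical output exactly. The helper below is a minimal sketch of such a comparison; `compareCanonical` is a hypothetical name and not part of TypesXML. It works on plain strings so that it can be tried without the parser.

```javascript
// Compare two canonical XML strings and report where they first differ.
// The comparison is exact on purpose, since canonical form is character-exact.
function compareCanonical(actual, expected) {
  if (actual === expected) {
    return { match: true };
  }
  const limit = Math.min(actual.length, expected.length);
  let index = 0;
  while (index < limit && actual[index] === expected[index]) {
    index++;
  }
  return {
    match: false,
    index, // position of the first differing character
    actual: actual.slice(index, index + 20),
    expected: expected.slice(index, index + 20),
  };
}

console.log(compareCanonical('<doc>Hello</doc>', '<doc>Hallo</doc>'));
```

Reporting the first differing position makes it easier to spot problems such as quote escaping in attribute values or spacing in processing instructions.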

## Contributing

When modifying the XML parser:

1. Run `node quick-test.js` to check for regressions
2. Address any new test failures
3. Consider running the full test suite for major changes
4. Update this README with any significant changes to test results

## References

- [W3C XML Test Suite](https://dev.w3.org/XInclude-Test-Suite/2001-cpy/XML-Test-Suite/xmlconf/xmltest/readme.html)
- [Canonical XML Specification](https://dev.w3.org/XInclude-Test-Suite/2001-cpy/XML-Test-Suite/xmlconf/xmltest/canonxml.html)
- [XML 1.0 Specification](https://www.w3.org/TR/xml/)
