|
1 | | -# YourFunctionName |
2 | | -*Brief description of what this evaluation function does, from the developer perspective* |
| 1 | +# FSA Evaluation Function |
3 | 2 |
|
4 | | -## Inputs |
5 | | -*Specific input parameters which can be supplied when the `eval` command is supplied to this function.* |
| 3 | +Evaluation function for validating Finite State Automata (FSA) against language specifications. |
6 | 4 |
|
7 | | -## Outputs |
8 | | -*Output schema/values for this function* |
| 5 | +## Overview |
| 6 | + |
| 7 | +This evaluation function: |
| 8 | +1. Validates FSA structure (states, transitions, initial/accept states) |
| 9 | +2. Checks language correctness against answer specification |
| 10 | +3. Provides detailed feedback with **UI element highlighting** for errors |
| 11 | +4. Supports multiple answer formats (test cases, reference FSA, regex, grammar) |
| 12 | + |
| 13 | +## Schemas |
| 14 | + |
| 15 | +The evaluation function uses four main Pydantic schemas in `evaluation_function/schemas/`: |
| 16 | + |
| 17 | +### 1. FSA Schema (`fsa.py`) |
| 18 | + |
| 19 | +The student's FSA submission, represented as the 5-tuple (Q, Σ, δ, q0, F): |
| 20 | + |
| 21 | +```json |
| 22 | +{ |
| 23 | + "states": ["q0", "q1", "q2"], |
| 24 | + "alphabet": ["a", "b"], |
| 25 | + "transitions": [ |
| 26 | + {"from_state": "q0", "to_state": "q1", "symbol": "a"}, |
| 27 | + {"from_state": "q1", "to_state": "q2", "symbol": "b"} |
| 28 | + ], |
| 29 | + "initial_state": "q0", |
| 30 | + "accept_states": ["q2"] |
| 31 | +} |
| 32 | +``` |
| 33 | + |
| 34 | +| Field | Type | Required | Description | |
| 35 | +|-------|------|----------|-------------| |
| 36 | +| `states` | `string[]` | Yes | Q: Set of state identifiers | |
| 37 | +| `alphabet` | `string[]` | Yes | Σ: Input alphabet symbols | |
| 38 | +| `transitions` | `Transition[]` | Yes | δ: Transition function | |
| 39 | +| `initial_state` | `string` | Yes | q0: Starting state | |
| 40 | +| `accept_states` | `string[]` | Yes | F: Accepting/final states | |
| 41 | + |
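The structural checks implied by this schema can be sketched on a plain dict. This is a minimal illustration, not the project's Pydantic implementation: `validate_structure` is a hypothetical helper name, and the error codes are the ones listed under "Available Error Codes".

```python
from __future__ import annotations


def validate_structure(fsa: dict) -> list[dict]:
    """Collect structural errors for an FSA dict; an empty list means valid.

    Hypothetical helper: the real validator and its signature are not shown
    in this README.
    """
    errors: list[dict] = []
    states = set(fsa.get("states", []))
    alphabet = set(fsa.get("alphabet", []))

    def add(code: str, message: str) -> None:
        errors.append({"code": code, "message": message, "severity": "error"})

    if not states:
        add("EMPTY_STATES", "No states defined")
    if not alphabet:
        add("EMPTY_ALPHABET", "No alphabet symbols")

    initial = fsa.get("initial_state")
    if initial not in states:
        add("INVALID_INITIAL", f"Initial state {initial!r} is not in states")

    for state in fsa.get("accept_states", []):
        if state not in states:
            add("INVALID_ACCEPT", f"Accept state {state!r} is not in states")

    # Every transition must reference known states and a known symbol.
    for t in fsa.get("transitions", []):
        if t["from_state"] not in states:
            add("INVALID_TRANSITION_SOURCE", f"Unknown source {t['from_state']!r}")
        if t["to_state"] not in states:
            add("INVALID_TRANSITION_DEST", f"Unknown destination {t['to_state']!r}")
        if t["symbol"] not in alphabet:
            add("INVALID_TRANSITION_SYMBOL", f"Unknown symbol {t['symbol']!r}")
    return errors
```

Collecting every error, rather than stopping at the first, lets the frontend highlight all problems in a single pass.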
| 42 | +### 2. Answer Schema (`answer.py`) |
| 43 | + |
| 44 | +How the correct answer is specified. Supports four types: |
| 45 | + |
| 46 | +#### Regex |
| 47 | +```json |
| 48 | +{ |
| 49 | + "type": "regex", |
| 50 | + "value": "(a|b)*ab" |
| 51 | +} |
| 52 | +``` |
| 53 | + |
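One way to check a submission against a regex answer is to enumerate every string up to a bounded length and compare the FSA's verdict with the regex's. The sketch below assumes the answer regex is compatible with Python's `re` syntax, that symbols are single characters, and that there are no epsilon transitions. `fsa_accepts` and `find_regex_counterexample` are illustrative names, not the project's API.

```python
import re
from itertools import product


def fsa_accepts(fsa: dict, word: str) -> bool:
    """Simulate the FSA on `word`, tracking a set of states (works for DFA and NFA)."""
    current = {fsa["initial_state"]}
    for ch in word:
        current = {
            t["to_state"]
            for t in fsa["transitions"]
            if t["from_state"] in current and t["symbol"] == ch
        }
    return bool(current & set(fsa["accept_states"]))


def find_regex_counterexample(fsa: dict, pattern: str, max_len: int):
    """Return the first string (shortest first) on which FSA and regex disagree."""
    compiled = re.compile(pattern)
    for length in range(max_len + 1):
        for symbols in product(fsa["alphabet"], repeat=length):
            word = "".join(symbols)
            if fsa_accepts(fsa, word) != bool(compiled.fullmatch(word)):
                return word
    return None  # no disagreement found up to max_len
```

The search is exponential in `max_len`, which is presumably why a bound such as `max_test_length` exists. A `None` result only means that no mismatch was found within the bound.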
| 54 | +#### Test Cases |
| 55 | +```json |
| 56 | +{ |
| 57 | + "type": "test_cases", |
| 58 | + "value": [ |
| 59 | + {"input": "ab", "expected": true}, |
| 60 | + {"input": "ba", "expected": false} |
| 61 | + ] |
| 62 | +} |
| 63 | +``` |
| 64 | + |
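Checking test cases amounts to running each input through the automaton and comparing the outcome with `expected`. A minimal sketch follows, assuming single-character symbols and no epsilon transitions. `accepts` and `failed_test_cases` are hypothetical helpers.

```python
def accepts(fsa: dict, word: str) -> bool:
    """Return True if the FSA accepts `word` (set-based, so NFAs work too)."""
    current = {fsa["initial_state"]}
    for ch in word:
        # Follow every transition that matches the current symbol.
        current = {
            t["to_state"]
            for t in fsa["transitions"]
            if t["from_state"] in current and t["symbol"] == ch
        }
        if not current:
            return False  # no live states left, so the word is rejected
    return any(state in fsa["accept_states"] for state in current)


def failed_test_cases(fsa: dict, cases: list) -> list:
    """Return the test cases whose expected outcome the FSA does not match."""
    return [case for case in cases if accepts(fsa, case["input"]) != case["expected"]]
```

An empty result means every test case passed. Each returned case could be reported with the `TEST_CASE_FAILED` code.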
| 65 | +#### Reference FSA |
| 66 | +```json |
| 67 | +{ |
| 68 | + "type": "reference_fsa", |
| 69 | + "value": { /* FSA object */ } |
| 70 | +} |
| 71 | +``` |
| 72 | + |
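Comparing a submission with a reference FSA can be done exactly with a breadth-first search over pairs of state sets, which also yields a shortest distinguishing string. This is a sketch of that standard product construction and is not necessarily how this function implements it. It assumes no epsilon transitions. The `should_accept` label matches the result schema, while `should_reject` is an assumed counterpart.

```python
from collections import deque


def step(fsa: dict, current: frozenset, symbol: str) -> frozenset:
    """Advance a set of states by one symbol; the empty set acts as a dead state."""
    return frozenset(
        t["to_state"]
        for t in fsa["transitions"]
        if t["from_state"] in current and t["symbol"] == symbol
    )


def is_accepting(fsa: dict, current: frozenset) -> bool:
    return any(state in fsa["accept_states"] for state in current)


def find_counterexample(student: dict, reference: dict):
    """Return (word, type) for a shortest distinguishing string, or None if equivalent."""
    alphabet = sorted(set(student["alphabet"]) | set(reference["alphabet"]))
    start = (
        frozenset([student["initial_state"]]),
        frozenset([reference["initial_state"]]),
    )
    queue = deque([(start, "")])
    seen = {start}
    while queue:
        (s_set, r_set), word = queue.popleft()
        s_acc, r_acc = is_accepting(student, s_set), is_accepting(reference, r_set)
        if s_acc != r_acc:
            return word, ("should_accept" if r_acc else "should_reject")
        for symbol in alphabet:
            nxt = (step(student, s_set, symbol), step(reference, r_set, symbol))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + symbol))
    return None  # every reachable pair agrees: the languages are equal
```

Because the search visits each pair of state sets at most once, it terminates without a length bound, unlike string enumeration.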
| 73 | +#### Grammar |
| 74 | +```json |
| 75 | +{ |
| 76 | + "type": "grammar", |
| 77 | + "value": { |
| 78 | + "start": "S", |
| 79 | + "productions": {"S": ["aS", "bS", "ab"]} |
| 80 | + } |
| 81 | +} |
| 82 | +``` |
| 83 | + |
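For a grammar answer like the one above, membership of a single string can be checked by recursive descent. The sketch assumes the grammar is right-linear (each production is a run of terminals optionally followed by one nonterminal), that nonterminals are single characters appearing as keys of `productions`, and that there are no unit productions. `derives` is a hypothetical helper, not part of the documented API.

```python
def derives(grammar: dict, word: str, symbol=None) -> bool:
    """Return True if `word` can be derived from `symbol` (default: the start symbol)."""
    productions = grammar["productions"]
    symbol = grammar["start"] if symbol is None else symbol
    for rhs in productions.get(symbol, []):
        if rhs and rhs[-1] in productions:
            # Form "terminals + nonterminal": consume the prefix, then recurse.
            prefix, next_symbol = rhs[:-1], rhs[-1]
            if not prefix:
                raise ValueError("unit productions are not handled in this sketch")
            if word.startswith(prefix) and derives(grammar, word[len(prefix):], next_symbol):
                return True
        elif word == rhs:
            # Terminal-only production: it must match the remaining input exactly.
            return True
    return False
```

Each recursive call consumes at least one character, so the recursion depth is bounded by the length of `word`.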
| 84 | +### 3. Params Schema (`params.py`) |
| 85 | + |
| 86 | +Evaluation configuration: |
| 87 | + |
| 88 | +```json |
| 89 | +{ |
| 90 | + "evaluation_mode": "lenient", |
| 91 | + "expected_type": "DFA", |
| 92 | + "feedback_verbosity": "standard", |
| 93 | + "highlight_errors": true, |
| 94 | + "show_counterexample": true |
| 95 | +} |
| 96 | +``` |
| 97 | + |
| 98 | +| Field | Type | Default | Description | |
| 99 | +|-------|------|---------|-------------| |
| 100 | +| `evaluation_mode` | `strict\|lenient\|partial` | `lenient` | Evaluation strictness | |
| 101 | +| `expected_type` | `DFA\|NFA\|any` | `any` | Required automaton type | |
| 102 | +| `feedback_verbosity` | `minimal\|standard\|detailed` | `standard` | Feedback detail level | |
| 103 | +| `check_minimality` | `boolean` | `false` | Check if FSA is minimal | |
| 104 | +| `check_completeness` | `boolean` | `false` | Check if DFA is complete | |
| 105 | +| `highlight_errors` | `boolean` | `true` | Include element IDs for UI | |
| 106 | +| `show_counterexample` | `boolean` | `true` | Show distinguishing string | |
| 107 | +| `max_test_length` | `integer` | `10` | Max generated test length | |
| 108 | + |
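The defaults in the table can be mirrored with a standard-library dataclass. The real schema is a Pydantic model in `params.py`. This sketch only illustrates how a partial `params` dict is filled with defaults and validated, and rejecting unknown keys is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Params:
    """Stand-in for the Params schema, using the defaults from the table."""
    evaluation_mode: str = "lenient"
    expected_type: str = "any"
    feedback_verbosity: str = "standard"
    check_minimality: bool = False
    check_completeness: bool = False
    highlight_errors: bool = True
    show_counterexample: bool = True
    max_test_length: int = 10


ALLOWED = {
    "evaluation_mode": {"strict", "lenient", "partial"},
    "expected_type": {"DFA", "NFA", "any"},
    "feedback_verbosity": {"minimal", "standard", "detailed"},
}


def parse_params(raw: dict) -> Params:
    """Build Params from a partial dict, applying defaults and checking enum values."""
    unknown = set(raw) - {f.name for f in fields(Params)}
    if unknown:
        raise ValueError(f"Unknown params: {sorted(unknown)}")
    params = Params(**raw)
    for name, allowed in ALLOWED.items():
        value = getattr(params, name)
        if value not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")
    return params
```

For example, `parse_params({"feedback_verbosity": "detailed"})` keeps `evaluation_mode` at `lenient` and `max_test_length` at `10`.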
| 109 | +### 4. Result Schema (`result.py`) |
| 110 | + |
| 111 | +Evaluation result with structured feedback: |
| 112 | + |
| 113 | +```json |
| 114 | +{ |
| 115 | + "is_correct": false, |
| 116 | + "feedback": "Your FSA rejects 'ab' but it should accept.", |
| 117 | + "score": 0.8, |
| 118 | + "fsa_feedback": { |
| 119 | + "summary": "Language mismatch", |
| 120 | + "errors": [ |
| 121 | + { |
| 122 | + "message": "Missing transition", |
| 123 | + "code": "MISSING_TRANSITION", |
| 124 | + "severity": "error", |
| 125 | + "element_type": "transition", |
| 126 | + "from_state": "q0", |
| 127 | + "symbol": "b", |
| 128 | + "suggestion": "Add transition from q0 on 'b'" |
| 129 | + } |
| 130 | + ], |
| 131 | + "language": { |
| 132 | + "are_equivalent": false, |
| 133 | + "counterexample": "ab", |
| 134 | + "counterexample_type": "should_accept" |
| 135 | + }, |
| 136 | + "structural": { |
| 137 | + "is_deterministic": true, |
| 138 | + "num_states": 3, |
| 139 | + "unreachable_states": [] |
| 140 | + } |
| 141 | + } |
| 142 | +} |
| 143 | +``` |
| 144 | + |
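The `structural` block of the result can be derived from the FSA alone. The sketch below shows one way to compute the three fields shown above. `structural_info` is a hypothetical name, and determinism is taken to mean "no `(from_state, symbol)` pair leads to more than one destination", which is an assumption about the intended definition.

```python
def structural_info(fsa: dict) -> dict:
    """Compute determinism, state count, and unreachable states for an FSA dict."""
    # Group destinations by (source, symbol) to detect non-deterministic choices.
    destinations = {}
    for t in fsa["transitions"]:
        destinations.setdefault((t["from_state"], t["symbol"]), set()).add(t["to_state"])
    is_deterministic = all(len(targets) == 1 for targets in destinations.values())

    # Depth-first search from the initial state to find reachable states.
    reachable = {fsa["initial_state"]}
    frontier = [fsa["initial_state"]]
    while frontier:
        state = frontier.pop()
        for t in fsa["transitions"]:
            if t["from_state"] == state and t["to_state"] not in reachable:
                reachable.add(t["to_state"])
                frontier.append(t["to_state"])

    return {
        "is_deterministic": is_deterministic,
        "num_states": len(fsa["states"]),
        "unreachable_states": [s for s in fsa["states"] if s not in reachable],
    }
```

Each entry in `unreachable_states` maps naturally to an `UNREACHABLE_STATE` warning.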
| 145 | +## UI Highlighting |
| 146 | + |
| 147 | +Errors include a `highlight` field that tells the frontend exactly which FSA element to highlight: |
| 148 | + |
| 149 | +### Example: Highlight Invalid Transition |
| 150 | +```json |
| 151 | +{ |
| 152 | + "message": "Transition points to non-existent state 'q5'", |
| 153 | + "code": "INVALID_TRANSITION_DEST", |
| 154 | + "severity": "error", |
| 155 | + "highlight": { |
| 156 | + "type": "transition", |
| 157 | + "from_state": "q0", |
| 158 | + "to_state": "q5", |
| 159 | + "symbol": "a" |
| 160 | + }, |
| 161 | + "suggestion": "Change destination to an existing state or add state 'q5'" |
| 162 | +} |
| 163 | +``` |
| 164 | + |
| 165 | +**Frontend should:** Highlight the transition arrow from q0 to q5 on symbol 'a' in red. |
| 166 | + |
| 167 | +### Example: Highlight Unreachable State |
| 168 | +```json |
| 169 | +{ |
| 170 | + "message": "State 'q1' is unreachable from the initial state", |
| 171 | + "code": "UNREACHABLE_STATE", |
| 172 | + "severity": "warning", |
| 173 | + "highlight": { |
| 174 | + "type": "state", |
| 175 | + "state_id": "q1" |
| 176 | + }, |
| 177 | + "suggestion": "Add a path from the initial state to 'q1' or remove this state" |
| 178 | +} |
| 179 | +``` |
| 180 | + |
| 181 | +**Frontend should:** Highlight state circle q1 in orange/warning color. |
| 182 | + |
| 183 | +### Highlight Types |
| 184 | + |
| 185 | +| Type | Fields | Use Case | |
| 186 | +|------|--------|----------| |
| 187 | +| `state` | `state_id` | Highlight a specific state | |
| 188 | +| `transition` | `from_state`, `to_state`, `symbol` | Highlight a specific transition arrow | |
| 189 | +| `initial_state` | `state_id` | Highlight the initial state marker | |
| 190 | +| `accept_state` | `state_id` | Highlight the accept state indicator | |
| 191 | +| `alphabet_symbol` | `symbol` | Highlight a symbol in the alphabet | |
| 192 | + |
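The table above can be enforced when building `highlight` objects, so that each type always carries exactly its documented fields. A minimal sketch follows. `make_highlight` is an illustrative helper and not part of the documented schemas.

```python
# Required fields per highlight type, copied from the table above.
HIGHLIGHT_FIELDS = {
    "state": ("state_id",),
    "transition": ("from_state", "to_state", "symbol"),
    "initial_state": ("state_id",),
    "accept_state": ("state_id",),
    "alphabet_symbol": ("symbol",),
}


def make_highlight(highlight_type: str, **values: str) -> dict:
    """Build a highlight dict, rejecting unknown types and missing or extra fields."""
    expected = HIGHLIGHT_FIELDS.get(highlight_type)
    if expected is None:
        raise ValueError(f"Unknown highlight type: {highlight_type!r}")
    if set(values) != set(expected):
        raise ValueError(f"{highlight_type!r} requires exactly the fields {expected}")
    return {"type": highlight_type, **values}
```

For instance, `make_highlight("state", state_id="q1")` produces the `highlight` object from the unreachable-state example.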
| 193 | +## Error Codes (Enum) |
| 194 | + |
| 195 | +Error codes are defined as a type-safe enum (`ErrorCode`) in `evaluation_function/schemas/result.py`. |
| 196 | + |
| 197 | +```python |
| 198 | +from evaluation_function.schemas import ErrorCode, ValidationError |
| 199 | + |
| 200 | +# Type-safe error creation |
| 201 | +error = ValidationError( |
| 202 | + message="State 'q5' does not exist", |
| 203 | + code=ErrorCode.INVALID_STATE, # Enum value |
| 204 | + severity="error" |
| 205 | +) |
| 206 | +``` |
| 207 | + |
| 208 | +### Available Error Codes |
| 209 | + |
| 210 | +| Code | Highlight Type | Description | |
| 211 | +|------|----------------|-------------| |
| 212 | +| `INVALID_STATE` | state | State ID not in states list | |
| 213 | +| `INVALID_INITIAL` | initial_state | Initial state not in states list | |
| 214 | +| `INVALID_ACCEPT` | accept_state | Accept state not in states list | |
| 215 | +| `INVALID_SYMBOL` | alphabet_symbol | Symbol not in alphabet | |
| 216 | +| `INVALID_TRANSITION_SOURCE` | transition | Source state doesn't exist | |
| 217 | +| `INVALID_TRANSITION_DEST` | transition | Destination state doesn't exist | |
| 218 | +| `INVALID_TRANSITION_SYMBOL` | transition | Symbol not in alphabet | |
| 219 | +| `MISSING_TRANSITION` | state | DFA missing transition from this state | |
| 220 | +| `DUPLICATE_TRANSITION` | transition | Non-deterministic transition | |
| 221 | +| `UNREACHABLE_STATE` | state | State not reachable from initial | |
| 222 | +| `DEAD_STATE` | state | State cannot reach accept state | |
| 223 | +| `WRONG_AUTOMATON_TYPE` | general | Wrong automaton type (NFA when DFA expected) | |
| 224 | +| `NOT_DETERMINISTIC` | general | Has non-deterministic transitions | |
| 225 | +| `NOT_COMPLETE` | general | DFA missing some transitions | |
| 226 | +| `LANGUAGE_MISMATCH` | general | Accepts wrong language | |
| 227 | +| `TEST_CASE_FAILED` | general | Failed specific test case | |
| 228 | +| `EMPTY_STATES` | general | No states defined | |
| 229 | +| `EMPTY_ALPHABET` | general | No alphabet symbols | |
| 230 | +| `EVALUATION_ERROR` | general | Internal evaluation error | |
9 | 231 |
|
10 | 232 | ## Examples |
11 | | -*List of example inputs and outputs for this function, each under a different sub-heading* |
12 | 233 |
|
13 | 234 | ### Simple Evaluation |
14 | 235 |
|
| 236 | +**Input:** |
15 | 237 | ```python |
16 | | -{ |
17 | | - "example": { |
18 | | - "Something": "something" |
19 | | - } |
| 238 | +response = { |
| 239 | + "states": ["q0", "q1"], |
| 240 | + "alphabet": ["a"], |
| 241 | + "transitions": [{"from_state": "q0", "to_state": "q1", "symbol": "a"}], |
| 242 | + "initial_state": "q0", |
| 243 | + "accept_states": ["q1"] |
20 | 244 | } |
| 245 | + |
| 246 | +answer = { |
| 247 | + "type": "test_cases", |
| 248 | + "value": [ |
| 249 | + {"input": "a", "expected": True}, |
| 250 | + {"input": "", "expected": False} |
| 251 | + ] |
| 252 | +} |
| 253 | + |
| 254 | +params = {"feedback_verbosity": "detailed"} |
21 | 255 | ``` |
22 | 256 |
|
| 257 | +**Output:** |
23 | 258 | ```python |
24 | 259 | { |
25 | | - "example": { |
26 | | - "Something": "something" |
27 | | - } |
| 260 | + "is_correct": True, |
| 261 | + "feedback": "Correct! Your FSA accepts the expected language." |
28 | 262 | } |
29 | | -``` |
| 263 | +``` |