Commit 3d75910

fix(apollo11/readme): correct missing line breaks
1 parent 7eb5da6 commit 3d75910

1 file changed

Lines changed: 30 additions & 30 deletions

test_dataset_apollo11/README.md

@@ -13,34 +13,34 @@ retrieval-augmented generation capabilities.

 ## 📂 Dataset Contents

-- **[README.md][readme]** - This file (overview and instructions)
-- **[source_text.txt][source]** - Apollo 11 excerpted text (~1,400 words, plain text)
-- **[test_prompts.md][prompts]** - 15 test prompts (readable format)
+- **[README.md][readme]** - This file (overview and instructions)
+- **[source_text.txt][source]** - Apollo 11 excerpted text (~1,400 words, plain text)
+- **[test_prompts.md][prompts]** - 15 test prompts (readable format)
 - **[test_data.json][json]** - Complete dataset (structured format for automated
-testing)
+testing)
 - **[RATIONALE.md][rationale]** - Detailed explanation of selection decisions

 📌 **Process documentation:** For background on dataset creation decisions and
 team discussions, see the **[team briefing](https://docs.google.com/document/d/1jAE2Y2BJDx014MAXCxyH0-2EgieL_tCxCEeMK4VWBNQ/edit?usp=sharing)**

-[readme]: /test_dataset_apollo11/README.md
-[source]: /test_dataset_apollo11/source_text.txt
-[prompts]: /test_dataset_apollo11/test_prompts.md
-[json]: /test_dataset_apollo11/test_data.json
+[readme]: /test_dataset_apollo11/README.md
+[source]: /test_dataset_apollo11/source_text.txt
+[prompts]: /test_dataset_apollo11/test_prompts.md
+[json]: /test_dataset_apollo11/test_data.json
 [rationale]: /test_dataset_apollo11/RATIONALE.md

 ---

 ## 📄 Source & License

-**Source:** Wikipedia - Apollo 11 article
-**URL:** <https://en.wikipedia.org/wiki/Apollo_11>
-**Permanent Link:** <https://en.wikipedia.org/w/index.php?title=Apollo_11&oldid=1252473845>
-**Revision ID:** 1252473845 (Wikipedia internal revision number)
-**Date Accessed:** October 22, 2025
+**Source:** Wikipedia - Apollo 11 article
+**URL:** <https://en.wikipedia.org/wiki/Apollo_11>
+**Permanent Link:** <https://en.wikipedia.org/w/index.php?title=Apollo_11&oldid=1252473845>
+**Revision ID:** 1252473845 (Wikipedia internal revision number)
+**Date Accessed:** October 22, 2025
 **Sections:** Excerpted passages from "Lunar landing" and "Lunar surface
-operations"
-**Word Count:** ~1,400 words
+operations"
+**Word Count:** ~1,400 words
 **Language:** English

 **License:** Creative Commons Attribution-ShareAlike 3.0 (CC BY-SA 3.0)
@@ -63,15 +63,15 @@ practical testing while maintaining all information necessary for the 15 test pr

 ## 🎯 Selection Rationale

-**Practical length** - ~1,400 words manageable for all model types including
+- **Practical length** - ~1,400 words manageable for all model types including
 distilled models with standard chunking
-**Rich in specific details** - Ideal for RAG testing (times, names, numbers,
+- **Rich in specific details** - Ideal for RAG testing (times, names, numbers,
 technical terms)
-**Multiple complexity levels** - Both simple recall and complex reasoning can
+- **Multiple complexity levels** - Both simple recall and complex reasoning can
 be tested
-**Narrative structure** - Clear sequence from descent through surface
+- **Narrative structure** - Clear sequence from descent through surface
 activities
-**All prompts answerable** - 15 test prompts verified to work with selected
+- **All prompts answerable** - 15 test prompts verified to work with selected
 passages

 The excerpts cover the dramatic descent and landing sequence, followed by
@@ -90,7 +90,7 @@ reasoning, and RAG tasks.

 Tests model's ability to condense and extract key information

-**Difficulty:** Easy → Medium → Hard
+**Difficulty:** Easy → Medium → Hard
 **Examples:** Main events, challenges faced, activities performed, equipment
 deployed

@@ -99,15 +99,15 @@ deployed
 Tests model's ability to analyze, infer, and make connections

 **Types:** Causal reasoning, hypothetical scenarios, interpretation, deep
-analysis
+analysis
 **Examples:** Why did computer alarms occur? What if Armstrong hadn't taken
 manual control? What does Margaret Hamilton's statement reveal?

 ### RAG - Retrieval (5 prompts)

 Tests model's ability to retrieve specific information from source text

-**Types:** Times, quotes, numbers, lists, complex multi-part facts
+**Types:** Times, quotes, numbers, lists, complex multi-part facts
 **Examples:** Landing time? Material collected? Scientific instruments deployed?

 📌 See [test_prompts.md][prompts] for the readable format, or [test_data.json][json]
@@ -125,11 +125,11 @@ but attempting all prompts provides comprehensive evaluation data.

 **Testing Protocol:**

-**1.** Use the source text from **[source_text.txt][source]** exactly as provided
-**2.** Use all 15 prompts from **[test_prompts.md][prompts]** without modification
+**1.** Use the source text from **[source_text.txt][source]** exactly as provided
+**2.** Use all 15 prompts from **[test_prompts.md][prompts]** without modification
 **3.** *(Optional)* Use **[test_data.json][json]** for automated or scripted
-testing workflows
-**4.** Record responses for each prompt with model configuration details
+testing workflows
+**4.** Record responses for each prompt with model configuration details
 **5.** Note any errors, failures, or unusual behaviors

 ---
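
Reviewer note: the five-step testing protocol in the hunk above could be scripted roughly as follows. This is a minimal sketch, not part of the commit. The function name `run_protocol`, the `generate` callable, and the prompt keys `id` and `text` are all assumptions made for illustration; the actual layout of `test_data.json` is not shown in this diff.

```python
from typing import Any, Callable, Dict, List


def run_protocol(
    source_text: str,
    prompts: List[Dict[str, str]],
    generate: Callable[[str, str], str],
    model_config: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Run every prompt against the unmodified source text and log the outcome.

    `prompts` is assumed to be a list of {"id": ..., "text": ...} dicts
    (hypothetical schema); `generate` stands in for whatever model call
    is being evaluated.
    """
    records = []
    for prompt in prompts:
        record = {
            "prompt_id": prompt["id"],
            "prompt": prompt["text"],            # step 2: prompt used as-is
            "model_config": dict(model_config),  # step 4: configuration details
            "response": None,
            "error": None,
        }
        try:
            # step 1: the source text is passed through without modification
            record["response"] = generate(source_text, prompt["text"])
        except Exception as exc:
            # step 5: keep failures in the log instead of aborting the run
            record["error"] = f"{type(exc).__name__}: {exc}"
        records.append(record)
    return records
```

Catching the exception per prompt keeps one failing prompt from hiding the results of the remaining ones, which matches the protocol's request to note errors rather than stop.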
@@ -138,9 +138,9 @@ but attempting all prompts provides comprehensive evaluation data.

 For each prompt, record:

-**1. Accuracy** - Is the answer factually correct?
-**2. Completeness** - Are all key points covered?
-**3. Specificity** - Are specific details included (times, names, numbers)?
+**1. Accuracy** - Is the answer factually correct?
+**2. Completeness** - Are all key points covered?
+**3. Specificity** - Are specific details included (times, names, numbers)?
 **4. Reasoning Quality** - For reasoning prompts, is the logic sound and
 well-supported?
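
Reviewer note: the four per-prompt criteria listed above could be captured in a small record type like the one below. This is a sketch under stated assumptions, not something the README prescribes: the yes/no scoring, the class name `PromptEvaluation`, the helper `pass_rates`, and the category label `"reasoning"` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PromptEvaluation:
    prompt_id: str
    category: str                  # assumed labels, e.g. "summarization", "reasoning", "rag"
    accurate: bool                 # 1. Accuracy
    complete: bool                 # 2. Completeness
    specific: bool                 # 3. Specificity
    reasoning_sound: Optional[bool] = None  # 4. Reasoning Quality (reasoning prompts only)

    def __post_init__(self) -> None:
        # Criterion 4 is only mandatory for reasoning prompts.
        if self.category == "reasoning" and self.reasoning_sound is None:
            raise ValueError("reasoning prompts need a reasoning_sound judgment")


def pass_rates(evaluations: List[PromptEvaluation]) -> Dict[str, float]:
    """Fraction of prompts passing each criterion; reasoning is averaged
    only over the prompts that carry a reasoning judgment."""
    def rate(flags: List[bool]) -> float:
        return sum(flags) / len(flags) if flags else 0.0

    return {
        "accurate": rate([e.accurate for e in evaluations]),
        "complete": rate([e.complete for e in evaluations]),
        "specific": rate([e.specific for e in evaluations]),
        "reasoning_sound": rate(
            [e.reasoning_sound for e in evaluations if e.reasoning_sound is not None]
        ),
    }
```

A graded scale (for example 0 to 2) would fit the same structure by swapping the `bool` fields for integers; the yes/no form simply mirrors how the criteria are phrased as questions.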