generate_synthetic_table/prompts/academic.yaml
+62 -20 (62 additions & 20 deletions)
@@ -94,37 +94,79 @@ generate_qa_from_image: |
 
 generate_synthetic_table: |
 You are a Synthetic Data Generator specializing in Academic Data.
-Your task is to generate a new HTML table that mirrors the structure of the provided original table but contains entirely new, realistic synthetic academic data.
+
+**⚠️ CRITICAL INSTRUCTION: DO NOT COPY ORIGINAL DATA ⚠️**
+Your task is to generate a new HTML table with the SAME STRUCTURE as the original but COMPLETELY DIFFERENT academic data values.
 
 **Inputs:**
-1. **Original Table Structure:**
+1. **Original Table Structure (for structure reference ONLY - DO NOT copy the data values):**
 {html}
 
-2. **Table Summary:**
+2. **Table Summary (describes the data patterns to follow):**
 {summary}
 
 **Requirements:**
-1. **Structure:** Keep the exact same HTML structure.
-2. **Data:** Replace ALL cell values with new, synthetic academic data.
-- Use realistic Korean student names, university names, course titles, and grades.
-- Contexts: Transcripts, Research Papers, Enrollment Stats, Faculty Lists.
-- Do NOT use real private data.
-3. **Consistency:** Ensure mathematical consistency (e.g., sum of credits, correct GPA calculations if visible).
-4. **Output:** Return ONLY the raw HTML string starting with `<table>` and ending with `</table>`.
+1. **Structure:** Keep the exact same HTML structure (rows, columns, headers, merges, rowspan, colspan).
+2. **Headers:** Keep header text the same (column names, category labels).
+3. **⚠️ Data Transformation - ABSOLUTELY MANDATORY ⚠️:**
+- **ALL data cell values MUST be replaced with completely new synthetic values.**
+- **NEVER copy any original data values** - generate fresh, realistic alternatives.
+- For student/model names: Generate DIFFERENT names
+- For university names: Generate DIFFERENT names
+- For grades/scores: Generate DIFFERENT realistic values
+- For course/research topics: Generate DIFFERENT titles
+- For dates: Generate DIFFERENT plausible dates
+4. **Styling:** Use **Tailwind CSS** classes (NO inline styles). **Observe and mimic the original image's visual style:**
+- Look at the original image's color scheme and design
+- Use appropriate Tailwind color classes to match the original style
+6. **Output:** Return ONLY the raw HTML string starting with `<table>` and ending with `</table>`. No markdown code blocks.
+
+**Example Transformation (Generic):**
+- Original name: "학생A" → Synthetic: "학생B"
+- Original score: "4.0" → Synthetic: "3.5"
+- Original model: "모델X" → Synthetic: "모델Y"
+
+⚠️ If the generated content is identical or very similar to the original, the output is INVALID.
 
 generate_synthetic_table_from_image: |
 You are a Synthetic Data Generator specializing in Academic Data.
-Your task is to generate a new HTML table that mirrors the structure of the provided image but contains entirely new, realistic synthetic academic data.
+
+**⚠️ CRITICAL INSTRUCTION: DO NOT TRANSCRIBE - GENERATE NEW DATA ⚠️**
+Your task is NOT to OCR/transcribe the image. Instead, you must:
+1. Understand the table's STRUCTURE from the image
+2. Understand it's an ACADEMIC table
+3. Generate COMPLETELY NEW synthetic academic data that fits the domain but uses ENTIRELY DIFFERENT values
 
 **Inputs:**
-1. **Image:** An image of an academic table.
+1. **Image:** An image of an academic table. Use this to understand structure and domain ONLY.
 
 **Requirements:**
-1. **Structure Preservation:** Accurately reconstruct the table structure.
-2. **Data Generation:** Replace ALL cell values with new, synthetic academic data.
-- Use realistic Korean student names, course titles, grades, research topics.
-3. **Styling:** Use **Tailwind CSS** classes (same as default).
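
The updated `generate_synthetic_table` prompt declares output that is "identical or very similar to the original" INVALID, and asks for the same rows, columns, headers, `rowspan`, and `colspan`. The diff does not show any programmatic check for this, so the following is only a minimal sketch of how such a check might look using Python's standard `html.parser`. The names `CellCollector`, `collect_cells`, `check_synthetic`, and the `max_overlap` threshold are hypothetical and not part of this repository; the sketch also assumes tables are not nested.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects (row, tag, rowspan, colspan, text) for every <th>/<td> cell."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._row = 0          # index of the current <tr>
        self._current = None   # [tag, rowspan, colspan, text parts] while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row += 1
        elif tag in ("th", "td"):
            a = dict(attrs)
            self._current = [tag, a.get("rowspan", "1"), a.get("colspan", "1"), []]

    def handle_data(self, data):
        if self._current is not None:
            self._current[3].append(data)

    def handle_endtag(self, tag):
        if self._current is not None and tag == self._current[0]:
            cell_tag, rowspan, colspan, parts = self._current
            self.cells.append((self._row, cell_tag, rowspan, colspan, "".join(parts).strip()))
            self._current = None


def collect_cells(html):
    parser = CellCollector()
    parser.feed(html)
    parser.close()
    return parser.cells


def check_synthetic(original_html, synthetic_html, max_overlap=0.2):
    """True if structure and headers match and few data cells were copied.

    max_overlap is an assumed tolerance for cells that may legitimately
    repeat (units, category labels placed in <td>); tune it per dataset.
    """
    orig = collect_cells(original_html)
    synth = collect_cells(synthetic_html)

    # Same cell layout: row index, th/td, rowspan, colspan, in order.
    same_structure = [c[:4] for c in orig] == [c[:4] for c in synth]
    # Header text must be kept unchanged.
    same_headers = [c[4] for c in orig if c[1] == "th"] == [c[4] for c in synth if c[1] == "th"]

    # Fraction of non-empty data cells copied verbatim at the same position.
    orig_data = [c[4] for c in orig if c[1] == "td"]
    synth_data = [c[4] for c in synth if c[1] == "td"]
    copied = sum(1 for o, s in zip(orig_data, synth_data) if o and o == s)
    overlap = copied / len(orig_data) if orig_data else 0.0

    return same_structure and same_headers and overlap <= max_overlap
```

A check like this only applies to the HTML-input prompt; the `generate_synthetic_table_from_image` prompt has no original HTML to compare against. Exact string equality also misses "very similar" values, so a stricter variant would need a fuzzy comparison.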
generate_synthetic_table/prompts/business.yaml
+73 -20 (73 additions & 20 deletions)
@@ -94,37 +94,90 @@ generate_qa_from_image: |
 
 generate_synthetic_table: |
 You are a Synthetic Data Generator specializing in Business Data.
-Your task is to generate a new HTML table that mirrors the structure of the provided original table but contains entirely new, realistic synthetic business data.
+
+**⚠️ CRITICAL INSTRUCTION: DO NOT COPY ORIGINAL DATA ⚠️**
+Your task is to generate a new HTML table with the SAME STRUCTURE as the original but COMPLETELY DIFFERENT business data values.
+The goal is to create realistic synthetic business data that looks like it could come from the same domain, but with entirely different companies, employees, products, and metrics.
 
 **Inputs:**
-1. **Original Table Structure:**
+1. **Original Table Structure (for structure reference ONLY - DO NOT copy the data values):**
 {html}
 
-2. **Table Summary:**
+2. **Table Summary (describes the data patterns to follow):**
 {summary}
 
 **Requirements:**
-1. **Structure:** Keep the exact same HTML structure.
-2. **Data:** Replace ALL cell values with new, synthetic business data.
-- Use realistic Korean company names, department names, product lines, and financial metrics.
+6. **Output:** Return ONLY the raw HTML string starting with `<table>` and ending with `</table>`. No markdown code blocks.
+
+**Example Transformation (Generic):**
+- Original name: "A팀" → Synthetic: "B팀"
+- Original amount: "5억원" → Synthetic: "7.3억원"
+- Original description: "신규 사업 추진" → Synthetic: "해외 시장 진출"
+
+⚠️ If the generated content is identical or very similar to the original, the output is INVALID.
+Remember: The synthetic table should look like a completely different business dataset from the same domain.
 
 generate_synthetic_table_from_image: |
 You are a Synthetic Data Generator specializing in Business Data.
-Your task is to generate a new HTML table that mirrors the structure of the provided image but contains entirely new, realistic synthetic business data.
+
+**⚠️ CRITICAL INSTRUCTION: DO NOT TRANSCRIBE - GENERATE NEW DATA ⚠️**
+Your task is NOT to OCR/transcribe the image. Instead, you must:
+1. Understand the table's STRUCTURE from the image (rows, columns, merged cells, nested structures)
+2. Understand it's a BUSINESS table (기업경쟁력, 시장경쟁력, 매출, 실적 등)
+3. Generate COMPLETELY NEW synthetic business data that fits the domain but uses ENTIRELY DIFFERENT values
 
 **Inputs:**
-1. **Image:** An image of a business table.
+1. **Image:** An image of a business table. Use this to understand structure and domain ONLY.
 
 **Requirements:**
-1. **Structure Preservation:** Accurately reconstruct the table structure.
-2. **Data Generation:** Replace ALL cell values with new, synthetic business data.
-- Use realistic Korean company names, products, sales figures.
-3. **Styling:** Use **Tailwind CSS** classes (same as default).
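
Both prompt files now end the output requirement with "No markdown code blocks", but a model may still wrap its answer in a fence. As a hedged illustration only, a post-processing step could unwrap such a fence and confirm that the result starts with a `<table` tag and ends with `</table>`. The helper name `extract_table_html` is hypothetical and does not appear in the diff; the opening tag is matched loosely because the prompt also asks for Tailwind classes, which would put attributes on `<table>`.

```python
import re

# Three backticks, built programmatically so this snippet can itself live in a fence.
FENCE = "`" * 3

# Matches a response that is entirely one fenced block, with an optional language tag.
_FENCED = re.compile(
    r"^" + FENCE + r"[a-zA-Z]*[ \t]*\n(.*?)\n?" + FENCE + r"$",
    re.DOTALL,
)


def extract_table_html(response):
    """Return the raw table HTML from a model response, or None if it is not a table.

    Hypothetical helper: strips one surrounding markdown fence if present, then
    requires the text to start with a <table ...> tag and end with </table>.
    """
    text = response.strip()
    fenced = _FENCED.match(text)
    if fenced:
        text = fenced.group(1).strip()
    if re.match(r"<table[\s>]", text) and text.endswith("</table>"):
        return text
    return None
```

Returning `None` instead of raising leaves the retry or rejection policy to the caller, which seems the safer default when the surrounding pipeline is not visible in this diff.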