Commit 415e062

Improve README with clearer explanations and examples
Updated README to enhance clarity and organization of content related to dictionary-based feature grouping.

Signed-off-by: Fabiana ⚡️ Campanari <113218619+FabianaCampanari@users.noreply.github.com>
1 parent 132d2a4 commit 415e062

1 file changed

Lines changed: 19 additions & 2 deletions

README.md

@@ -109,17 +109,21 @@ https://github.com/user-attachments/assets/4ccd316b-74a1-4bae-9bc7-1c705be80498
 
 ## Overview
 
+<br>
+
 This repository demonstrates **dictionary-based feature grouping** for tabular data preprocessing, specifically designed for integration with **Large Language Models (LLMs)** and AI/ML pipelines.
 
 The technique allows you to organize related columns (features) in a dataset using **dictionaries**, enabling:
 
+<br>
+
 - Semantic grouping of features
 - Efficient preprocessing for LLM-based feature engineering
 - Better interpretability of tabular data
 - Streamlined data transformation pipelines
 
 
-<br>
+<br><br>
 
 > [!TIP]
 >
@@ -129,7 +133,7 @@ The technique allows you to organize related columns (features) in a dataset usi
 
 
 
-<br><br>
+<br><br><br>
 
 
 ## What is Dictionary-Based Feature Grouping?
@@ -139,12 +143,22 @@ The technique allows you to organize related columns (features) in a dataset usi
 ### 💡 Simple Explanation (For Beginners)
 
 Imagine you have a dataset about customers with many columns:
+
+
+<br>
+
+
 ```
 age, income, city, state, country, purchase_date, product_name, price, ...
 ```
 
+<br>
+
+
 Instead of processing all columns individually, you can **group them** by meaning:
 
+<br>
+
 ```python
 feature_groups = {
     'demographics': ['age', 'income'],
@@ -155,8 +169,11 @@ feature_groups = {
 
 <br>
 
+
 This makes it easier to:
 
+<br>
+
 1. Apply specific transformations to each group
 2. Feed organized data to LLMs
 3. Understand your dataset structure
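
For readers of this diff, the grouping idea described in the README can be sketched in plain Python as follows. This is a minimal illustration, not the repository's implementation. The diff shows only the `demographics` entry of `feature_groups`, so the `location` group, the sample rows, and the helper names `center_numeric` and `describe` are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical sample rows; the column names come from the README's example list.
rows = [
    {"age": 34, "income": 52000, "city": "Austin", "state": "TX", "country": "US"},
    {"age": 51, "income": 88000, "city": "Lyon", "state": "ARA", "country": "FR"},
]

# Only 'demographics' is visible in the diff; 'location' is an assumed group.
feature_groups = {
    "demographics": ["age", "income"],
    "location": ["city", "state", "country"],
}


def center_numeric(rows, group):
    """Subtract the per-column mean from every column of one (numeric) group."""
    cols = feature_groups[group]
    means = {col: mean(r[col] for r in rows) for col in cols}
    return [{col: r[col] - means[col] for col in cols} for r in rows]


def describe(row):
    """Render one row as grouped text, e.g. for inclusion in an LLM prompt."""
    parts = []
    for group, cols in feature_groups.items():
        fields = ", ".join(f"{col}={row[col]}" for col in cols)
        parts.append(f"{group}: {fields}")
    return "; ".join(parts)


# Apply a transformation to one group only, leaving the other columns untouched.
centered = center_numeric(rows, "demographics")

# Build a grouped text description of the first row.
summary = describe(rows[0])
```

Keeping the groups in a single dictionary means the transformation step and the text-rendering step both read the same definition, so adding or renaming a group needs a change in only one place.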
