Commit 7466934

Merge pull request #30 from IloBe/29-change-ethical-considerations-section-of-model-card
Fix: add bias insights to ethical considerations part
2 parents 478d689 + 80ee49a commit 7466934

1 file changed

Lines changed: 2 additions & 2 deletions

model_card.md

@@ -97,10 +97,10 @@ Validation results of the GridSearchCV best estimator are:<br>
 ![best xgb cls eval][image6]
 
 ## Ethical Considerations
-No ethical consideration topics regarding data, human life, risk and harms and their needed risk mitigation strategies or fraught model use cases are detected.
+As ethical consideration topics regarding data, one insight is that the raw data have a bias towards men (twice as many men as women) and regarding race a bias towards white people mainly originally from the U.S., so, scaling activities are mandatory getting appropriate prediction results.<br>
+No other human life, risk and harms and their needed risk mitigation strategies or fraught model use cases are detected.
 
 ## Caveats and Recommendations
-- Regarding the data, have in mind that the raw data have among others a bias towards men (twice as many men as women) and white people mainly originally from the U.S., so, scaling activities are mandatory getting appropriate prediction results.
 - Regarding the prediction task, the performance of the grid search cross validation approach is already improved compared to the one of the single instance, but still not the best. As future toDo, final tuning of the XGBoost Classifier via <i>Hyperopt</i> library is recommended.
 - Additional, feature importance information of the final resulting XGBoost Classifier model is critical to understand the prediction process. As additional future toDo: usage of <i>SHAP</i> diagrams or simple <i>xgb feature_importances_ parameter</i> bar chart of the X_train columns from the GridSearchCV best model result for identifying which features are most relevant for the target variable.
 - Last topic as future toDo is the usage of other classifier types and their evaluation compared to the XGBoost Classifier, even though it was often used by teams that won Kaggle competitions.
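
The change above states that the raw data contain about twice as many men as women and that "scaling activities" are needed, but it does not say what those activities are. One possible reading, shown here only as an assumption and not as the author's method, is inverse-frequency reweighting so that each group contributes the same total weight during training. The function name `balanced_sample_weights` and the toy `sex` column are hypothetical and exist only for illustration.

```python
from collections import Counter


def balanced_sample_weights(groups):
    """Return one weight per row so that every group carries equal total weight.

    Hypothetical helper: uses the common "balanced" heuristic
    n_rows / (n_groups * group_count).
    """
    counts = Counter(groups)
    n_rows = len(groups)
    n_groups = len(counts)
    return [n_rows / (n_groups * counts[g]) for g in groups]


# Toy column mirroring the 2:1 ratio described in the model card (made-up data).
sex = ["Male", "Male", "Female", "Male", "Male", "Female"]
weights = balanced_sample_weights(sex)

# Sum the weights per group to check that both groups now weigh the same.
totals = {}
for group, weight in zip(sex, weights):
    totals[group] = totals.get(group, 0.0) + weight
print(totals)
```

Weights of this kind could be passed through the `sample_weight` argument of `XGBClassifier.fit`. Whether that matches the intended "scaling activities" is not stated in the commit.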
