splitting train/test/val before feature transform#7

Open
stephenleo wants to merge 1 commit into aws-samples:master from stephenleo:master

Conversation

@stephenleo

Issue #, if available: NA

Description of changes: To prevent data leakage, it's good practice to split the train, val and test sets before applying any data transformations.

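fit_transform() is then called only on X_train, so the transformer's parameters are learned from the training data alone.
X_val and X_test only get transform(), which reuses those fitted parameters.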

E.g.: https://machinelearningmastery.com/data-preparation-without-data-leakage/
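
The split-then-transform order described above can be sketched as follows. This is a minimal illustration, not the code from this PR: it assumes scikit-learn's `StandardScaler` as the transformer, synthetic data in place of the notebook's features, and a 60/20/20 split, none of which are stated in the description.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic features and labels (assumption: stand-ins for the real dataset).
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(100, 3))
y = rng.integers(0, 2, size=100)

# 1. Split first: 60% train, then the remaining 40% halved into val and test.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    X, y, test_size=0.4, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.5, random_state=0
)

# 2. Fit the transformer on the training set only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# 3. Apply the already-fitted transformer to val and test (no refitting).
X_val_scaled = scaler.transform(X_val)
X_test_scaled = scaler.transform(X_test)
```

Because the scaler's mean and variance come from `X_train` alone, the validation and test sets contribute nothing to the fitted parameters, which is the leakage this change is meant to avoid.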

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@stephenleo
Author

@yevgeniyilyin what do you think?
