Performance Improvement Suggestions #3

Varunkumar2516 · 2026-03-17T15:56:49Z

Varunkumar2516
Mar 17, 2026
Maintainer

Title: Improving Sentiment Analysis Accuracy (Currently 90%) – Suggestions Needed

Hello

I built a movie sentiment analysis model using TF-IDF + Logistic Regression and reached 90% accuracy on the IMDB dataset.

Current pipeline:

Text cleaning (HTML removal, contractions, punctuation removal)
Stopword removal (keeping negations)
Lemmatization with POS tagging
TF-IDF (max_features=45000, ngram_range=(1,2))
Models tried: Naive Bayes, KNN, Logistic Regression, SVM, Decision Tree

Goal: improve accuracy to roughly 92-95% or higher.
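
For reference, the TF-IDF + Logistic Regression setup described above can be sketched roughly as follows. This is a minimal illustration assuming scikit-learn, not the exact notebook code: the `build_pipeline` helper, the `C` and `max_iter` values, and the four toy reviews are placeholders I made up, and the real input would be the cleaned and lemmatized IMDB text.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


def build_pipeline(max_features=45000, ngram_range=(1, 2)):
    """TF-IDF over unigrams and bigrams, followed by logistic regression."""
    return Pipeline([
        ("tfidf", TfidfVectorizer(max_features=max_features,
                                  ngram_range=ngram_range)),
        # C and max_iter are illustrative defaults, not tuned values.
        ("clf", LogisticRegression(C=1.0, max_iter=1000)),
    ])


# Toy stand-in for the preprocessed reviews (1 = positive, 0 = negative).
texts = [
    "great movie with wonderful acting",
    "terrible film and boring plot",
    "loved every minute of this great story",
    "awful boring waste of time",
]
labels = [1, 0, 1, 0]

model = build_pipeline()
model.fit(texts, labels)
predictions = model.predict(["great acting", "boring film"])
```

Keeping the vectorizer and classifier in one `Pipeline` means the TF-IDF vocabulary is fit only on the training folds during cross-validation, which avoids leaking test vocabulary into the features.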

What I’ve tried:

Ensemble methods (did not improve significantly)
Hyperparameter tuning

Questions:

Are there better feature engineering techniques I should try?
Would word embeddings (Word2Vec, GloVe) help here?
Any suggestions for handling tricky cases like negations better?
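
On the negation question, one commonly used idea that might be worth testing is negation marking: every token after a negation word gets a `NOT_` prefix until the next punctuation mark, so that "not like" and "like" become different TF-IDF features. The sketch below is only an illustration under my own assumptions (the `mark_negation` name, the negation word list, and the tokenizer regex are all hypothetical), and it would have to run before punctuation removal because it relies on punctuation to end the negation scope.

```python
import re

# Assumed negation cues; extend as needed for the dataset.
NEGATIONS = {"not", "no", "never", "nor", "cannot"}
CLAUSE_END = {".", ",", ";", ":", "!", "?"}
# Words (optionally with an apostrophe suffix such as "n't") or clause punctuation.
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)?|[.,;:!?]")


def mark_negation(text):
    """Prefix tokens that follow a negation cue with NOT_ until punctuation."""
    out = []
    negated = False
    for tok in TOKEN.findall(text.lower()):
        if tok in CLAUSE_END:
            negated = False  # punctuation closes the negation scope
            continue         # punctuation itself is dropped
        if tok in NEGATIONS or tok.endswith("n't"):
            negated = True
            out.append(tok)
            continue
        out.append("NOT_" + tok if negated else tok)
    return " ".join(out)


print(mark_negation("I did not like this movie, but the acting was good."))
# i did not NOT_like NOT_this NOT_movie but the acting was good
```

If the output is fed to scikit-learn's `TfidfVectorizer`, the default lowercasing turns `NOT_like` into `not_like`, which still stays a single token because the underscore counts as a word character.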

Here is my notebook:
https://github.com/Varunkumar2516/IMDb-Sentiment-Analysis-NLP-Project/blob/master/1%20IMDB_Sentiment_Analyzer_Notebook%20.ipynb

Any suggestions or feedback would be really helpful. Thanks!

Replies: 0 comments