Humans are affected by many kinds of diseases, including several types of cancer, such as blood, skin, lung, bladder, and kidney cancer. One of these is cervical cancer, which occurs in women. It is the fourth most common cancer in women globally, with around 660,000 new cases and around 350,000 deaths in 2022.
Cervical cancer is caused by persistent infection with the human papillomavirus (HPV). Women living with HIV are 6 times more likely to develop cervical cancer than women without HIV. HPV is a common sexually transmitted infection that can affect the skin, genital area, and throat. Almost all sexually active people will be infected at some point in their lives, usually without symptoms. In most cases the immune system clears HPV from the body, but persistent infection with high-risk HPV can cause abnormal cells to develop, which can go on to become cancer. Countries around the world are working to accelerate the elimination of cervical cancer in the coming decades, with an agreed set of three targets to be met by 2030.
Cervical cancer is one of the leading causes of death among women around the world, and it is driven by HPV infection. To address it, we need to be able to identify the disease at an early stage, because early detection improves the chance of a cure. A machine learning model that can predict cervical cancer early would therefore support treatment. We build a cervical cancer detection model using machine learning algorithms that can make predictions on a large dataset. We take real-world data and train models to predict the outcome from details such as the biopsy result and other recorded risk factors. We first train each model on a subset of the dataset, then test it on the larger dataset to see how well it performs. Prediction using machine learning can also help reduce cost, since traditional methods require multiple tests (such as the HPV test). We train several machine learning algorithms on the same data to compare their performance.
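The train-then-test idea described above can be sketched as a simple hold-out split. This is a minimal illustration using only the Python standard library, not the code used in this project: the function name `holdout_split`, the 20% test fraction, the fixed seed, and the toy records are all assumptions made for the example.

```python
import random


def holdout_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and return (train_rows, test_rows)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: (age, number_of_pregnancies, biopsy_label).
records = [(18 + i, i % 5, 1 if i % 10 == 0 else 0) for i in range(50)]

train_rows, test_rows = holdout_split(records, test_fraction=0.2)
print(len(train_rows), len(test_rows))
```

Keeping the test rows out of training is what makes the reported accuracy an estimate of performance on unseen patients rather than a measure of memorization.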
1. Data Collection: We collected the dataset from the website listed in the references. It contains risk factors for cervical cancer.
2. Data Preprocessing: Libraries such as NumPy and pandas were used for data cleaning, for example to remove records with missing values.
3. Data Analysis: Seaborn and Matplotlib were used to visualize the features and variables.
4. Use of Models: We used several models, such as the XGBoost and KNN algorithms, to train and test on our dataset.
5. Testing: We first trained each model on part of the data and checked its performance, then tested a few of the models further to improve their performance.
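The preprocessing and modelling steps above can be illustrated with a small, dependency-free sketch. It is not the pandas or library-based code used in this project: the `"?"` missing-value marker, the helper names `drop_incomplete` and `knn_predict`, the two feature columns, and the toy rows are assumptions chosen only to show the idea of dropping incomplete records and then classifying with k-nearest neighbours.

```python
import math
from collections import Counter

MISSING = "?"  # assumed marker for missing entries; the real file may differ


def drop_incomplete(rows):
    """Keep only rows without missing entries and convert them to floats."""
    clean = []
    for row in rows:
        if MISSING in row:
            continue  # discard any record that has a missing value
        clean.append([float(value) for value in row])
    return clean


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical rows: [age, years_smoking, biopsy_label] as raw strings.
raw = [
    ["25", "0", "0"],
    ["30", "?", "0"],
    ["28", "1", "0"],
    ["45", "10", "1"],
    ["50", "12", "1"],
    ["?", "8", "1"],
    ["22", "0", "0"],
    ["48", "9", "1"],
]

clean = drop_incomplete(raw)
X = [row[:-1] for row in clean]
y = [int(row[-1]) for row in clean]

print(len(clean))
print(knn_predict(X, y, [47.0, 11.0], k=3))
print(knn_predict(X, y, [24.0, 0.0], k=3))
```

Because KNN relies on distances, features measured on very different scales are usually standardized before training; that step is omitted here to keep the sketch short.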
Results and Discussion
The accuracy achieved by each algorithm on the dataset was:
XGBoost: 95.17%
KNN: 93.97%
Logistic Regression: 95.18%
Random Forest: 96.38%
The algorithms reached different accuracies on the same dataset, with Random Forest the highest and KNN the lowest. Although the models performed well, each still misclassified a few cases, so future work can focus on optimizing them to minimize these errors.
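To look at the misclassified cases behind an accuracy figure, the predictions can be broken down into true and false positives and negatives. The sketch below uses only the Python standard library and made-up labels; the function names and the example predictions are assumptions for illustration and do not reproduce the results reported above.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def summarize(y_true, y_pred):
    """Return accuracy, recall, precision, and the raw error counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "false_negatives": fn,
        "false_positives": fp,
    }


# Hypothetical labels: 16 negative and 4 positive cases.
y_true = [0] * 16 + [1] * 4
y_pred = [0] * 15 + [1] + [1, 1, 0, 0]

report = summarize(y_true, y_pred)
print(report)
```

In this made-up example the accuracy is 85%, yet half of the positive cases are missed. For a screening task, false negatives are the costlier error, so recall is worth reporting alongside accuracy.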
Conclusion and Future Work
This project shows how different machine learning algorithms can help in predicting cervical cancer. The same algorithms can be applied to the prediction of other diseases and to other problems that benefit society. In future work, we can improve the optimization and efficiency of these models, evaluate them on larger datasets, and make better use of machine learning.