Predict who would be interested in buying a Caravan Insurance Policy
You can access the data from the following link.
[login to view URL]
This data set, used in the COIL 2000 Challenge, contains information on customers of an insurance company. The data consists of 86 features and includes product usage data and socio-demographic data derived from zip area codes. The data was collected to answer the following question: Can you predict who would be interested in buying a caravan insurance policy, and explain why?
Three data files are available at the given link. [login to view URL] contains the training data; its 86th column is the target value. [login to view URL] is the test dataset and [login to view URL] holds the target values for the test data. The full description of the dataset is available at the link given at the beginning.
Wherever needed, write down your observations as comments in your notebook. The best model will be chosen based on its F-score on both classes.
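As a minimal sketch of separating the features from the target, assuming the training file is plain text with whitespace-delimited integers and no header row (an assumption; check the dataset description). The helper names `parse_rows` and `split_features_target` are hypothetical, and the file path is left for you to fill in.

```python
def parse_rows(lines):
    """Turn an iterable of text lines into a list of integer rows.

    Assumes whitespace-delimited integers and no header row (assumption).
    Blank lines are skipped.
    """
    rows = []
    for line in lines:
        if line.strip():
            rows.append([int(value) for value in line.split()])
    return rows


def split_features_target(rows, target_index=85):
    """Split each row into features and target.

    target_index=85 is the 86th column when counting from zero.
    """
    features = [row[:target_index] + row[target_index + 1:] for row in rows]
    target = [row[target_index] for row in rows]
    return features, target


# Usage sketch (train_path is a placeholder for wherever you saved the file):
# with open(train_path) as handle:
#     X_train, y_train = split_features_target(parse_rows(handle))
```

The test file has no target column, so for it only `parse_rows` would be needed, with the targets read from the third file.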
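One reading of "F-score on both classes" is the F1 score computed separately for the negative and the positive class. The sketch below shows that calculation in plain Python; the function name `f1_per_class` and the toy labels are illustrative assumptions, not part of the task. In scikit-learn, `f1_score(y_true, y_pred, average=None)` returns the same per-class values.

```python
def f1_per_class(y_true, y_pred, labels=(0, 1)):
    """Return {label: F1} treating each label in turn as the positive class."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denominator = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the class never appears.
        scores[label] = 2 * tp / denominator if denominator else 0.0
    return scores


# Toy labels (made up for illustration): four non-buyers, two buyers.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
print(f1_per_class(y_true, y_pred))  # {0: 0.75, 1: 0.5}
```

Because buyers are a small minority in this kind of data, a model can score well on class 0 while doing poorly on class 1, which is why both values are worth reporting.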
1. Which features are relevant for the prediction task? Select the top 10 features based on your understanding. Show visualizations or statistics to support your selection.
2. Train a Logistic Regression (LogReg) model with L1 regularization. Find the best model using grid search on C values. Analyze which features have nonzero coefficients for the best model. Are they in sync with your selected features from question 1?
3. Generate polynomial features and use LogReg again with L1. Check whether accuracy increases.
4. Use any classification model (trees, forests, gradient boosting, SVM) to improve your result. You can (and probably should) change your preprocessing and feature engineering to be suitable for the model. You are not required to try all of these models. Tune parameters as appropriate.
5. Can you create an “explainable” model that is nearly as good as your best model? An explainable model should be small enough to be easily inspected: say, a linear model with few enough (<10) coefficients that you can reasonably look at all of them, or a tree with a small number of leaves, limited depth, etc.
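Questions 2 and 5 can be sketched roughly as follows with scikit-learn. This is a rough illustration under stated assumptions, not a reference solution: the data is a synthetic stand-in from `make_classification` (not the COIL 2000 files), and the C grid, `class_weight="balanced"`, the `f1_macro` scoring choice and the 8-leaf limit are all illustrative choices you would tune yourself.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for the real data (assumption): 85 features, rare positives.
X, y = make_classification(
    n_samples=1000, n_features=85, n_informative=8,
    weights=[0.94, 0.06], random_state=0,
)

# Question 2: L1-regularized logistic regression, grid search over C.
pipeline = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear",
                       class_weight="balanced", max_iter=1000),
)
param_grid = {"logisticregression__C": [0.001, 0.01, 0.1, 1.0, 10.0]}
search = GridSearchCV(pipeline, param_grid, scoring="f1_macro", cv=5)
search.fit(X, y)

# Features whose coefficients survived the L1 penalty in the best model.
best_logreg = search.best_estimator_.named_steps["logisticregression"]
nonzero_features = np.flatnonzero(best_logreg.coef_[0])
print("best C:", search.best_params_["logisticregression__C"])
print("nonzero coefficient indices:", nonzero_features.tolist())

# Question 5: a small tree that can be read in full.
small_tree = DecisionTreeClassifier(
    max_leaf_nodes=8, class_weight="balanced", random_state=0
)
small_tree.fit(X, y)
print(export_text(small_tree))
```

Smaller values of C mean a stronger L1 penalty and therefore fewer nonzero coefficients, so comparing the surviving indices against the hand-picked top 10 from question 1 is a direct way to answer the "in sync" part. For question 3, `sklearn.preprocessing.PolynomialFeatures` could be added as a pipeline step before the scaler.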
21 freelancers are bidding an average of $107 for this job
Hello, I am an expert in machine learning and data science. I have extensive experience in these areas and have done many similar projects before. I can assure you of the best quality work.
I have experience in data science projects and have previously done a similar project using machine learning algorithms. Relevant Skills and Experience: Python, R, SQL, random forest, regression, supervised and unsupervised learning
Hello, I can do this for you, as I have good knowledge of everything you are looking for. Trust me and you will get good results by tomorrow night.