Remarks on Data Quality: One of the major flaws with the dataset is that it does not provide the units for the features. In this example, the dataset is clean and pristine, with no missing values. Import and clean the dataset, analyze features to select the relevant features that correlate with the target variable.Ģ.1 Import dataset and display features and the target variable Objective : The goal of this project is to build a regressor model that recommends the “crew” size for potential cruise ship buyers using the cruise ship dataset cruise_ship_info.csv. What do you want to find out? Do you have the data to analyze? Illustrating the Machine Learning Process. Since the project involves building a machine learning model, the first step is to ensure we understand the machine learning process:įigure 1. Only the final Jupyter notebook has to be submitted, and no formal project report is required. The time allowed for completing this coding assignment was three days. Notice also that the instruction clearly specifies that python must be used as the programming language for model building. So, all that is needed is to follow the instructions and generate your code. The dataset is clean and small (160 rows and 9 columns), and the instructions are very clear. This is an example of a very straightforward problem. Plot regularization parameter value vs Pearson correlation for the test and training sets, and see whether your model has a bias problem or variance problem. What is regularization? What is the regularization parameter in your model?.Describe hyper-parameters in your model and how you would change them to improve the performance of the model.Calculate the Pearson correlation coefficient for the training set and testing datasets.Build a machine learning model to predict the ‘crew’ size.Create training and testing sets (use 60% of the data for the training and reminder for testing).Use one-hot encoding for categorical features.If you removed columns, explain why you removed those.Select columns that will probably be important to predict “crew” size.Calculate basic statistics of the data (count, mean, std, etc) and examine data and state your observations.Please do the following steps (hint: use numpy, scipy, pandas, sklearn and matplotlib) Objective: Build a regressor that recommends the “crew” size for potential ship buyers. Please save your work in a Jupyter notebook and email it to us for review.ĭata file: cruise_ship_info.csv (this file will be emailed to you) You are free to use the internet and any other libraries. This coding exercise should be performed in python (which is the programming language used by the team). Sample 1 Coding Exercise: Model for recommending cruise ship crew size Before delving into the tips, let’s first examine some sample coding exercises. In this article, I will share some useful tips from my personal experience that would help you excel in the coding challenge project. The take-home coding exercise clearly differs from companies to companies, as further described below. For the couple of interviews I had, I worked with 2 types of datasets: one had 160 observations (rows), while the other had 50,000 observations with lots of missing values. That way, you don’t have to worry about mining the data and transforming it into a form suitable for analysis. If you are fortunate, they may provide a small dataset that is clean and stored in a comma-separated value (CSV) file format. Generally, the interview team will provide you with project directions and a dataset.
0 Comments
Leave a Reply. |