Kaggle Competition Part 2
Day 2 Requirements:
By the end of the day, your group should have the following completed:
- Starter code to load your data into a pandas DataFrame. This code should be pushed to your team's GitHub repo and optionally submitted to the Kaggle Leaderboard as practice.
- Initial code, with comments, to organize and clean the data, also pushed to GitHub.
- A problem statement, written in your own words, that your modeling will address. This can be similar to the project description on Kaggle. Be sure to discuss it sufficiently with your team. The problem statement should be clearly written down and placed in a README.md file in your team's GitHub repo.
- If the previous items are fulfilled, then initial EDA on the data related to your problem statement.
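As a rough illustration of the first, second, and fourth items above, here is a minimal sketch of loading, cleaning, and summarizing a dataset with pandas. The column names and the in-memory sample CSV are made up so the snippet runs on its own, and the helper names (`load_data`, `clean_data`, `eda_summary`) are hypothetical. Your actual cleaning steps will depend on your competition's data.

```python
import io

import pandas as pd


def load_data(source):
    """Read a CSV (file path or file-like object) into a DataFrame."""
    return pd.read_csv(source)


def clean_data(df):
    """Return a cleaned copy: tidy column names and drop exact duplicate rows."""
    cleaned = df.copy()
    # Normalize column names, e.g. "Sale Price" becomes "sale_price".
    cleaned.columns = [
        str(col).strip().lower().replace(" ", "_") for col in cleaned.columns
    ]
    # Remove rows that are exact duplicates of an earlier row.
    cleaned = cleaned.drop_duplicates().reset_index(drop=True)
    return cleaned


def eda_summary(df):
    """Per-column dtype, missing-value count, and number of unique values."""
    return pd.DataFrame(
        {
            "dtype": df.dtypes.astype(str),
            "n_missing": df.isna().sum(),
            "n_unique": df.nunique(),
        }
    )


# Hypothetical stand-in for the competition file so the sketch runs as-is.
# With real data, pass the path of your downloaded training CSV instead.
sample_csv = io.StringIO(
    "Id,Sale Price,Lot Area\n"
    "1,200000,8450\n"
    "2,181500,\n"
    "2,181500,\n"
)

df = clean_data(load_data(sample_csv))
summary = eda_summary(df)
print(df.shape)
print(summary)
```

Keeping each step in its own commented function makes the code easier to review in a pull request and to reuse once you move on to modeling.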
If you created a Trello board, update it as needed.
Get in the habit of submitting pull requests for your branches and assigning team members to review the code. In industry settings this is (nearly) always a required best practice! Pushing directly to master is frowned upon because unreviewed changes can introduce bugs into the main branch.
Problem Statement Reminder:
- Start a new document and describe the following:
  - What is our problem statement?
  - What can we learn from the data in order to make an educated hypothesis?
  - What is our hypothesis?