ML Diaries: Day 3
Data Modelling — Splitting the Data
Welcome to The Mind Palace by Dayo.
ML Diaries is simply a daily log of my learning and projects built as I take up Machine Learning. Stories on The Mind Palace, this blog, will still continue every week.
Date: Aug 20, 2022
About
Day 3 of the ‘Complete Machine Learning & Data Science Bootcamp 2022’ was a short session for personal reasons, but I did start the fifth component of the Machine Learning framework: Modelling.
(See the entire framework from Day 2 here.)
History
Again, a machine learning project comprises three phases: Data Collection, Data Modelling and Model Deployment. Data Modelling is a non-linear, iterative process of problem definition, data, evaluation, features, modelling and experiments.
Day 2 explored the problem definition, data, evaluation and features stages of data modelling, with a focus on what goal each stage aims to achieve and how. Now onto Modelling.
The Good Stuff — Modelling
The modelling stage is the most crucial part of a machine learning project, because it is where the model actually gets built. The overarching goal of this stage is to answer the question “Based on our problem and data, what model should I use?”
This stage is as large as it is crucial, so it can be broken down into three parts:
- Choosing and training a model;
- Tuning the model; and
- Comparing models.
Choosing and training a model is the crux of machine learning. Unsurprisingly.
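To make those three parts concrete, here is a minimal sketch in plain Python. None of it comes from the course: the toy data, the k-nearest-neighbours model, the 70/15/15 split and the helper names (split_data, knn_predict, accuracy) are my own assumptions for illustration. It trains on one slice of the data, tunes k on a second slice, and compares the tuned model against a naive baseline on a third.

```python
import random


def split_data(rows, train=0.7, valid=0.15, seed=42):
    """Shuffle rows and cut them into training, validation and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train)
    n_valid = round(len(rows) * valid)
    return (
        rows[:n_train],
        rows[n_train:n_train + n_valid],
        rows[n_train + n_valid:],
    )


def knn_predict(train_rows, x, k):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(train_rows, key=lambda row: abs(row[0] - x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def accuracy(train_rows, eval_rows, k):
    """Fraction of eval_rows whose label the k-NN model predicts correctly."""
    correct = sum(knn_predict(train_rows, x, k) == y for x, y in eval_rows)
    return correct / len(eval_rows)


# Made-up toy data: a single feature x, labelled 1 when x is above 50.
rng = random.Random(0)
data = [(x, int(x > 50)) for x in (rng.uniform(0, 100) for _ in range(200))]

train_set, valid_set, test_set = split_data(data)

# 1. Choose and train a model: for k-NN, "training" is just keeping train_set.
# 2. Tune the model: pick the k that scores best on the validation set.
candidate_ks = [1, 3, 5]
valid_scores = {k: accuracy(train_set, valid_set, k) for k in candidate_ks}
best_k = max(valid_scores, key=valid_scores.get)

# 3. Compare models: tuned k-NN vs. always guessing the most common label,
#    both scored on the test set that neither step above has touched.
test_score = accuracy(train_set, test_set, best_k)
train_labels = [y for _, y in train_set]
majority_label = max(set(train_labels), key=train_labels.count)
baseline_score = sum(y == majority_label for _, y in test_set) / len(test_set)

print(f"validation accuracy by k: {valid_scores}")
print(f"best k: {best_k}")
print(f"test accuracy: tuned k-NN {test_score:.2f}, baseline {baseline_score:.2f}")
```

The point of keeping three separate slices is that the test set stays unseen until the very end, so the final comparison is not flattered by the tuning step.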
And now, onto Day 4.