A model in machine learning is a set of assumptions expressed in mathematical or graphical form. In the model, the parameters and variables are expressed as random variables. The model shows how the variables depend on one another and how changing one variable may affect the value of another.
A Machine Learning model is a way to read or represent a process or problem in mathematical terms. To generate a Machine Learning model, we provide training data to the algorithm from which it can learn.
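To make the idea of a model as a set of assumptions over random variables concrete, here is a minimal sketch of a toy model written as a sampling procedure. The linear form, the noise levels, and the names `sample_observation`, `w`, `x`, and `y` are assumptions chosen for illustration; they do not come from any particular library or method.

```python
import random
import statistics


def sample_observation(x, rng):
    """Draw one (w, y) pair from the assumed toy model for a given input x.

    Assumptions (illustrative only):
      w ~ Normal(0, 1)        the parameter is itself a random variable
      y ~ Normal(w * x, 0.5)  the observation depends on both w and x
    """
    w = rng.gauss(0.0, 1.0)
    y = rng.gauss(w * x, 0.5)
    return w, y


rng = random.Random(0)  # fixed seed so the run is repeatable

# Changing x changes the distribution of y: the spread of y grows with |x|.
ys_at_0 = [sample_observation(0.0, rng)[1] for _ in range(2000)]
ys_at_2 = [sample_observation(2.0, rng)[1] for _ in range(2000)]

print("std of y at x=0:", round(statistics.pstdev(ys_at_0), 2))
print("std of y at x=2:", round(statistics.pstdev(ys_at_2), 2))
```

Under these assumptions the standard deviation of y is 0.5 at x = 0 and about 2.06 at x = 2, so the sampled spreads should land close to those values. That is the sense in which the model states how one variable affects another.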
Since the advent of machine learning, researchers and data scientists have created a myriad of machine learning algorithms. In the traditional approach, the engineer chooses one or more of these algorithms to solve the problem, or creates an entirely new algorithm.
But this method doesn't scale well in real life because of constraints of the dataset or of the software requirements.
In contrast, model-based machine learning (MBML) attempts to create a tailored solution for the given problem. The algorithm is designed specifically for that problem, which can yield more accurate results and better efficiency.
Model-based machine learning provides a systematic process for creating machine learning solutions.
Unlike the algorithm-driven approach, which offers no mechanism for incorporating prior knowledge, MBML can use knowledge acquired in the past.
MBML handles uncertainty in a systematic and principled manner.
MBML rarely suffers from overfitting, provided we exercise proper diligence while training the model.
Every problem gets a custom-fit solution.
For a particular problem, we can build several models by tweaking the parameters, then select the model with the best accuracy.
We can quickly compare alternative models for the same problem.
Models in machine learning are general-purpose, meaning that it is no longer necessary to memorize thousands of existing ML algorithms.
The training data and the model are separate from each other, so each can be modified independently without difficulty.
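Two of the points above, incorporating prior knowledge and handling uncertainty, can be illustrated with a small Bayesian update. The sketch below assumes a Beta prior over an unknown success rate and binomial observations; the prior, the data, and the function names are hypothetical and chosen only for illustration.

```python
def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus binomial data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean_std(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance ** 0.5


# Prior knowledge (assumed): earlier experience suggests a rate near 0.2.
prior = (2, 8)

# New evidence (made up for the example): 3 successes in 10 trials.
posterior = update_beta(*prior, successes=3, failures=7)

prior_mean, prior_std = beta_mean_std(*prior)
post_mean, post_std = beta_mean_std(*posterior)

print(f"prior:     mean={prior_mean:.3f}, std={prior_std:.3f}")
print(f"posterior: mean={post_mean:.3f}, std={post_std:.3f}")
```

Here the posterior is Beta(5, 15): the estimated rate moves from 0.2 to 0.25, and the standard deviation shrinks from about 0.121 to about 0.094. The prior encodes what was known before, and the standard deviation is an explicit statement of the remaining uncertainty.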
I. Preparing a competent training dataset
Preparing a competent training dataset takes time, and mistakes are common. Check the dataset for errors and correct them before building the model.
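What "error free" means depends on the dataset, but a few checks apply almost everywhere: missing values, values outside a plausible range, and duplicated rows. A minimal sketch, assuming rows stored as dictionaries; the column names, the age range, and the helper `find_row_problems` are hypothetical.

```python
def find_row_problems(rows, required, ranges):
    """Return a list of (row_index, description) for common data errors."""
    problems = []
    seen = set()
    for index, row in enumerate(rows):
        # Missing values in required columns.
        for column in required:
            if row.get(column) is None:
                problems.append((index, f"missing {column}"))
        # Values outside the allowed (low, high) range.
        for column, (low, high) in ranges.items():
            value = row.get(column)
            if value is not None and not (low <= value <= high):
                problems.append((index, f"{column} out of range"))
        # Exact duplicates of an earlier row.
        key = tuple(sorted(row.items()))
        if key in seen:
            problems.append((index, "duplicate row"))
        seen.add(key)
    return problems


# Made-up rows for the example.
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 41000},
    {"age": 212, "income": 38000},
    {"age": 34, "income": 52000},
]

problems = find_row_problems(
    rows,
    required=["age", "income"],
    ranges={"age": (0, 120)},
)
for index, description in problems:
    print(f"row {index}: {description}")
```

The function only reports problems; deciding whether to drop, correct, or impute a flagged row is left to whoever knows the data.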
II. Slicing the data
The data may contain several slices or categories. When the slices behave differently, building a separate model for each slice can yield a better solution and greater accuracy than a single model fitted to all of the data.
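A small sketch of the idea, assuming each record carries a slice label and that a straight-line fit is adequate within a slice. The slice names, the data, and the helpers `fit_line` and `fit_per_slice` are made up for the example.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_per_slice(records):
    """Group (slice, x, y) records by slice and fit one line per group."""
    groups = defaultdict(list)
    for slice_key, x, y in records:
        groups[slice_key].append((x, y))
    return {key: fit_line(points) for key, points in groups.items()}


# Two hypothetical slices with opposite trends.
records = [
    ("retail", 1, 3), ("retail", 2, 5), ("retail", 3, 7),
    ("wholesale", 1, 9), ("wholesale", 2, 8), ("wholesale", 3, 7),
]

per_slice = fit_per_slice(records)
pooled = fit_line([(x, y) for _, x, y in records])

print("per slice:", per_slice)
print("pooled:   ", pooled)
```

On this data the per-slice slopes are 2 and -1, while the pooled fit averages them into a slope of 0.5 that describes neither slice. The benefit depends on each slice having enough data to support its own model.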
III. Use simple models
Building complex models that extract the maximum information from the data is important; however, simple models are easier to deploy and make it easier to explain results to key business stakeholders. Build simple, white-box models using regression and decision trees, and use a gradient boosting or ensemble model to check how well your simple models are performing.
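The check described above, comparing a simple model against a boosted one, can be sketched without any library. This is a hand-rolled illustration rather than a production implementation: `fit_stump`, `fit_boosted`, the learning rate, the number of rounds, and the data are all assumptions made for the example, and the comparison uses the training data only to keep it short.

```python
def fit_stump(xs, ys):
    """White-box model: one threshold on one feature, chosen to minimise squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # skip the max so the right side is never empty
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((y - left_mean) ** 2 for y in left)
                 + sum((y - right_mean) ** 2 for y in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def fit_boosted(xs, ys, rounds=20, learning_rate=0.5):
    """Boosting for squared error: each stump is fitted to the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x) for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(stump(x) for stump in stumps)


def mse(model, xs, ys):
    """Mean squared error of a model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data with two steps, which a single threshold cannot capture.
xs = list(range(10))
ys = [1, 1, 1, 1, 1, 5, 5, 5, 9, 9]

stump_mse = mse(fit_stump(xs, ys), xs, ys)
boosted_mse = mse(fit_boosted(xs, ys), xs, ys)

print("single stump MSE:", round(stump_mse, 3))
print("boosted MSE:     ", round(boosted_mse, 3))
```

On this data the single stump reaches a mean squared error of about 1.92, and the boosted model does better because it can add further splits. A large gap suggests the simple model is missing structure; a small gap suggests the simple model is good enough to deploy. In practice the comparison should be made on held-out data rather than on the training set.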
IV. Detection of rare events
Machine learning often has to work with imbalanced data, in which the events of interest are rare, so correctly classifying those rare events can be difficult. To counteract this, construct a deliberately biased training data set by over-sampling the rare class or under-sampling the common class.
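Over-sampling in its simplest form means repeating rare examples until the classes are the same size. A minimal sketch of random over-sampling, assuming samples and labels held in parallel lists; the helper name `oversample_minority` and the data are hypothetical.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, rng):
    """Duplicate randomly chosen examples of smaller classes until all classes match the largest."""
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


# Made-up data: 8 common examples (label 0) and 2 rare ones (label 1).
samples = list(range(10))
labels = [0] * 8 + [1] * 2

rng = random.Random(0)  # fixed seed so the run is repeatable
balanced_samples, balanced_labels = oversample_minority(samples, labels, rng)

print("before:", Counter(labels))
print("after: ", Counter(balanced_labels))
```

Resampling should be applied to the training split only, so that the evaluation data still reflects how rare the events really are.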
V. Combine models
Data scientists use algorithms like gradient boosting and random forests to build many models automatically. However, these models may not generalize well, and some algorithms fit better than others within specific data boundaries. Overcome this hurdle by combining different modeling algorithms.
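Combining models can be as simple as letting several of them vote. A minimal sketch, assuming three stand-in classifiers written as plain functions; in practice these would be trained models from different algorithms. All names and thresholds here are hypothetical.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the label predicted by the largest number of models."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]


# Stand-ins for three models trained with different algorithms (hypothetical).
def model_a(x):
    return "high" if x > 3 else "low"


def model_b(x):
    return "high" if x > 5 else "low"


def model_c(x):
    return "high" if x > 7 else "low"


models = [model_a, model_b, model_c]

for x in (2, 4, 6, 8):
    print(x, majority_vote(models, x))
```

The three stand-ins disagree for inputs between 3 and 7, and the vote settles each disagreement in favour of the majority. Using an odd number of models avoids ties when there are two labels.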