1597911780

Today we are going to discuss Performance Metrics, and this time it will be Regression metrics. In my previous *blog* we discussed Classification Metrics; this time it's Regression.

We are going to talk about the 5 most widely used Regression metrics:

Let's understand one thing first: the difference between Classification and Regression metrics, and why we need two different kinds of metrics to measure our models.

The first key difference: Classification, as the name suggests, gives classes as output. Say we have a few categories of data, classes 1–10; then the model's output will be one of the numbers 1 through 10. If the model's output matches the actual label, the prediction is correct; otherwise it is wrong. There is no in-between condition: you are either correct or incorrect.

This is not the case with Regression. A regression model outputs a continuous number; there are no discrete values. For example, if our model tries to predict people's heights, we cannot treat height as a handful of classes like 160 cm or 170 cm; it is continuous. So in this case we consider how close our model is to the actual value. The concept of "how close" gives rise to the term loss; in proper statistical notation, the loss incurred by our model in predicting the value of a data point.

Let's say, for a certain data point, the height is predicted to be 167 cm whereas the actual value is 163 cm; our model has then made an error of +4 cm. Now, this is just for one data point. How do we measure the error for the whole dataset?

Let’s keep one thing in mind, what is an Error?

Any deviation from the actual value is an error.
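To make this concrete, here is a minimal NumPy sketch of turning per-point errors into dataset-wide metrics. MAE, MSE, and RMSE are shown as illustrative examples, and the heights below are made-up numbers:

```python
import numpy as np

# Made-up actual and predicted heights (cm) for three people
y_true = np.array([163.0, 170.0, 155.0])
y_pred = np.array([167.0, 168.0, 150.0])

errors = y_pred - y_true          # per-point deviations: [4, -2, -5]

mae = np.mean(np.abs(errors))     # Mean Absolute Error
mse = np.mean(errors ** 2)        # Mean Squared Error
rmse = np.sqrt(mse)               # Root Mean Squared Error
```

Each metric aggregates the same per-point deviations, just penalizing large errors differently.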

#data-science #statistical-analysis #statistics #data analysis

1598793360

Hierarchical machine learning models are one top-notch trick. As discussed in previous posts, considering the natural taxonomy of the data when designing our models can be well worth our while. Instead of flattening out and ignoring those inner hierarchies, we’re able to use them, making our models smarter and more accurate.

“More accurate”, I say — are they, though? How can we tell? We are people of science, after all, and we expect bold claims to be supported by the data. This is why we have performance metrics. Whether it’s precision, f1-score, or any other lovely metric we’ve got our eye on — if using hierarchy in our models improves their performance, the metrics should show it.

Problem is, if we use regular performance metrics — the ones designed for flat, one-level classification — we go back to ignoring that natural taxonomy of the data.

If we do hierarchy, let’s do it all the way. If we’ve decided to celebrate our data’s taxonomy and build our model in its image, this needs to also be a part of measuring its performance.

How do we do this? The answer lies below.

This post is about measuring the performance of machine learning models designed for hierarchical classification. It kind of assumes you know what all those words mean. If you don’t, check out my previous posts on the topic. Especially the one introducing the subject. Really. You’re gonna want to know what hierarchical classification is before learning how to measure it. That’s kind of an obvious one.

Throughout this post, I’ll be giving examples based on this taxonomy of common house pets:

The taxonomy of common house pets. My neighbor just adopted the cutest baby Pegasus.

So we’ve got a whole ensemble of hierarchically-structured local classifiers, ready to do our bidding. How do we evaluate them?

That is not a trivial problem, and the solution is not obvious. As we’ve seen in previous problems in this series, different projects require different treatment. The best metric could differ depending on the specific requirements and limitations of your project.

All in all, there are three main options to choose from. Let’s introduce them, shall we?

**The contestants, in all their grace and glory:**
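Whichever option we end up choosing, the common idea is to give partial credit for predictions that land in the right branch of the taxonomy. As a minimal sketch (the toy parent map below is my own invention for illustration), hierarchical precision and recall compare the ancestor sets of the true and predicted labels:

```python
# Toy parent map for a house-pet taxonomy (invented for illustration)
PARENT = {
    "cat": "mammal", "dog": "mammal",
    "pegasus": "mythical", "dragon": "mythical",
    "mammal": "pet", "mythical": "pet",
}

def ancestors(label):
    """Return the label together with all of its ancestors in the taxonomy."""
    out = {label}
    while label in PARENT:
        label = PARENT[label]
        out.add(label)
    return out

def hierarchical_pr(true_label, pred_label):
    """Hierarchical precision and recall computed over ancestor sets."""
    t, p = ancestors(true_label), ancestors(pred_label)
    overlap = len(t & p)
    return overlap / len(p), overlap / len(t)

# Predicting "dog" for a "cat" still earns credit for "mammal" and "pet"
hp, hr = hierarchical_pr("cat", "dog")
```

A flat metric would score this prediction as simply wrong; the hierarchical version rewards getting the mammal branch right.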

#machine-learning #hierarchical #performance-metrics #ensemble-learning #metrics

1592023980

Take your current understanding and skills in machine learning algorithms to the next level with this article. What is regression analysis in simple words? How is it applied in practice to real-world problems? And what Python code snippets can you use to implement regression algorithms for various objectives? Let’s forget about boring learning material and talk about science and the way it works.
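Since the article promises Python snippets, here is one possible minimal example: fitting an ordinary least-squares line with scikit-learn (the tiny dataset below is made up for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data following y = 2x + 1 exactly
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([1.0, 3.0, 5.0, 7.0, 9.0])

model = LinearRegression()
model.fit(X, y)

slope, intercept = model.coef_[0], model.intercept_  # recovered line parameters
prediction = model.predict([[10.0]])                 # extrapolate to x = 10
```

The same `fit`/`predict` pattern carries over to the multivariate case by adding more feature columns to `X`.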

#linear-regression-python #linear-regression #multivariate-regression #regression #python-programming

1598352300

Machine learning algorithms are not the regular algorithms we may be used to, because they are often described by a combination of complex statistics and mathematics. Since it is very important to understand the background of any algorithm you want to implement, this can pose a challenge to people from a non-mathematical background, as the maths can sap your motivation by slowing you down.

In this article, we will discuss linear and logistic regression and some regression techniques, assuming we have all heard of, or even learnt about, the linear model in high-school Mathematics class. Hopefully, by the end of the article, the concepts will be clearer.
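As a first taste of the difference, here is a minimal scikit-learn sketch (the tiny datasets are invented): linear regression predicts a continuous value, while logistic regression predicts a class:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])

# Linear regression: continuous target (here, exactly y = 2x)
lin = LinearRegression().fit(X, np.array([0.0, 2.0, 4.0, 6.0]))
continuous_pred = lin.predict([[5.0]])       # a continuous number

# Logistic regression: binary class target (0s below x = 2, 1s above)
log = LogisticRegression().fit(X, np.array([0, 0, 1, 1]))
class_pred = log.predict([[0.0], [3.0]])     # class labels, 0 or 1
```

Despite the shared name, the second model is a classifier: it passes a linear combination of the inputs through a sigmoid and thresholds the result.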

**Regression Analysis** is a statistical process for estimating the relationships between the **dependent variable (say Y)** and one or more

#regression #machine-learning #beginner #logistic-regression #linear-regression #deep-learning

1601431200

The most glamorous part of a data analytics project or report is, as many would agree, the one where the Machine Learning algorithms do their magic with the data. However, one of the most overlooked parts of the process is the preprocessing of the data.

Significantly more effort goes into preparing the data to fit a model on than into tuning the model to fit the data better. One such preprocessing technique that we intend to disentangle is **Polynomial Regression**.
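The preprocessing framing is the key point: polynomial regression is just a linear model fitted on expanded features. A minimal scikit-learn sketch (the toy data below is invented for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Toy data following y = x^2 exactly
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = np.array([0.0, 1.0, 4.0, 9.0, 16.0])

# Preprocessing step: expand each x into the features [1, x, x^2]
poly = PolynomialFeatures(degree=2)
X_poly = poly.fit_transform(X)

# An ordinary linear model fitted on the expanded features
model = LinearRegression().fit(X_poly, y)
prediction = model.predict(poly.transform([[5.0]]))
```

Nothing about the regression itself changed; only the feature matrix did, which is why this sits naturally in the preprocessing stage.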

#data-science #machine-learning #polynomial-regression #regression #linear-regression

1603170000

The Generalized Linear Model (*GLM*) is popular because it can deal with a wide range of data with different response variable types (such as *binomial*, *Poisson*, or

Before diving into the diagnostics, we need to be familiar with several types of residuals, because we will use them throughout the post. In the Gaussian linear model, the concept of a residual is very straightforward: it is simply the difference between the predicted value (from the fitted model) and the data.

Response residuals

In the GLM, this is called the “response” residual, which is just a notation to differentiate it from other types of residuals. The variance of the response is no longer constant in a GLM, which leads us to make some modifications to the residuals. If we rescale the response residual by the standard error of the estimate, it becomes the Pearson residual.
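For a concrete example, with a Poisson response the variance function is V(μ) = μ, so the Pearson residual divides the response residual by √μ. A small NumPy sketch (the observed counts and fitted means below are made up):

```python
import numpy as np

# Made-up observed counts and fitted means from a Poisson GLM
y  = np.array([2.0, 0.0, 5.0])
mu = np.array([2.5, 1.0, 4.0])

resid_response = y - mu                   # response residuals
resid_pearson = (y - mu) / np.sqrt(mu)    # rescaled by sqrt of the variance V(mu) = mu
```

Unlike the raw response residuals, the Pearson residuals are on a comparable scale across observations with very different fitted means.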

#data-science #linear-models #model #regression #r