InfinityCodeX



Confusion Matrix & Classification Report




Today we will discuss the Confusion Matrix & the Classification Report.

Today's agenda is divided into 3 parts :

Part 1 : Confusion Matrices


- What is a Confusion Matrix?
- Confusion Matrix Explanation



Part 2 : Classification Report


- What is a Classification Report?



Part 3 : Practical Code


Now let’s get started.


Part 1 : Confusion Matrix


Q.1) What is a Confusion Matrix?


- A Confusion Matrix is a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known. It lets us visualize the performance of an algorithm.




Q.2) Confusion Matrix Explanation


- In a Confusion Matrix there are 2 important things :

(i)  Actual Values
(ii) Predicted Values

(i) Actual Values :

These are the values that are already known to be true (the ground-truth labels). They usually come from our test data, i.e. the part of the original dataset that we split off to test our model. They can also come from other labelled data that resembles our original dataset.


(ii) Predicted Values :

These are the values that our model outputs when we give it the test inputs. Comparing them with the actual values lets us check how well our model performs.


Let us dive deeper into the confusion matrix with an example :

For the last 2 weeks Johnny has been feeling cold & has not been able to breathe properly. So he decides to go to the doctor's clinic for a check-up, to find out whether he is suffering from coronavirus or not.

So here are our actual values, i.e. whether he is infected with coronavirus or not :



Here are our predicted values, i.e. the doctor's decision on whether he has coronavirus or not :



Now that we have seen the actual & predicted values, let us look at the 4 most important values, which are :

(i)   TN
(ii)  FN
(iii) FP
(iv) TP

Now let us see them one by one :

(i) TN : True Negative



In this case Johnny went to the doctor & explained his symptoms. After examining Johnny thoroughly, the doctor told him that he is not infected with coronavirus.
In reality Johnny is indeed not infected with coronavirus. The doctor predicted Negative & that prediction is True, so this is a True Negative.


(ii) FN : False Negative



In this case Johnny went to the doctor & explained his symptoms. After examining Johnny thoroughly, the doctor told him that he is not infected with coronavirus.
In reality Johnny is infected with coronavirus. The doctor predicted Negative but that prediction is False, so this is a False Negative.
Note : False Negative is also known as a Type Two (Type II) Error.


(iii) FP : False Positive



In this case Johnny went to the doctor & explained his symptoms. After examining Johnny thoroughly, the doctor told him that he is infected with coronavirus.
In reality Johnny is not infected with coronavirus. The doctor predicted Positive but that prediction is False, so this is a False Positive.
Note : False Positive is also known as a Type One (Type I) Error.

(iv) TP : True Positive



In this case Johnny went to the doctor & explained his symptoms. After examining Johnny thoroughly, the doctor told him that he is infected with coronavirus.
In reality Johnny is indeed infected with coronavirus. The doctor predicted Positive & that prediction is True, so this is a True Positive.
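
The four outcomes above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `count_outcomes` and the six-patient sample lists are made up for this example, with 1 meaning "infected" and 0 meaning "not infected".

```python
def count_outcomes(actual, predicted):
    """Count TN, FN, FP and TP for binary labels (1 = positive, 0 = negative)."""
    counts = {"TN": 0, "FN": 0, "FP": 0, "TP": 0}
    for a, p in zip(actual, predicted):
        if a == 0 and p == 0:
            counts["TN"] += 1  # predicted negative, really negative
        elif a == 1 and p == 0:
            counts["FN"] += 1  # predicted negative, really positive (Type II error)
        elif a == 0 and p == 1:
            counts["FP"] += 1  # predicted positive, really negative (Type I error)
        else:
            counts["TP"] += 1  # predicted positive, really positive
    return counts


# Hypothetical data for six patients (not from a real dataset)
actual = [0, 1, 0, 1, 1, 0]
predicted = [0, 0, 1, 1, 1, 0]

outcomes = count_outcomes(actual, predicted)
print(outcomes)  # {'TN': 2, 'FN': 1, 'FP': 1, 'TP': 2}
```

Each patient lands in exactly one of the four cells, so the four counts always add up to the number of patients.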

Part 2 : Classification Report


Q.1) What is a Classification Report?


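A Classification Report is used to measure the quality of the predictions made by a classification algorithm. We will look at these metrics :

(i)   Recall
(ii)  Accuracy
(iii) Error Rate
(iv) Precision

The worked examples below use these counts : TP = 100, TN = 50, FP = 10, FN = 5, so Total = 165.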

(i) Recall :

- The ability of the model to find all the relevant (actually positive) cases within the dataset. Here Actual Yes = TP + FN.


Recall = TP / Actual Yes
           =  100 / 105
           =   0.95
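
The recall calculation can be checked with a small sketch. The helper name `recall` is our own choice for illustration, and FN = 5 is inferred from Actual Yes = 105 and TP = 100.

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN), i.e. TP divided by all actual positives."""
    actual_yes = tp + fn
    # Guard against division by zero when there are no actual positives
    return tp / actual_yes if actual_yes else 0.0


# Actual Yes = 105 and TP = 100, so FN = 5
recall_value = recall(tp=100, fn=5)
print(round(recall_value, 2))  # 0.95
```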

(ii) Accuracy :


- The fraction of all predictions that are correct. Accuracy is a trustworthy measure when the classes in the test set are roughly balanced. Example : the test set has about the same number of Dogs & Cats.


Accuracy = (TP + TN) / Total
                = (100 + 50) / 165
                = 0.91
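
Here is a small sketch of the accuracy formula, and of why balanced classes matter. The helper name `accuracy` and the second, imbalanced scenario (95 negatives, 5 positives, a model that always says "negative") are assumptions made up for illustration.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / Total."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Counts from the worked example: TP=100, TN=50, FP=10, FN=5
worked = accuracy(tp=100, tn=50, fp=10, fn=5)
print(round(worked, 2))  # 0.91

# Hypothetical imbalanced test set: the model always predicts "negative"
lazy = accuracy(tp=0, tn=95, fp=0, fn=5)
print(lazy)  # 0.95
```

The second model never finds a single positive case, yet it still scores 0.95. That is why accuracy alone can mislead when one class heavily outnumbers the other.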

(iii) Error Rate :


- How often the model's predictions are wrong.


Error Rate = (FP + FN) / Total
                  = (10 + 5) / 165
                  = 0.09
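
A quick sketch of the error rate, which also shows that it is simply 1 minus the accuracy. The helper name `error_rate` is hypothetical, and TN = 50 and TP = 100 are taken from the accuracy example.

```python
def error_rate(tp, tn, fp, fn):
    """Error Rate = (FP + FN) / Total."""
    total = tp + tn + fp + fn
    return (fp + fn) / total if total else 0.0


err = error_rate(tp=100, tn=50, fp=10, fn=5)
acc = (100 + 50) / 165  # accuracy for the same counts

print(round(err, 2))                 # 0.09
print(abs(err + acc - 1.0) < 1e-12)  # True: error rate and accuracy sum to 1
```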

(iv) Precision :


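- The ability of the classification model to identify only the relevant data points, i.e. how many of the predicted positives are actually positive. Here Predicted Yes = TP + FP.


Precision = TP / Predicted Yes
                = 100 / 110
                = 0.91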
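
The precision arithmetic can be verified with a short sketch. The helper name `precision` is our own, and FP = 10 is inferred from Predicted Yes = 110 and TP = 100.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP), i.e. TP divided by all predicted positives."""
    predicted_yes = tp + fp
    # Guard against division by zero when the model predicts no positives
    return tp / predicted_yes if predicted_yes else 0.0


# Predicted Yes = 110 and TP = 100, so FP = 10
precision_value = precision(tp=100, fp=10)
print(round(precision_value, 2))  # 0.91
```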

Part 3 : Practical Code


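# Predict on the test set first (model, X_test & y_test are assumed to exist from your training code)
pred = model.predict(X_test)

# Confusion Matrix
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_test, pred))


# Let's print the classification report
from sklearn.metrics import classification_report
print(classification_report(y_test, pred))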
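
If you don't have a trained model at hand, the following sketch lets you try the same two functions on hand-built labels. It assumes scikit-learn is installed. The lists `y_test` and `pred` are made up so that they reproduce the counts from Part 2 (TP = 100, FN = 5, FP = 10, TN = 50), and they are not the output of a real model.

```python
from sklearn.metrics import classification_report, confusion_matrix

# Hand-built labels (1 = yes, 0 = no): 105 actual yes, 60 actual no
y_test = [1] * 105 + [0] * 60
# 100 TP, 5 FN, then 10 FP, 50 TN
pred = [1] * 100 + [0] * 5 + [1] * 10 + [0] * 50

# Rows are actual classes, columns are predicted classes, labels sorted as [0, 1]
cm = confusion_matrix(y_test, pred)
print(cm)

# For a binary problem, ravel() flattens the matrix in the order TN, FP, FN, TP
tn, fp, fn, tp = cm.ravel()
print(tn, fp, fn, tp)

# Text report with precision, recall, f1-score and support per class
print(classification_report(y_test, pred))

# The same report as a dictionary, handy for reading single values
report = classification_report(y_test, pred, output_dict=True)
print(round(report["1"]["recall"], 2), round(report["1"]["precision"], 2))
```

Note that scikit-learn puts TN in the top-left corner and TP in the bottom-right, which may differ from the layout used in some tutorials.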


Hey guys, we tried to keep the math & complexity to a minimum. If you liked our content, please share it with your friends & comment below if you have any doubts.
