InfinityCodeX


Check out our blogs, where we cover topics such as Python, Data Science, Machine Learning, and Deep Learning. The best place for beginners and intermediate learners to start an AI career.

Coronavirus Detection using Machine Learning. Data Scientist who will save the world


Disclaimer: This blog is completely hypothetical & our only purpose is to educate people who are interested in Data Science & Machine Learning.


Welcome Knights, you have arrived at the head-quarter of InfinityCodeX.

This organization is here to unleash the true power of your genius…

Welcome aboard… We need your power to save our planet from the chaos named “CORONA”, also known as “COVID-19”.

By the way, have I introduced myself yet? I am the founder & your Captain for today's mission, InfinityX. You have been judged worthy of an elite army of Super-Heroes.

That is enough introduction… We have no time to spare… Hurry & amplify your skills to match the destructive catastrophe that we are facing.

These are the portals that will lead you to the amplification of your powers. Remember, the key to true power is complete knowledge of both portals. We will assemble once you are ready.

Now, Go Knights

PORTALS :




Oh! You reported back really quickly… Excellent, Knights. It seems you are ready to join the battlefield. Before stepping into the war zone, let’s have a look at the history of our foe.



This is “CORONAVIRUS”… A formidable opponent that attacked our planet in Wuhan City, China, a few months ago, precisely on December 1st, 2019.





 And this virus is spreading surprisingly fast…




According to our intel, the source of this virus is believed to be a bat. Bats are mammals of the order Chiroptera; with their forelimbs adapted as wings, they are the only mammals capable of true and sustained flight. Bats are more manoeuvrable than birds, flying with their very long spread-out digits covered with a thin membrane or patagium.



Our intel unit received word from the WHO (World Health Organization)… The message was devastating… 15000+ deaths… Yes, you read that right: more than 15000 casualties in this short amount of time.




On 30th January 2020, the WHO (World Health Organization) declared a Red Alert, i.e. a Public Health Emergency of International Concern.



Shocked? I think you have understood how devastating this virus is… It’s a critical situation… But there is still hope for this world… Our Government, Scientists & Doctors are already on the battlefield & now the time is right to show our enemy the power of the Data Scientists of InfinityCodeX. Now the fate of our home planet lies in your hands.

Here is the dataset provided by elite scientists from around the world:


Download the dataset onto your system. Now listen: the first mission of the Knights, named Kill-CORONA, is divided into 2 parts:

1.) Identifying the factors which are causing death, using the power of PORTAL-α.

2.) Generating a classification model, using the power of PORTAL-Ω.

Now come on, Knights. Do or die, failure is not an option…



Listen, Knights, now I will give you intel about how to execute our mission. So listen carefully. Remember, you hold countless lives in your hands.

Attention over here. Before diving into the mission’s details, I hope everyone is clear on the knowledge of the portals.

Okay then……….

Knights mission 1 part 1 :

Step 1.1 :

At this stage we have to equip & load all our weapons, i.e. we have to pre-install all the libraries, because there will be no time for you to reload your weapons while facing enemies.

Step 1.2 :

Load the dataset given by the scientists.

Step 1.3 :


DATA WRANGLING… This is the most important process & remember, it is not a joke: if you screw up anything here, then it’s over. Then surely we are all gonna die.

This step is further divided into 3 parts:

1.) Create dummy variables for the important columns
2.) Concatenate them with our data set
3.) Then drop the original column, plus the last of the new dummy columns, to avoid the dummy variable trap

If anyone is wetting their pants, you can leave. There is no room for scaredy-cats.

Step 1.4 :

Step 1.4 includes multiple steps:

1.) Separating the independent & dependent variables.
2.) Calling LinearRegression & creating a model.
3.) Training that model.
4.) Finding the coefficients & intercept.
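
Before the real battle, here is a minimal sketch of those four steps on made-up numbers. The arrays `X_toy` and `y_toy` and the name `toy_model` are assumptions for illustration only, not part of the mission dataset. Because the toy target is built as y = 2*x1 + 3*x2 + 1 with no noise, the fitted coefficients and intercept should come back very close to those values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# 1.) Independent variables (two toy features) & dependent variable.
#     Hypothetical data: y = 2*x1 + 3*x2 + 1, with no noise.
X_toy = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]])
y_toy = 2 * X_toy[:, 0] + 3 * X_toy[:, 1] + 1

# 2.) Create the model.
toy_model = LinearRegression()

# 3.) Train it.
toy_model.fit(X_toy, y_toy)

# 4.) Read off the coefficients (one per column of X_toy) & the intercept.
print(toy_model.coef_)
print(toy_model.intercept_)
```

On the real dataset the same attributes, `coef_` and `intercept_`, hold one coefficient per column of X and a single constant term.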

After this tough battle we will reach the enemy's headquarters.

From there, Knights, your mission 1 part 2 will begin:

Step 2.1 :

After entering the enemy's headquarters, we will analyze our data.

Step 2.2 :

Then we will train our model while climbing to the top floor of their headquarters & test our model on their system. When you reach the gate of their system control room, call me at once.
That’s an order.

Step 2.3 :

Then we will check how accurate our model is on their system.
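
Steps 2.2 and 2.3 can be sketched roughly as follows. This is only an illustration on synthetic data from scikit-learn's `make_classification`; the names `X_demo`, `y_demo`, `clf` and `demo_acc`, the `random_state` values and `max_iter=1000` are all assumptions, not the settings used later in this post. The idea it shows is: hold back part of the data, train only on the rest, then measure accuracy on the part the model has never seen.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical); the mission itself uses the Corona dataset.
X_demo, y_demo = make_classification(n_samples=500, n_features=8, random_state=0)

# Hold back 20% of the rows as the "enemy system" we test on.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Train on the training rows only.
clf = LogisticRegression(max_iter=1000)
clf.fit(X_tr, y_tr)

# Check accuracy on the held-back rows.
demo_acc = accuracy_score(y_te, clf.predict(X_te))
print(demo_acc)
```

Fitting on the training rows only is what keeps the test score an honest estimate of how the model behaves on new cases.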

Now Go…Go…Go…Go…


Data Scientist Who will Save The World | Coding begins



 Knights mission 1 part 1 :   ( Practicals )

MULTI-VARIABLE LINEAR REGRESSION




# Import the libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline 


# Load the data

c_df=pd.read_csv(r"E:\New folder\Corona_unofficial.csv")

c_df.head() 


output :




#showing all the information

c_df.info()

output :




# Counting the number of null values in each column

c_df.isnull().sum()

output :

Data Wrangling

A column of strings can't be used by the model directly. If that column is important, we first have to create dummy variables from it with the get_dummies method.

We will do 3 steps to use that data to our advantage:

1.) Create dummy variables for the important columns
2.) Concatenate them with our data set
3.) Then drop the original column, plus the last of the new dummy columns, to avoid the dummy variable trap
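
To see the three steps in isolation, here is a small sketch on a made-up four-row table. The `toy` DataFrame and its values are hypothetical; only `pd.get_dummies`, `pd.concat` and `DataFrame.drop` are the same calls used on the real data.

```python
import pandas as pd

# Hypothetical toy table with one string column.
toy = pd.DataFrame({
    "age": [34, 61, 45, 29],
    "gender": ["male", "female", "male", "female"],
})

# 1.) Create dummy variables: one column per category (female, male).
dummies = pd.get_dummies(toy["gender"])

# 2.) Concatenate them with the original table.
toy = pd.concat([toy, dummies], axis=1)

# 3.) Drop the original string column and the last dummy column.
#     "female" alone already tells us everything "male" would.
toy = toy.drop(["gender", "male"], axis=1)

print(toy.columns.tolist())  # ['age', 'female']
```

pandas can also drop a dummy column for you via `pd.get_dummies(..., drop_first=True)`, which removes the first category instead of the last.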


# 1.) Create a dummy variable of important columns

dummies_loc=pd.get_dummies(c_df.location)
dummies_loc
output :


# 2.) Concatenate that with our data set

corona_data=pd.concat([c_df,dummies_loc],axis=1)
corona_data
output :



# 3.) Then drop that column from which we created dummy variables & the last column of that dummy variable columns to avoid dummy variable trap

corona_data=corona_data.drop(['location','Zhuhai'],axis=1)
corona_data
output :



# 1.) Create a dummy variable of important columns

dummies_country=pd.get_dummies(c_df.country)
dummies_country
output :



# 2.) Concatenate that with our data set

corona_data=pd.concat([corona_data,dummies_country],axis=1)
corona_data
output :



# 3.) Then drop that column from which we created dummy variables & the last column of that dummy variable columns to avoid dummy variable trap

corona_data=corona_data.drop(['country','Vietnam'],axis=1)
corona_data
output :



# 1.) Create a dummy variable of important columns

dummies_gender=pd.get_dummies(c_df.gender)
dummies_gender
output :



# 2.) Concatenate that with our data set

corona_data=pd.concat([corona_data,dummies_gender],axis=1)
corona_data
output :



# 3.) Then drop that column from which we created dummy variables & the last column of that dummy variable columns to avoid dummy variable trap

corona_data=corona_data.drop(['gender','Not-specify'],axis=1)
corona_data
output :



# Replacing index with reporting date

corona_data=corona_data.set_index("reporting date")
corona_data
output :



# Deleting unwanted columns

corona_data=corona_data.drop(['hosp_visit_date','visiting Wuhan','from Wuhan'],axis=1)
corona_data
output :



# Separating Independent & Dependent variable

X=corona_data.drop(['death'],axis=1)
y=corona_data['death']


# Calling LinearRegression & creating a model

from sklearn.linear_model import LinearRegression
model1=LinearRegression()

#Training that model

model1.fit(X,y)
output :
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

# These are the m values in y=m*x+c, i.e. the coefficients
model1.coef_
output :
array([-1.78353495e-04,  3.27590733e-04, -6.16209630e-04,  1.86854419e-03,
        6.96720629e-03,  1.39875741e-02,  6.02480354e-02,  8.63267754e-03,
        7.62115033e-03, -4.95165190e-03,  2.91500816e-04,  8.25701819e-03,
        2.03387422e-02, -1.45560264e-02,  5.45053491e+09,  7.48224240e+08,
       -1.93189826e+08, -6.85290102e+07,  7.63533688e+08, -6.82565144e+08,
        7.63533688e+08, -1.87495053e+09,  4.73181305e+08,  1.09167456e+09,
        4.15652371e+09, -6.82565143e+08,  1.09167456e+09, -1.20911670e+08,
        7.64621070e+08, -8.77115118e+08,  7.63533688e+08,  7.63533688e+08,
        7.63533687e+08,  4.73181304e+08, -6.82565143e+08, -6.82565143e+08,
       -6.82565144e+08,  7.48224240e+08, -1.20911669e+08,  2.10716157e+08,
        7.63533688e+08, -1.28024309e+08, -4.06393385e-03,  7.63533688e+08,
        1.09167456e+09, -1.20911670e+08,  7.48224240e+08, -1.20911670e+08,
        7.48224240e+08, -1.20911670e+08, -1.87495053e+09, -1.20911670e+08,
        1.09167455e+09,  7.48224240e+08,  2.20813798e-01, -1.20911670e+08,
       -1.20911669e+08, -1.20911670e+08, -1.20911670e+08,  1.09167456e+09,
        1.83057707e-01,  7.48224240e+08, -4.15330976e-01, -1.20911670e+08,
       -1.20911670e+08,  4.73181304e+08, -1.20911669e+08, -3.63477526e+08,
        7.48224240e+08,  5.66215434e+08,  7.48224240e+08, -1.20911669e+08,
       -1.20911670e+08, -1.20911670e+08, -5.93570509e+08, -1.65880556e+08,
        7.48224240e+08,  2.48698905e+08, -1.93189825e+08, -4.73236592e-01,
        7.48224241e+08,  7.48224240e+08,  2.24578712e+09, -4.93237954e-01,
       -5.44389716e-01,  7.48224240e+08, -5.93570508e+08, -2.09357213e+08,
       -5.49547717e+08, -1.20911670e+08,  7.63533688e+08, -7.89237140e+08,
        7.63533688e+08, -1.20911670e+08, -6.82565143e+08, -5.93570508e+08,
       -6.82565144e+08,  4.64971302e+08,  4.73181305e+08,  7.48224241e+08,
        7.63533688e+08, -8.39851439e+08,  7.48224240e+08,  7.48224240e+08,
        7.63533688e+08,  7.48224240e+08, -9.15541108e-01,  7.63533687e+08,
       -1.20911670e+08, -7.89237140e+08,  1.09167456e+09,  7.48224240e+08,
        7.48224240e+08,  7.63533688e+08,  4.64971303e+08,  4.62723539e+07,
       -1.87495053e+09, -8.39851439e+08,  1.09167456e+09, -2.16213728e+09,
        7.48224240e+08,  7.63533688e+08,  7.48224240e+08,  7.48224240e+08,
       -4.63690059e+08, -1.20911670e+08, -1.20911670e+08, -1.20911670e+08,
       -1.20911670e+08, -1.20911669e+08, -1.20911670e+08, -1.20911670e+08,
       -2.06237650e+08, -8.39851438e+08, -4.63690059e+08, -2.75039763e+08,
        7.63533687e+08, -5.41276745e+09, -1.87495053e+09, -6.82565143e+08,
        4.73181304e+08,  3.95286999e+09,  7.17725877e-01, -1.20911670e+08,
        7.48224240e+08,  1.45133633e+09, -9.70032866e-01,  1.09167456e+09,
        7.08586818e+08, -4.12751959e+09, -7.89237141e+08, -6.82565144e+08,
        1.45133633e+09, -8.39851439e+08,  5.35473978e-01,  7.48224240e+08,
       -7.89237140e+08,  5.81083871e-02,  4.73181304e+08,  4.73181305e+08,
       -1.20911670e+08, -1.20911670e+08, -1.05418910e+00, -7.89237140e+08,
       -1.20911670e+08,  7.08586818e+08, -6.82565143e+08, -1.20911669e+08,
       -5.45053491e+09,  6.85290110e+07,  8.39851439e+08,  3.63477527e+08,
       -4.15652371e+09, -7.64621069e+08, -4.62723532e+07, -1.45133633e+09,
        1.20911670e+08, -2.10716157e+08,  1.28024309e+08,  2.09357214e+08,
       -7.63533688e+08, -1.09167456e+09,  6.76132770e-01,  1.93189826e+08,
        1.87495053e+09, -5.66215434e+08,  2.16213728e+09, -7.48224240e+08,
       -2.24578712e+09,  5.49547718e+08,  5.93570509e+08, -2.48698905e+08,
       -4.64971303e+08, -7.08586818e+08,  2.06237650e+08,  4.63690059e+08,
        6.82565144e+08,  2.75039764e+08,  1.65880556e+08,  8.77115118e+08,
        5.41276745e+09, -3.95286999e+09,  4.12751959e+09,  7.89237141e+08,
       -4.73181304e+08,  4.35575854e-02, -3.45127022e-03])

# This is c in y=m*x+c, i.e. the intercept
model1.intercept_
output :
0.4251844306557958
By plugging in the values of x, i.e. the independent variables, we get an output that tells us whether the person will die or not (0 = Not-Die, 1 = Die). There are a lot of values to plug in, so I am skipping that step.
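
For the curious, here is a rough sketch of what that skipped step looks like. The data below is made up, and both the names (`X_small`, `y_small`, `reg`) and the 0.5 cut-off for turning a regression output into 0 or 1 are assumptions for illustration. It shows that plugging x into y = m*x + c by hand gives the same numbers as calling `predict`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: two features and a 0/1 target.
X_small = np.array(
    [[20, 0], [35, 0], [50, 1], [65, 1], [80, 1], [30, 0]], dtype=float
)
y_small = np.array([0, 0, 0, 1, 1, 0])

reg = LinearRegression().fit(X_small, y_small)

# Plugging x into y = m*x + c by hand...
manual = X_small @ reg.coef_ + reg.intercept_

# ...matches what predict() does for us.
auto = reg.predict(X_small)

# A linear regression outputs a continuous number, so a cut-off
# (assumed here to be 0.5) is needed to turn it into 0 or 1.
labels = (auto >= 0.5).astype(int)
print(labels)
```

Needing a cut-off like this is one reason a classifier such as logistic regression is the more natural fit for a 0/1 target.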
LOGISTIC REGRESSION
corona_data.info()
output :
<class 'pandas.core.frame.DataFrame'>
Index: 1085 entries, 20-01-2020 to 25-02-2020
Columns: 208 entries, Unnamed: 0 to male
dtypes: int64(15), uint8(193)
memory usage: 340.1+ KB

Analyzing our Data
# Comparing number of deaths 0 = Not-dead & 1 = Dead
plt.figure(figsize=(8,7))
sns.countplot(data=corona_data,x="death")
output :
plt.figure(figsize=(7,7))
sns.countplot(x="death",hue="male",data=corona_data)
output :
plt.figure(figsize=(7,7))
sns.countplot(x="death",hue="female",data=corona_data)
output :
Test & Train our model
# Splitting our data into training & test sets
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(X,y,test_size=0.2)
#creating a LogisticRegression model & training it with fit
from sklearn.linear_model import LogisticRegression
model=LogisticRegression()
# Note: this fits on the full data (X, y), so X_test is not unseen data for this model
model.fit(X,y)
output :
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=None, solver='warn', tol=0.0001, verbose=0,
                   warm_start=False)
#model Score
model.score(X,y)
output :
0.647926267281106
# What our model predicted
pred=model.predict(X_test)
pred
output :
array([0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 0,
0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 1,
1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1,
0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1,
1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0,
1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1,
1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1,
0, 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1,
1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1,
0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1], dtype=int64)
from sklearn.metrics import classification_report
val=classification_report(y_test,pred)
val
output :


# Value counts of our test set
y_test.value_counts()
output :
1    119
0     98
Name: death, dtype: int64
# This is confusion matrix. I'll explain this in our next blog post
from sklearn import metrics
print(metrics.confusion_matrix(y_test,pred))
output :
[[63 35]
[35 84]]
print(metrics.recall_score(y_test,pred))
output :
0.7058823529411765
Check Accuracy
# Testing Accuracy of our model
from sklearn.metrics import accuracy_score
val2=accuracy_score(y_test,pred)
val2
output :
0.6774193548387096
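
As a quick sanity check, the recall and accuracy above can be recomputed by hand from the confusion matrix printed earlier. The sketch below only assumes scikit-learn's usual layout for a binary confusion matrix (rows are the true classes, columns are the predicted classes, so the four cells are TN, FP, FN, TP); the variable names are my own.

```python
import numpy as np

# The confusion matrix printed above: rows = true class, columns = predicted class.
cm = np.array([[63, 35],
               [35, 84]])

# Flatten into true negatives, false positives, false negatives, true positives.
tn, fp, fn, tp = cm.ravel()

# Accuracy: share of all predictions that were correct.
accuracy = (tp + tn) / cm.sum()

# Recall: share of the actual deaths (class 1) that the model caught.
recall = tp / (tp + fn)

print(accuracy)  # 147 / 217
print(recall)    # 84 / 119
```

Both numbers match the outputs of `accuracy_score` and `recall_score` shown above.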
Hey guys... Here is the link to my code:
https://github.com/Vegadhardik7/Git_Prac_Repo/blob/master/CORONA_PROJECT_FINAL.ipynb
Hey guys! Did you enjoy our content? Can we improve? Tell us in the comment section & please share this with your friends & family.
