# Logistic Regression Model Training

## Logistic Regression Training

In this module, we will train a logistic regression model using the scikit-learn library. We will use a sample dataset called `pima_indian_diabetes.csv` to run our model.

#### Load the diabetes dataset.

```
import pandas as pd
import numpy as np

# Load the dataset into a DataFrame
pima = pd.read_csv('pima_indian_diabetes.csv')
```

#### Verify dataset.

```
pima.head()
```

Output:

```
No_Times_Pregnant  Plasma_Glucose  Diastolic_BP  Triceps  Insulin   BMI  Age  Diabetes
                1              89            66       23       94  28.1   21         0
                0             137            40       35      168  43.1   33         1
                3              78            50       32       88  31.0   26         1
                2             197            70       45      543  30.5   53         1
                1             189            60       23      846  30.1   59         1
```

#### Normalizing continuous features:

Normalization (here, z-score standardization) rescales each continuous feature to mean 0 and standard deviation 1, so that no single feature dominates the model simply because of its units.

```
# Standardize the continuous features, then swap them back into the DataFrame
df = pima[['No_Times_Pregnant', 'Plasma_Glucose', 'Diastolic_BP', 'Triceps', 'Insulin', 'BMI', 'Age']]
normalized_df = (df - df.mean()) / df.std()
pima = pima.drop(['No_Times_Pregnant', 'Plasma_Glucose', 'Diastolic_BP', 'Triceps', 'Insulin', 'BMI', 'Age'], axis=1)
pima = pd.concat([pima, normalized_df], axis=1)
```

Output:

```
Diabetes  No_Times_Pregnant  Plasma_Glucose  Diastolic_BP   Triceps   Insulin       BMI       Age
       0          -0.716511       -1.089653     -0.373178 -0.584363 -0.522175 -0.709514 -0.967063
       1          -1.027899        0.465719     -2.453828  0.556709  0.100502  1.424909  0.209318
       1          -0.093734       -1.446093     -1.653578  0.271441 -0.572662 -0.296859 -0.476904
       1          -0.405123        2.409934     -0.053078  1.507603  3.255961 -0.368007  2.169953
       1          -0.716511        2.150705     -0.853328 -0.584363  5.805571 -0.424924  2.758143
```
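The same z-score formula can be sketched in plain Python. Here it is applied to the Age values of the five rows shown earlier; the results differ from the output above because pandas computes the mean and standard deviation over the full dataset, not just these five rows.

```python
import math

def z_score(values):
    """Standardize a list of numbers: subtract the mean, divide by the sample std."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (ddof=1), matching pandas' df.std()
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / std for v in values]

ages = [21, 33, 26, 53, 59]
normalized = z_score(ages)
print([round(v, 2) for v in normalized])  # [-1.04, -0.32, -0.74, 0.87, 1.23]
```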

#### Split the data into train and test data.

```
from sklearn.model_selection import train_test_split

# Put the feature variables into X
X = pima.drop(['Diabetes'], axis=1)

# Put the response variable into y
y = pima['Diabetes']

# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3, random_state=100)
```
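Under the hood, `train_test_split` shuffles the rows and then cuts them at the requested proportion. A minimal sketch of that idea using only the standard library (not scikit-learn's actual implementation):

```python
import random

def split_70_30(rows, seed=100):
    """Shuffle indices with a fixed seed, then cut at 70% train / 30% test."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)   # fixed seed, like random_state=100 above
    cut = int(len(rows) * 0.7)
    train = [rows[i] for i in idx[:cut]]
    test = [rows[i] for i in idx[cut:]]
    return train, test

data = list(range(100))
train, test = split_70_30(data)
print(len(train), len(test))  # 70 30
```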

#### Train the model.

Using the scikit-learn library:

```
from sklearn.linear_model import LogisticRegression

# Fit a logistic regression model on the training data
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict labels for the test data
y_pred = model.predict(X_test)
```
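For intuition about what `predict` returns: a fitted logistic regression scores each row with the sigmoid of a weighted sum of the features, then thresholds the probability at 0.5. Here is a sketch with made-up weights (in the fitted model, the learned values live in `model.coef_` and `model.intercept_`):

```python
import math

def predict_proba(weights, bias, x):
    """Logistic model: P(y=1 | x) = sigmoid(w . x + b)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only
w = [0.4, 1.1, -0.2]
b = -0.5

p = predict_proba(w, b, [0.5, 1.0, -0.3])  # one (already normalized) row
label = 1 if p >= 0.5 else 0
print(round(p, 2), label)  # 0.7 1
```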

Using the statsmodels library:

```
import statsmodels.api as sm

# A GLM with a binomial family is equivalent to logistic regression
logm1 = sm.GLM(y_train, sm.add_constant(X_train), family=sm.families.Binomial())
logm1.fit().summary()
```

#### Validate the model:

The example below uses the confusion matrix metric to validate the model.

```
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)
```

Output:

```
[[68 12]
 [16 22]]
```
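The entries can be read off by hand using scikit-learn's convention: rows are actual classes, columns are predicted classes, so TN=68, FP=12, FN=16, TP=22. From those counts the headline metrics follow directly:

```python
# Counts read off the confusion matrix printed above
tn, fp, fn, tp = 68, 12, 16, 22

accuracy = (tp + tn) / (tp + tn + fp + fn)
sensitivity = tp / (tp + fn)   # true positive rate (recall)
specificity = tn / (tn + fp)   # true negative rate

print(round(accuracy, 3), round(sensitivity, 3), round(specificity, 3))  # 0.763 0.579 0.85
```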

#### Conclusion

In this blog, you learned how to train a basic logistic regression model. In the next blog, you will learn more about the confusion matrix, sensitivity, and specificity.
