LOGISTIC REGRESSION
Goal
- The goal is to output the probability of Y
- Based on this probability, the input can be classified into different categories
- The logistic regression model converts the weighted sum of the inputs, Σ wᵢxᵢ, into a value between 0 and 1 using the sigmoid function
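For example (with illustrative numbers): if the weights are (0.5, −0.25) and the inputs are (2, 4), the weighted sum is 0.5·2 + (−0.25)·4 = 0, and the sigmoid of 0 is 0.5, i.e. a 50% probability of the positive class.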
Types of classification in logistic regression
- Binary (Pass, Fail)
- Multinomial (Pizza, Spaghetti, Ravioli)
- Ordinal (Low, Medium, High)
Illustration of the network
How to code a Logistic Regression function
Sigmoid
- Relevancy
  - Sigmoid is the activation function for a logistic regression algorithm and helps to define this regression
  - An activation function is a mathematical gate between the input and output. For example, the step function from linear regression is an activation function.
  - Sigmoid outputs a number between 0 and 1
- Equation
  - σ(z) = 1 / (1 + e^(−z)), where z = Σ wᵢxᵢ is the weighted sum of the inputs
- Visualization
  - Graphing σ(z) on the y axis against z on the x axis gives an S-shaped curve whose outputs lie between 0 and 1 (original graph not reproduced). An output between 0 and 1 can be read as a probability, which can then be used to classify.
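As a minimal sketch in Python with NumPy (the function name is illustrative), the sigmoid above could be written as:

    import numpy as np

    def sigmoid(z):
        # Squash any real-valued weighted sum into the interval (0, 1)
        return 1.0 / (1.0 + np.exp(-z))

Calling sigmoid(0) returns 0.5, the midpoint between the two classes.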
Cost Function
- Concept
  - The cost function is also an important characteristic of logistic regression: the closer the prediction is to the correct label, the lower the cost
  - Instead of the mean squared error from linear regression, logistic regression uses an equation known as cross entropy, or log loss, as its loss function
- Equation
  - cost(σ, y) = −[ y·log(σ) + (1 − y)·log(1 − σ) ]
  - Averaged over all m training examples this gives J(w) = −(1/m)·Σᵢ [ yᵢ·log(σᵢ) + (1 − yᵢ)·log(1 − σᵢ) ], where σᵢ is the predicted probability for example i
- Visualization
  - Plotting the cost against the predicted probability shows −log(σ) when y = 1 and −log(1 − σ) when y = 0: the cost approaches 0 near the correct prediction and rises steeply near the wrong one (original graph not reproduced)
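A hedged sketch of this loss in the same Python style, reusing sigmoid from above (the epsilon guard is an added assumption to keep log(0) out of the computation):

    def cross_entropy_cost(w, X, y):
        preds = sigmoid(X @ w)       # predicted probabilities for every row of X
        eps = 1e-12                  # illustrative guard against log(0)
        # Average log loss over all m examples
        return -np.mean(y * np.log(preds + eps) + (1 - y) * np.log(1 - preds + eps))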
Gradient Descent
- Concept
  - Gradient descent is the algorithm used to find the weights with the least cost by repeatedly moving in the direction of steepest descent
  - The concept is quite similar to linear regression (finding where the tangent is 0), just with different steps and equations
- Equation
  - For gradient descent we need to calculate the partial derivative of the cost function, just as in linear regression. This follows the steps of multivariable calculus; since the derivation is long and strenuous, we display only the result: ∂C/∂w = (σ − y)·x for a single example, or (1/m)·Xᵀ·(σ − y) averaged over the training set
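Continuing the same Python sketch, that result could be computed as follows (this assumes X has one row per training example):

    def gradient(w, X, y):
        # Average gradient of the cross-entropy cost: (1/m) * X^T (sigmoid(Xw) - y)
        return X.T @ (sigmoid(X @ w) - y) / len(y)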
- Pseudocode
  - The steps for each iteration follow:
    - Calculate the average gradient
    - Multiply it by the learning rate
    - Subtract the result from the weights
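A minimal sketch of those three steps, reusing the helpers above (the learning rate and iteration count are placeholder choices):

    def gradient_descent(X, y, lr=0.1, iters=1000):
        w = np.zeros(X.shape[1])                   # start from all-zero weights
        costs = []
        for _ in range(iters):
            costs.append(cross_entropy_cost(w, X, y))
            w -= lr * gradient(w, X, y)            # step opposite the average gradient
        return w, costs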
Putting it all together, the steps to code the full model are:
1. Import any needed libraries
2. Write the cost function using the cross-entropy equation listed above
3. Write the gradient function using the same code written in linear regression (as both aim to find the minimum of the cost function)
4. Clean the data as necessary (with one-hot encoding or masking), using a separate method
5. Make a create-data method, where you import the data, call the string-cleaning method on your data, create a bias column, and stack it on. Have a create-test-data method where a similar process is followed, except you predict the output of the test data (and categorize it if needed).
6. Create your X and Y from the training data, a learning rate, a graph of cost over the number of iterations, and an array of weights
7. Train the model by calling the cost function and then the gradient function, iterating and adjusting the weights until a minimum cost is reached
8. Print any necessary variables, such as the initial cost, learning rate, ending cost, weights, scaled X, scaled Y, and the first value in the gradient list
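As a hedged end-to-end sketch of steps 1 through 8, reusing the functions above; the file name, delimiter, and column layout are hypothetical, and real data would still need the cleaning described in step 4:

    import numpy as np

    def create_data(path):
        # Step 5: import the data and stack a bias column onto X
        raw = np.genfromtxt(path, delimiter=",", skip_header=1)  # hypothetical CSV layout
        X, y = raw[:, :-1], raw[:, -1]       # assumes the label is the last column
        bias = np.ones((X.shape[0], 1))
        return np.hstack([bias, X]), y

    X, y = create_data("train.csv")                          # hypothetical file name
    w, costs = gradient_descent(X, y, lr=0.1, iters=1000)    # steps 6-7: train
    print("initial cost:", costs[0])                         # step 8: report results
    print("final cost:", costs[-1])
    print("weights:", w)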