Regression

Concept of Regression in Machine Learning

Basic concepts and mathematics

  • The input (predictor) variables are the variables that help predict the value of the output variable. They are commonly referred to as X.
  • The output variable is the variable that we want to predict. It is commonly referred to as Y.
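Actual LR equation: $Y = \beta_0 + \beta_1 X + \varepsilon$
Fitted model: $\hat{y} = b_0 + b_1 x$
Estimated parameter (slope): $b_1 = \dfrac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$
Estimated parameter (intercept): $b_0 = \bar{y} - b_1 \bar{x}$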
  1. Simple linear regression is used to predict future values of continuous data.
  2. The data should show a linear relationship between a single independent variable and a single dependent variable.
[Figure: Fundamental process of a machine learning model]
# Import necessary libraries
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
# Generate 'random' data (fixed seed so the results are reproducible)
np.random.seed(0)
# Array of 100 values with mean = 1.5, stddev = 2.5
x = 2.5 * np.random.randn(100) + 1.5
# Generate 100 residual terms
res = 0.5 * np.random.randn(100)
# Actual values of Y (true intercept = 2, true slope = 0.3)
Y = 2 + 0.3 * x + res
# Create pandas dataframe to store our x and Y values
df = pd.DataFrame({'x': x, 'Y': Y})
# Show the first five rows of our dataframe
df.head()
[Figure: Random data generated by the above code]
# Calculate the mean of x and Y
xmean = np.mean(x)
ymean = np.mean(Y)
# Calculate the terms needed for the numerator and denominator of b1
df['xycov'] = (df['x'] - xmean) * (df['Y'] - ymean)
df['xvar'] = (df['x'] - xmean)**2
# Calculate b1 (slope) and b0 (intercept)
b1 = df['xycov'].sum() / df['xvar'].sum()
b0 = ymean - (b1 * xmean)
print(f'b0 = {b0}')
print(f'b1 = {b1}')
[Figure: Estimated parameters b0 and b1]
# Predicted values from the fitted line
ypred = b0 + b1 * x
# Plot regression against actual data
plt.figure(figsize=(12, 6))
plt.plot(x, ypred) # regression line
plt.plot(x, Y, 'ro') # scatter plot showing actual data
plt.title('Actual vs Predicted')
plt.xlabel('x')
plt.ylabel('Y')
plt.show()
[Figure: Actual vs predicted plot]
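As a quick sanity check on the manual calculation above, the same slope and intercept can be compared against NumPy's built-in least-squares polynomial fit. This is only a minimal sketch: it regenerates the same synthetic data (same seed and same order of random draws) so it can run on its own, and the names `slope` and `intercept` are introduced here for illustration.

```python
import numpy as np

# Recreate the synthetic data from the walkthrough (same seed, same draw order)
np.random.seed(0)
x = 2.5 * np.random.randn(100) + 1.5
res = 0.5 * np.random.randn(100)
Y = 2 + 0.3 * x + res

# Closed-form least-squares estimates, as computed step by step above
b1 = np.sum((x - x.mean()) * (Y - Y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = Y.mean() - b1 * x.mean()

# np.polyfit with deg=1 returns the coefficients as [slope, intercept]
slope, intercept = np.polyfit(x, Y, deg=1)

print(f'manual : b0 = {b0}, b1 = {b1}')
print(f'polyfit: b0 = {intercept}, b1 = {slope}')
```

Both approaches minimise the same sum of squared residuals, so the two pairs of numbers should agree up to floating-point rounding, and both should land close to the true values of 2 and 0.3 used to generate the data.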
  1. Multiple linear regression is also used to predict future values of continuous data.
  2. The data should be linear, with multiple independent variables and a single dependent variable.
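To make the idea of several independent variables concrete, here is a small sketch that fits a multiple linear regression with ordinary least squares. The data, the true coefficients (1.0, 2.0, -0.5), and the variable names are all made up for illustration; `np.linalg.lstsq` is just one of several ways to solve the problem.

```python
import numpy as np

# Hypothetical synthetic data: two independent variables, one dependent variable
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
noise = 0.1 * rng.normal(size=n)
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + noise

# Add a column of ones so the first coefficient acts as the intercept
A = np.column_stack([np.ones(n), X])

# Least-squares solution: coef = [intercept, weight for X[:, 0], weight for X[:, 1]]
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
print(coef)
```

With this little noise, the estimated coefficients should come out close to the values used to generate the data.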
  1. Polynomial regression is used to predict future values of continuous, non-linear data.
  2. The data should be non-linear, with one or more independent variables and a single dependent variable.
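A curved relationship can still be fitted with least squares by using powers of x as extra features. The sketch below assumes a single independent variable and a quadratic trend; the data and the coefficients (2.0, 0.5, 1.0) are invented for the example.

```python
import numpy as np

# Hypothetical synthetic data following a quadratic curve plus noise
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 100)
y = 1.0 + 0.5 * x + 2.0 * x**2 + 0.2 * rng.normal(size=x.size)

# Fit a degree-2 polynomial; coefficients come back highest power first: [c2, c1, c0]
coeffs = np.polyfit(x, y, deg=2)

# Evaluate the fitted polynomial at the original x values
y_fit = np.polyval(coeffs, x)
print(coeffs)
```

Note that the model is still linear in its coefficients; only the relationship between x and y is non-linear.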
  1. Multivariate regression is used to predict multiple response (dependent) variables with the help of multiple predictors.
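When there is more than one output to predict, the same least-squares machinery can fit all responses at once. The following is a rough sketch under assumed data: two predictors, two responses, and a made-up coefficient matrix `B_true` whose columns correspond to the two responses.

```python
import numpy as np

# Hypothetical synthetic data: two predictors, two response variables
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))

# Made-up true coefficients: row 0 = intercepts, rows 1-2 = predictor weights;
# each column belongs to one response variable
B_true = np.array([[1.0, -1.0],
                   [2.0,  0.5],
                   [0.0,  3.0]])

# Design matrix with an intercept column, and noisy responses of shape (n, 2)
A = np.column_stack([np.ones(n), X])
Y = A @ B_true + 0.1 * rng.normal(size=(n, 2))

# lstsq accepts a 2-D right-hand side and solves for every response column together
B_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(B_hat)
```

Each column of `B_hat` is the same answer you would get by fitting that response on its own, so this is mostly a convenient way to organise several regressions that share the same predictors.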

“Don’t dream to join a tech giant company, dream to make a company gigantic.” — Sai

A machine learning technology researcher starting a new blog series to make ML concepts clear, simple, and easy for everyone.

sai krishna