Getting Started with Machine Learning
''' One thing to keep in mind is that the basics are easy. Once you get
the concept, everything else becomes clear.
NOTE: Some math is required for the first example, and a background in Python programming is required to follow along.
Let's start with a simple model: Linear Regression.
All there is to it is a graph with a best fit line drawn through the data points.
Take this image as an example ( https://pythonguides.com/wp-content/uploads/2021/09/Matplotlib-best-fit-line.png )
Try looking at it while I explain.
You can see a red line and some blue points. The LR ( Linear Regression ) model
finds the slope and intercept of the line that best fits the points.
That line is what lets the model make predictions.
Let's say you give the LR model some input x. The model looks at the best fit line and
returns the y value on the line at that x as its answer. How accurate that answer is
depends on the dataset you feed the model, and it's measurable.
I'll write some Python code below just for a quick demo. '''
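# A tiny warm-up sketch of the best fit line idea, assuming five made-up points
# (toy_x and toy_y are illustrative, not from any real dataset).
# np.polyfit with deg=1 fits a straight line by least squares and returns
# its slope and intercept.
import numpy as np

toy_x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
toy_y = np.array([3.1, 4.9, 7.2, 8.8, 11.0])  # roughly y = 2x + 1, with some noise

toy_slope, toy_intercept = np.polyfit(toy_x, toy_y, deg=1)
print("slope:", toy_slope, "intercept:", toy_intercept)

# Predicting means reading the line at a new x: y = slope * x + intercept
toy_prediction = toy_slope * 6.0 + toy_intercept
print("predicted y for x = 6:", toy_prediction)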
# First, we need to import some libraries.
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
# Here we load the dataset from a CSV file
dataset = pd.read_csv("./house_prices.csv")
''' Let's say we have a dataset of house prices. We have area, floors and prices.
We want to predict the prices, so our X data will be the whole dataset excluding the prices column.
'''
# So we write
X = np.array(dataset.drop(columns=["prices"]))
# And y is the prices
y = dataset["prices"]
# Now we need to split the data and train the model
# Unpack the variables in the same order as I did, because train_test_split returns them in that order
# Here X and y are split into a training part and a testing part. test_size=0.1 keeps 10% of the rows
# for testing and leaves 90% for training (1 = 100%, 0.1 = 10%). Training gets the larger share
# so the model has more data to learn from.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.1 )
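# A quick check of what test_size does, sketched on made-up data (100 rows,
# 2 feature columns; the demo_ names are only for illustration).
# random_state is an optional argument that makes the shuffle repeatable.
import numpy as np
from sklearn.model_selection import train_test_split

demo_X = np.arange(200).reshape(100, 2)
demo_y = np.arange(100)
demo_X_train, demo_X_test, demo_y_train, demo_y_test = train_test_split(
    demo_X, demo_y, test_size=0.1, random_state=0
)
# With 100 rows and test_size=0.1, 90 rows go to training and 10 to testing
print(demo_X_train.shape, demo_X_test.shape)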
model = LinearRegression()
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(accuracy)
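# score() returns R^2 (the coefficient of determination), not a percentage of correct answers.
# It prints something like 0.942622623 when the line fits the dataset well, or 0.14663625 when it fits poorly.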
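# A self-contained sketch of using a trained model, assuming synthetic data in
# place of house_prices.csv. The pricing rule below (1000 per unit of area plus
# 5000 per floor, plus noise) is made up, and the fake_ names are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
fake_area = rng.uniform(50, 250, size=200)
fake_floors = rng.integers(1, 4, size=200)
fake_X = np.column_stack([fake_area, fake_floors])
fake_prices = 1000 * fake_area + 5000 * fake_floors + rng.normal(0, 2000, size=200)

fake_model = LinearRegression()
fake_model.fit(fake_X, fake_prices)

# predict() expects a 2D array: one row per house, columns in the training order
new_house = np.array([[120.0, 2]])
fake_prediction = fake_model.predict(new_house)
print(fake_prediction)

# score() on a regression model is R^2, the same value r2_score() computes
fake_score = fake_model.score(fake_X, fake_prices)
fake_r2 = r2_score(fake_prices, fake_model.predict(fake_X))
print(fake_score, fake_r2)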
''' Editor's note: If anything is unclear, contact me at: [email protected]
Please excuse any grammar mistakes; it's hard to write in Grepper's code box :)
Thanks
~ Fouad
'''