ML/AI Algorithms Coded from Scratch
Coding machine learning and artificial intelligence algorithms from scratch demonstrates a deep understanding of how the algorithms work. The algorithms listed below link to code in Barrett Duna's ML/AI GitHub repository.
These algorithms were written from scratch in Python 3 by Barrett Duna. The two main packages they use are pandas and NumPy. pandas is a data frame package for loading, describing, and manipulating data frames. NumPy is a numerical computing package that operates on arrays of any dimension.
- Perceptron
- This is the original single-layer artificial neural network. It is very simple: it applies a step function to the weighted inputs and updates the weights in proportion to the error between the predicted output and the actual output.
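The update rule described above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's code; the function names (`perceptron_fit`, `perceptron_predict`) and hyperparameter defaults are placeholders chosen here.

```python
import numpy as np

def perceptron_fit(X, y, lr=0.1, epochs=10):
    """Train a single-layer perceptron; labels y must be in {-1, 1}."""
    w = np.zeros(X.shape[1] + 1)  # weights, with w[0] as the bias term
    for _ in range(epochs):
        for xi, target in zip(X, y):
            # step function: output 1 if net input >= 0, else -1
            pred = 1 if xi @ w[1:] + w[0] >= 0 else -1
            update = lr * (target - pred)  # proportional to the error
            w[1:] += update * xi
            w[0] += update
    return w

def perceptron_predict(X, w):
    return np.where(X @ w[1:] + w[0] >= 0, 1, -1)
```

Because the weights only change when a point is misclassified, training stops moving once the data (if linearly separable) is fit exactly.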
- Adaptive Linear Neurons (Adaline)
- This is a later (in history) refinement of the simple perceptron. Instead of updating the weights from the thresholded output, Adaline computes the weight update from the continuous output of a linear activation; the step function is applied only when making the final prediction.
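The distinction can be made concrete with a short sketch: the error that drives the update comes from the linear (identity) activation, and thresholding happens only in prediction. This is an illustrative batch-gradient-descent version, not the repository's code, and the names and defaults are assumptions.

```python
import numpy as np

def adaline_fit(X, y, lr=0.01, epochs=500):
    """Adaline trained with batch gradient descent; labels y in {-1, 1}."""
    w = np.zeros(X.shape[1] + 1)  # w[0] is the bias
    for _ in range(epochs):
        net = X @ w[1:] + w[0]    # linear activation: no thresholding here
        errors = y - net          # continuous error drives the update
        w[1:] += lr * X.T @ errors
        w[0] += lr * errors.sum()
    return w

def adaline_predict(X, w):
    # the step function is applied only at prediction time
    return np.where(X @ w[1:] + w[0] >= 0, 1, -1)
```

Because the error is continuous, Adaline minimizes a smooth sum-of-squared-errors cost, which is what makes gradient descent applicable.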
- K Nearest Neighbors (KNN)
- This algorithm classifies new data points based on the K closest neighbors as measured by Euclidean distance. For a prediction, the distance between the new data point and every point in the dataset is computed, and the K closest points vote on the class.
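The distance-then-vote procedure can be sketched as below. This is a minimal single-point predictor, not the repository's implementation; the function name and signature are assumptions.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_new, k=3):
    """Classify x_new by majority vote among its k nearest training points."""
    # Euclidean distance from x_new to every training point
    dists = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))
    nearest = np.argsort(dists)[:k]          # indices of the k closest points
    votes = Counter(y_train[nearest])        # each neighbor votes its class
    return votes.most_common(1)[0][0]
```

Note that KNN has no training step at all: the whole dataset is kept and all work happens at prediction time.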
- Linear Regression with Stochastic Gradient Descent
- This is an implementation of multivariate linear regression using stochastic gradient descent (SGD) to train the model. SGD, as opposed to batch gradient descent or mini-batch gradient descent, updates the weights of the linear regression model one data point at a time. One challenge was getting the model to converge to a solution; it was overcome by standardizing the feature inputs during fitting and converting the learned weights back to the original scale before output.
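The standardize-fit-unstandardize pattern described above might look like the following. This is a sketch under assumed defaults, not the repository's code; the name `sgd_linreg` and its parameters are placeholders.

```python
import numpy as np

def sgd_linreg(X, y, lr=0.01, epochs=100, seed=0):
    """Multivariate linear regression trained one data point at a time (SGD)."""
    rng = np.random.default_rng(seed)
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sigma                 # standardize features for convergence
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):  # one sample per weight update
            err = y[i] - (Xs[i] @ w + b)
            w += lr * err * Xs[i]
            b += lr * err
    # convert the learned weights back to the original feature scale
    w_orig = w / sigma
    b_orig = b - (w * mu / sigma).sum()
    return w_orig, b_orig
```

Shuffling the sample order each epoch (the `rng.permutation` call) is a common SGD practice that avoids cycling through the data in a fixed pattern.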
- Logistic Regression
- Logistic regression uses a sigmoid activation function to squash weighted inputs into probabilities between zero and one. This enables binary classification: the probability of class 1 is compared with that of class 0, and the class with the higher probability is chosen.
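A minimal version of this, trained with batch gradient ascent on the log-likelihood, could look like the following. It is a sketch, not the repository's code; the names and hyperparameters are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logreg_fit(X, y, lr=0.1, epochs=500):
    """Binary logistic regression; labels y in {0, 1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = sigmoid(X @ w + b)           # squash net input into (0, 1)
        grad = y - p                     # gradient of the log-likelihood
        w += lr * X.T @ grad / len(y)
        b += lr * grad.mean()
    return w, b

def logreg_predict(X, w, b):
    # P(class 1) >= 0.5 means class 1 is at least as probable as class 0
    return (sigmoid(X @ w + b) >= 0.5).astype(int)
```

Since P(class 0) = 1 - P(class 1), comparing the two probabilities reduces to checking whether P(class 1) crosses 0.5.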
- K-Means Clustering Algorithm
- This algorithm differs from the others because it is an unsupervised clustering algorithm, as opposed to the supervised algorithms above. It randomly chooses K points as initial centroids, assigns each point in the dataset to the nearest centroid, recomputes each cluster's centroid as the mean of its assigned points, and repeats until the centroids remain unchanged.
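The assign-then-recompute loop can be sketched as below (this is the standard Lloyd's-algorithm form of K-means, offered as an illustration rather than the repository's code; the function name and signature are placeholders).

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """K-means clustering: assign points to nearest centroid, recompute means."""
    rng = np.random.default_rng(seed)
    # start from K randomly chosen data points
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # assign each point to its closest centroid (Euclidean distance)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recompute each centroid as the mean of its assigned points
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):  # centroids unchanged: done
            break
        centroids = new_centroids
    return centroids, labels
```

One caveat of this sketch: if a cluster ends up with no assigned points, its mean is undefined; practical implementations re-seed such empty clusters.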