Gaussian Processes Concrete Predictions

This project aims to build an understanding of Gaussian Processes (GPs) and to apply Gaussian Process Regression to a concrete dataset for an optimized Bayesian inference search. From this project we can further abstract a model that queries multivariate parameters to find optimized concrete samples over a global search space, theoretically allowing us to make stronger and stronger concrete. The learning outcomes of this project are to understand the defining equations of a GP, how a GP can be used in a larger Bayesian optimization workflow, and what the key components of using this machine learning model are.

Application Overview

The application portion of this project uses a multivariate dataset of concrete compressive strengths. First I split the dataset into training and querying portions, where the querying portion was unlabeled. Then, after training a Gaussian process on the labeled data, I used its mean and covariance functions, together with a querying (acquisition) function, to choose the best sample to label from the unlabeled set. Doing this iteratively, and retraining the model after each set of queries, lets us optimally search an N-dimensional space. Using this workflow we can further abstract a model that queries multivariate parameters to create an optimized concrete sample over a global search space, theoretically allowing us to make stronger and stronger concrete.

This workflow of iteratively training a Gaussian process, querying data, and retraining the GP on newly labeled data to search optimally is called Bayesian optimization.
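As a minimal sketch of that loop (using scikit-learn and synthetic stand-in data rather than the notebook's actual preprocessing and query strategies), uncertainty sampling looks roughly like this:

```python
# Illustrative sketch only: an uncertainty-sampling active learning loop.
# The data, feature count, and variable names here are placeholders,
# not the notebook's actual concrete dataset or code.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
X_pool = rng.uniform(0, 1, size=(200, 8))            # stand-in for the concrete mix features
y_pool = X_pool @ rng.uniform(1, 10, size=8) + rng.normal(scale=0.5, size=200)

labeled = list(range(10))                             # small initial labeled set
unlabeled = [i for i in range(len(X_pool)) if i not in labeled]
best_so_far = []

for iteration in range(20):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=0.1, normalize_y=True)
    gp.fit(X_pool[labeled], y_pool[labeled])          # retrain on everything labeled so far

    mean, std = gp.predict(X_pool[unlabeled], return_std=True)
    pick = unlabeled[int(np.argmax(std))]             # query the most uncertain unlabeled sample
    labeled.append(pick)                              # "label" it (look up its true strength)
    unlabeled.remove(pick)
    best_so_far.append(y_pool[labeled].max())         # track the strongest mix found so far
```

Each pass refits the GP on everything labeled so far, scores the remaining pool by predictive uncertainty, and moves the most uncertain sample into the labeled set; swapping out the argmax criterion is what changes the query strategy.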

Workflow Visualization:

All of this is completed in the Active Learning 1 Jupyter Notebook, where I preprocess the dataset and run the iterative workflow to search for the samples with the highest concrete compressive strength.

Once the workflow finished running, I plotted the highest compressive-strength label found by each querying method over the iterations.

The x-axis is the iteration number and the y-axis is compressive strength. The blue line is the uncertainty querying function, the red line is greedy sampling, the yellow line is MCAL sampling, and the green line is random sampling. The graph shows that all strategies except greedy search performed better than random sampling over the local search space. This graph can be found at the bottom of the Active Learning 1 Jupyter Notebook. The definitions and inner workings of the query strategies are beyond the scope of this project.

Notebooks

Gradientdescent1.ipynb Gradient Descent for GP Notebook File

In this file I walk through how to use gradient descent to optimize a Gaussian process regression model.

Gaussianprocess.ipynb Gaussian Process Jupyter Notebook File

In this file I walk through how a Gaussian process regression is created, the hyperparameters we optimize, and how gradient descent affects our model.

Activelearning1.ipynb Active Learning Application

In this file we apply the gradient descent workflow for Bayesian optimization on a concrete toy dataset.

Defining Equations:

A Gaussian process regression is defined by the mean function and the covariance function of the data we are modeling. A graph of this is shown in the Gaussian Process Jupyter Notebook File. A formalization of this can be written as:
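Writing the mean function as $m(\mathbf{x})$ and the covariance function as $k(\mathbf{x}, \mathbf{x}')$, the standard definition from the source cited below is

\[ m(\mathbf{x}) = \mathbb{E}[f(\mathbf{x})] \]
\[ k(\mathbf{x}, \mathbf{x}') = \mathbb{E}\big[(f(\mathbf{x}) - m(\mathbf{x}))(f(\mathbf{x}') - m(\mathbf{x}'))\big] \]
\[ f(\mathbf{x}) \sim \mathcal{GP}\big(m(\mathbf{x}),\, k(\mathbf{x}, \mathbf{x}')\big) \]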

        source: http://www.gaussianprocess.org/gpml/chapters/RW2.pdf

Key Components:

To train a Gaussian Process you need the following components:

The equation for Negative Log Likelihood is:
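For a zero-mean GP with Gaussian observation noise (the usual form; the notebook's notation may differ slightly), with kernel matrix $K$ over the training inputs, noise variance $\sigma_n^2$, and $n$ training points:

\[ -\log p(\mathbf{y} \mid X) = \tfrac{1}{2}\,\mathbf{y}^{\top}(K + \sigma_n^2 I)^{-1}\mathbf{y} + \tfrac{1}{2}\log\lvert K + \sigma_n^2 I\rvert + \tfrac{n}{2}\log 2\pi \]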

The RBF Kernel I used: 
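A common parameterization, with signal variance $\sigma_f^2$ and lengthscale $\ell$ (the notebook's exact hyperparameters may differ), is

\[ k(\mathbf{x}, \mathbf{x}') = \sigma_f^2 \exp\!\left(-\frac{\lVert \mathbf{x} - \mathbf{x}' \rVert^2}{2\ell^2}\right) \]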

And gradient descent has an implementation similar to that of the Machine Learning Refined GitHub repository:

Adapted from: https://github.com/jermwatt/machine_learning_refined and https://nipunbatra.github.io/blog/ml/2020/03/29/param-learning.html
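As a rough, self-contained sketch of what that looks like (not the repository's or the notebook's actual code), here is plain gradient descent on the negative log likelihood with respect to the RBF lengthscale, using a finite-difference gradient and toy data:

```python
# Illustrative sketch only: minimizing the GP negative log marginal likelihood
# by gradient descent on the RBF lengthscale. Data and names are hypothetical.
import numpy as np

def rbf_kernel(X1, X2, lengthscale, signal_var=1.0):
    """RBF kernel k(x, x') = signal_var * exp(-||x - x'||^2 / (2 * lengthscale^2))."""
    sqdist = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return signal_var * np.exp(-0.5 * sqdist / lengthscale**2)

def negative_log_likelihood(X, y, lengthscale, noise_var=1e-2):
    """Negative log marginal likelihood of a zero-mean GP with Gaussian noise."""
    n = len(y)
    K = rbf_kernel(X, X, lengthscale) + noise_var * np.eye(n)
    L = np.linalg.cholesky(K)                        # stable solve via Cholesky
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L))) + 0.5 * n * np.log(2 * np.pi)

# Toy 1-D data standing in for the concrete features
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(30, 1))
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(30)

log_ell, lr, eps = 0.0, 0.01, 1e-4                   # optimize log(lengthscale) so it stays positive
for step in range(300):
    nll_plus = negative_log_likelihood(X, y, np.exp(log_ell + eps))
    nll_minus = negative_log_likelihood(X, y, np.exp(log_ell - eps))
    grad = (nll_plus - nll_minus) / (2 * eps)        # central-difference gradient
    log_ell -= lr * grad                             # gradient descent step
print("learned lengthscale:", np.exp(log_ell))
```

Optimizing log ℓ rather than ℓ directly keeps the lengthscale positive without any constraint handling; an analytic gradient (as derived in the linked references) would replace the finite-difference step in a real implementation.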

GitHub Link: