UFC Machine Learning | Boyd Gibson

Introduction

This project applies machine learning techniques to UFC fight data. The dataset comes from Kaggle and contains statistics and attributes for UFC fighters. The goal is to predict the winner of a fight from historical data, including fighter attributes and fight statistics.

Code Breakdown

The code is structured into several parts:

Data Preparation: This section involves loading and cleaning the dataset, and preparing it for analysis and modeling.

Data Visualization: Various graphs are plotted to understand the distribution of features such as age, reach, height, and experience.

Machine Learning Models: Two machine learning models, Decision Tree and Random Forest, are trained and evaluated to predict fight outcomes based on fighter attributes.

Code Samples

Data Preparation

This part of the code focuses on loading the dataset, checking its structure, and preparing it for analysis.

Explanation: This section imports the necessary libraries and loads the dataset, then checks the dataset's structure and the unique values in each column. It adds a new column, `A_Winner`, to hold the stance of the winner and fills it in based on the `Winner` field. Finally, the updated DataFrame is saved to a new CSV file.
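The steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the small in-memory table stands in for the Kaggle CSV, and the column names `R_Stance` and `B_Stance`, the `Red`/`Blue`/`Draw` values, and the output file name are assumptions made for the example.

```python
import os
import tempfile

import pandas as pd

# Tiny stand-in for the Kaggle dataset. The stance column names and the
# Red/Blue/Draw values are assumptions for illustration only.
fights = pd.DataFrame({
    "Winner":   ["Red", "Blue", "Red", "Draw"],
    "R_Stance": ["Orthodox", "Southpaw", "Switch", "Orthodox"],
    "B_Stance": ["Southpaw", "Orthodox", "Orthodox", "Southpaw"],
})

# Check the structure and the unique values of each column.
print(fights.shape)
for col in fights.columns:
    print(col, fights[col].unique())

# Initialize the new column, then fill it with the winner's stance.
fights["A_Winner"] = None
red_won = fights["Winner"] == "Red"
blue_won = fights["Winner"] == "Blue"
fights.loc[red_won, "A_Winner"] = fights.loc[red_won, "R_Stance"]
fights.loc[blue_won, "A_Winner"] = fights.loc[blue_won, "B_Stance"]

# Save the updated DataFrame to a new CSV file (hypothetical file name,
# written to the system temp directory so the sketch leaves no clutter).
out_path = os.path.join(tempfile.gettempdir(), "ufc_with_winner_stance.csv")
fights.to_csv(out_path, index=False)
```

Rows where neither corner won (the draw in this sample) keep an empty `A_Winner`, so they can be filtered out before modeling.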

Data Visualization

This part of the code generates visualizations to understand feature distributions and their impact on fight outcomes.

Explanation: This section creates graphs to analyze the data. It defines functions that classify fighters' stances and age conditions, updates the dataset with those classifications, and plots the distribution of stances against the winner's age condition.
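One way such classification functions and the resulting plot could look is sketched below. The definitions are assumptions, since the original code is not reproduced here: "age condition" is taken to mean whether the winner was older than, younger than, or the same age as the opponent, and stances other than Orthodox and Southpaw are grouped as "Other". All column and function names are hypothetical.

```python
import pandas as pd

# Illustrative sample; column names and values are assumptions.
fights = pd.DataFrame({
    "Winner":   ["Red", "Blue", "Red", "Blue", "Red"],
    "R_Stance": ["Orthodox", "Southpaw", "Switch", "Orthodox", "Southpaw"],
    "B_Stance": ["Southpaw", "Orthodox", "Orthodox", "Orthodox", "Orthodox"],
    "R_age":    [28, 35, 30, 25, 34],
    "B_age":    [32, 29, 30, 31, 33],
})


def classify_stance(stance):
    """Keep the two most common stances and group everything else."""
    return stance if stance in ("Orthodox", "Southpaw") else "Other"


def winner_stance(row):
    """Stance class of the winning corner (assumes Winner is Red or Blue)."""
    corner = "R" if row["Winner"] == "Red" else "B"
    return classify_stance(row[f"{corner}_Stance"])


def winner_age_condition(row):
    """Whether the winner was older, younger, or the same age as the loser."""
    if row["Winner"] == "Red":
        diff = row["R_age"] - row["B_age"]
    else:
        diff = row["B_age"] - row["R_age"]
    if diff > 0:
        return "Older"
    if diff < 0:
        return "Younger"
    return "Same age"


# Update the dataset with the two derived columns.
fights["winner_stance"] = fights.apply(winner_stance, axis=1)
fights["winner_age_condition"] = fights.apply(winner_age_condition, axis=1)

# Count fights per (stance, age condition) pair; this is the data behind the plot.
counts = pd.crosstab(fights["winner_stance"], fights["winner_age_condition"])
print(counts)


def plot_counts(table):
    """Draw a grouped bar chart of the counts (requires matplotlib)."""
    import matplotlib.pyplot as plt

    ax = table.plot(kind="bar")
    ax.set_xlabel("Winner's stance")
    ax.set_ylabel("Number of fights")
    ax.set_title("Winner's stance by age condition")
    plt.tight_layout()
    plt.show()
```

Calling `plot_counts(counts)` draws the chart. Keeping the counting step separate from the plotting step makes the numbers easy to inspect before they are visualized.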

Decision Tree

This section involves training a Decision Tree classifier to predict fight outcomes based on various fighter attributes.

Explanation: This part of the code trains a Decision Tree classifier. It first prepares the dataset by dropping unnecessary columns and handling missing values, then encodes the features, trains the model, and evaluates it. The final predictions are added to the dataset, which is then saved to a CSV file.
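A hedged sketch of this pipeline with scikit-learn is shown below. The data is synthetic (the label is derived from reach purely so that the model has a pattern to learn), and the column names, the median imputation, the one-hot encoding, the tree depth, and the output file name are all assumptions rather than the project's actual choices.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the fight data; every column name is an assumption.
rng = np.random.default_rng(0)
n = 200
fights = pd.DataFrame({
    "R_fighter": [f"red_{i}" for i in range(n)],
    "B_fighter": [f"blue_{i}" for i in range(n)],
    "R_age": rng.integers(20, 40, n).astype(float),
    "B_age": rng.integers(20, 40, n).astype(float),
    "R_Reach_cms": rng.normal(183, 8, n),
    "B_Reach_cms": rng.normal(183, 8, n),
    "R_Stance": rng.choice(["Orthodox", "Southpaw", "Switch"], n),
    "B_Stance": rng.choice(["Orthodox", "Southpaw", "Switch"], n),
})
# Synthetic label: the fighter with the longer reach wins.
fights["Winner"] = np.where(
    fights["R_Reach_cms"] > fights["B_Reach_cms"], "Red", "Blue"
)
# Simulate a few missing values.
fights.loc[fights.index % 25 == 0, "R_Reach_cms"] = np.nan

# Drop columns that carry no predictive signal, then fill numeric gaps
# with the column median (one of several reasonable strategies).
data = fights.drop(columns=["R_fighter", "B_fighter"])
numeric_cols = data.select_dtypes(include="number").columns
data[numeric_cols] = data[numeric_cols].fillna(data[numeric_cols].median())

# Encode the categorical stance columns as one-hot features.
X = pd.get_dummies(
    data.drop(columns=["Winner"]), columns=["R_Stance", "B_Stance"], dtype=float
)
y = data["Winner"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

tree = DecisionTreeClassifier(max_depth=4, random_state=42)
tree.fit(X_train, y_train)
accuracy = accuracy_score(y_test, tree.predict(X_test))
print(f"Decision Tree accuracy: {accuracy:.2f}")

# Add the predictions to the dataset and save it (hypothetical file name).
fights["Predicted_Winner"] = tree.predict(X)
out_path = os.path.join(tempfile.gettempdir(), "ufc_tree_predictions.csv")
fights.to_csv(out_path, index=False)
```

Accuracy is measured only on the held-out 20% of fights. The predictions written to the CSV cover all rows, including the training rows, so they should not be read as an unbiased estimate of performance.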

Random Forest

This section applies a Random Forest classifier to predict fight outcomes and evaluates its performance.

Explanation: This section initializes and trains a Random Forest classifier. As with the Decision Tree, it prepares the data, applies one-hot encoding, and evaluates the model's performance. The accuracy of the Random Forest model is then printed.
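The following sketch shows what a Random Forest version might look like. It again uses synthetic data with assumed column names (`age_diff`, `reach_diff_cms`, `R_Stance`), and the hyperparameters are illustrative defaults rather than the values used in the project. It also prints the feature importances, which are one way to compare how much each attribute contributes to the predictions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; column names are assumptions for illustration.
rng = np.random.default_rng(1)
n = 300
fights = pd.DataFrame({
    "age_diff": rng.integers(-10, 11, n),
    "reach_diff_cms": rng.normal(0, 8, n),
    "R_Stance": rng.choice(["Orthodox", "Southpaw", "Switch"], n),
})
# Synthetic, noisy label so that the forest has something to learn.
score = fights["reach_diff_cms"] - 0.5 * fights["age_diff"] + rng.normal(0, 4, n)
fights["Winner"] = np.where(score > 0, "Red", "Blue")

# One-hot encode the categorical column.
X = pd.get_dummies(fights.drop(columns=["Winner"]), columns=["R_Stance"], dtype=float)
y = fights["Winner"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

forest = RandomForestClassifier(n_estimators=100, random_state=42)
forest.fit(X_train, y_train)

accuracy = accuracy_score(y_test, forest.predict(X_test))
print(f"Random Forest accuracy: {accuracy:.2f}")

# Impurity-based importance of each encoded feature, largest first.
importances = pd.Series(
    forest.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)
```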

Solution Visualization

Below is an example of the printed output illustrating how the solution is presented:

Conclusion

The project demonstrates how machine learning models can be used to predict UFC fight outcomes from fighter attributes. The Decision Tree and Random Forest models give insight into how useful different features are for predicting fight outcomes, and the accompanying analysis and visualizations help show how various attributes affect fight results. Further improvements could include tuning model parameters and exploring additional features for better prediction accuracy.
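As an illustration of the parameter tuning mentioned above, the sketch below runs a small grid search over a Random Forest with scikit-learn's `GridSearchCV`. The data comes from `make_classification` and the parameter grid is an arbitrary example, so neither reflects the project's dataset or settings.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data standing in for the fight features.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

# Small, arbitrary grid of candidate hyperparameters.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, 5, None],
}

# Evaluate every combination with 3-fold cross-validation.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best cross-validated accuracy: {search.best_score_:.2f}")
```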

Full Code PDF