Gold Price Prediction Using ML: A Complete Project Guide


Introduction

Gold is widely viewed as a safe-haven asset and plays an important role in both investment portfolios and economic stability. Forecasting its price can give investors, traders, and policymakers actionable insights.

In this machine learning project, we aim to predict gold prices using historical financial data and deploy the model using Streamlit, a popular Python web framework. The project includes data preprocessing, exploratory data analysis (EDA), model building, evaluation, and web deployment.

Why Predict Gold Prices?

Several macroeconomic factors affect gold prices:

  • Inflation
  • Currency strength (e.g., USD Index)
  • Oil and energy prices
  • Stock indices (S&P 500, Dow Jones)
  • Global market sentiment

Accurate prediction helps:

  • Make informed investment decisions
  • Design hedging strategies
  • Forecast economic trends

Project Overview

  • Goal: Predict gold prices using ML models based on financial and commodity market indicators.
  • Steps:
    1. Collect and clean data
    2. Perform EDA
    3. Feature engineering
    4. Train ML models
    5. Evaluate performance
    6. Deploy using Streamlit

Dataset Description

The dataset (FINAL_USO.csv) contains 1,718 records with 81 columns, including:

  • Gold prices (Open, High, Low, Close, Adj Close, Volume)
  • Market indicators: S&P 500, Dow Jones, Euro, Oil, Silver, Platinum, Palladium
  • Trends & Prices: Daily trends for other financial instruments (USD Index, ETFs like GDX, USO)


Each row represents daily market data, making it ideal for time-series analysis.

Data Preprocessing

Steps:

  • Convert Date to datetime format
  • Handle missing values (if any)
  • Normalize or scale numeric features
  • Split into features (X) and target (y = Close price of gold)

import pandas as pd

# Load the dataset
df = pd.read_csv('FINAL_USO.csv')

# Convert Date and set index
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

# Define target and features
y = df['Close']
X = df.drop(columns=['Close', 'Adj Close'])
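The missing-value and scaling steps listed above are not shown in the snippet. A minimal sketch of one way to do them follows, assuming forward-fill is acceptable for daily market data and that the scaler is fitted on the training portion only. The helper name `fill_and_scale`, the 80/20 split, and the synthetic demo frame are illustrative assumptions, not part of the original project.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler


def fill_and_scale(X, train_fraction=0.8):
    """Forward-fill gaps, split chronologically, and standardize features.

    Hypothetical helper: the scaler is fitted on the training rows only so
    that statistics from the test period do not leak into training.
    """
    # Carry the last known value forward; drop rows that are still empty
    # (for example, gaps at the very start of the series).
    X = X.ffill().dropna()

    split = int(len(X) * train_fraction)
    X_train, X_test = X.iloc[:split], X.iloc[split:]

    scaler = StandardScaler().fit(X_train)
    X_train_scaled = pd.DataFrame(
        scaler.transform(X_train), index=X_train.index, columns=X.columns
    )
    X_test_scaled = pd.DataFrame(
        scaler.transform(X_test), index=X_test.index, columns=X.columns
    )
    return X_train_scaled, X_test_scaled, scaler


# Small synthetic frame standing in for the real feature matrix.
dates = pd.date_range("2020-01-01", periods=10, freq="D")
demo = pd.DataFrame(
    {"Open": np.arange(10, dtype=float), "Volume": np.arange(10, 20, dtype=float)},
    index=dates,
)
demo.iloc[3, 0] = np.nan  # simulate a missing quote

train_scaled, test_scaled, fitted_scaler = fill_and_scale(demo)
print(train_scaled.shape, test_scaled.shape)
```

If rows are dropped during cleaning, the target `y` should be re-aligned to the same index before training. Scaling mostly matters for Linear Regression; tree-based models such as Random Forest are largely insensitive to feature scale.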

Exploratory Data Analysis (EDA)

EDA helps uncover relationships between gold and other financial indicators.

  • Correlation heatmap to find influential variables
  • Line plots for trends (Gold vs Oil, Gold vs USD Index)
  • Lag analysis to check temporal dependencies

import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(14,8))
sns.heatmap(df.corr(numeric_only=True)[['Close']].sort_values('Close', ascending=False), annot=True, cmap='coolwarm')
plt.title('Feature Correlation with Gold Close Price')
plt.show()
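The lag analysis mentioned in the list can be sketched with pandas alone. The example below is an illustration under assumptions: it uses a synthetic random walk in place of `df['Close']`, and the helper name `lag_correlations` is hypothetical. It compares the autocorrelation of price levels with that of daily changes.

```python
import numpy as np
import pandas as pd


def lag_correlations(series, max_lag=5):
    """Return the autocorrelation of `series` at lags 1..max_lag."""
    return pd.Series(
        {lag: series.autocorr(lag=lag) for lag in range(1, max_lag + 1)},
        name="autocorrelation",
    )


# Synthetic random walk standing in for df['Close'].
rng = np.random.default_rng(0)
demo_close = pd.Series(100 + rng.normal(size=300).cumsum())

level_corr = lag_correlations(demo_close)                   # price levels
change_corr = lag_correlations(demo_close.diff().dropna())  # daily changes

print(level_corr)
print(change_corr)
```

Price levels are usually strongly autocorrelated, which is why yesterday's close is a hard baseline to beat. Checking the daily changes as well shows how much structure remains once that persistence is removed.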

Feature Engineering

Enhance predictive power with:

  • Rolling means (5-day, 10-day)
  • Price differentials (e.g., High - Low)
  • Trend indicators
  • Lagged features for time-series forecasting

# 5-day rolling mean of the gold close price (the first 4 rows are NaN)
df['Gold_Rolling_Mean_5'] = df['Close'].rolling(window=5).mean()
# Daily trading range
df['Gold_Diff'] = df['High'] - df['Low']
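Lagged features are listed above but not shown. The sketch below is one possible approach, assuming the goal is to forecast the close from information available before the trading day: each lag and rolling mean is shifted so it uses past observations only. The helper name `add_lagged_features` and the generated column names are hypothetical.

```python
import numpy as np
import pandas as pd


def add_lagged_features(df, price_col="Close", lags=(1, 2, 3), windows=(5, 10)):
    """Add lagged prices and rolling means that use only past observations."""
    out = df.copy()
    for lag in lags:
        out[f"{price_col}_lag_{lag}"] = out[price_col].shift(lag)
    for window in windows:
        # shift(1) first, so the window for day t covers days t-window..t-1
        out[f"{price_col}_roll_mean_{window}"] = (
            out[price_col].shift(1).rolling(window=window).mean()
        )
    # The first rows have no history and therefore contain NaN.
    return out.dropna()


# Synthetic prices 1.0, 2.0, ..., 20.0 standing in for the gold close.
demo = pd.DataFrame(
    {"Close": np.arange(1.0, 21.0)},
    index=pd.date_range("2020-01-01", periods=20, freq="D"),
)
features = add_lagged_features(demo)
print(features.head())
```

Any engineered columns need to be created before the feature matrix `X` is built, otherwise the model never sees them.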

Model Building

Models Used:

  • Linear Regression
  • Random Forest
  • XGBoost Regressor

Steps:

  1. Split data into train/test
  2. Train model
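  3. Evaluate using MAE, RMSE, R²

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Chronological split: shuffle=False keeps the most recent 20% of rows as the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print("MAE:", mean_absolute_error(y_test, predictions))
# Take the square root of the MSE; the squared=False argument is deprecated in newer scikit-learn releases
print("RMSE:", np.sqrt(mean_squared_error(y_test, predictions)))
print("R² Score:", r2_score(y_test, predictions))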
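A single chronological split gives one estimate of performance. As an optional extension that is not part of the original project, walk-forward validation with scikit-learn's `TimeSeriesSplit` evaluates the model on several successive periods. The sketch below uses synthetic data and a hypothetical helper, `walk_forward_mae`, with Linear Regression chosen only to keep the example fast.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import TimeSeriesSplit


def walk_forward_mae(model, X, y, n_splits=5):
    """Return the MAE of `model` on each successive chronological fold."""
    scores = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
        # Each fold trains on earlier rows and tests on the rows that follow.
        model.fit(X.iloc[train_idx], y.iloc[train_idx])
        preds = model.predict(X.iloc[test_idx])
        scores.append(mean_absolute_error(y.iloc[test_idx], preds))
    return scores


# Synthetic data standing in for the real X and y.
rng = np.random.default_rng(42)
X_demo = pd.DataFrame(rng.normal(size=(120, 3)), columns=["f1", "f2", "f3"])
y_demo = 2.0 * X_demo["f1"] - X_demo["f2"] + rng.normal(scale=0.1, size=120)

fold_mae = walk_forward_mae(LinearRegression(), X_demo, y_demo)
print([round(score, 3) for score in fold_mae])
```

A large spread between folds suggests the model's accuracy depends heavily on the market period, which a single train/test split would hide.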

Streamlit Deployment

Streamlit's user-friendly UI makes deployment straightforward.

Streamlit App Code

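import joblib
import streamlit as st

st.title("Gold Price Prediction App")

# Load the trained model; assumes it was saved beforehand with
# joblib.dump(model, "gold_model.joblib") (hypothetical file name)
model = joblib.load("gold_model.joblib")

# User input
input_data = st.number_input("Enter S&P Open", min_value=0.0)

# Dummy example: the input fills the first feature and every other feature is set to 0
features = [[input_data] + [0] * (model.n_features_in_ - 1)]
pred = model.predict(features)
st.write(f"Predicted Gold Price: ${pred[0]:.2f}")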

Run Command:

streamlit run app.py
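Because app.py runs as a separate script, it needs access to a trained model. One common approach, shown here as a sketch under assumptions, is to persist the model with joblib after training and load it in the app. The file name `gold_model.joblib`, the temporary directory, and the synthetic training data are illustrative only.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic training data standing in for the real X_train and y_train.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
y_demo = X_demo[:, 0] * 3.0 + rng.normal(scale=0.1, size=50)

model = RandomForestRegressor(n_estimators=20, random_state=42)
model.fit(X_demo, y_demo)

# Hypothetical file name; in the project this would sit next to app.py.
model_path = os.path.join(tempfile.gettempdir(), "gold_model.joblib")
joblib.dump(model, model_path)

# Inside app.py, the model would be loaded once at startup.
loaded_model = joblib.load(model_path)
print(loaded_model.n_features_in_)
```

The loaded model expects the same number and order of features it was trained on, so the app has to build its input row accordingly.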

Code Walkthrough Summary

Section | Description
Data Preprocessing | Cleaned and transformed raw data for analysis
EDA | Identified strong correlates and visualized trends
Feature Engineering | Created rolling means, lags, and price differentials
Model Building | Used ML models to train and predict gold prices
Deployment | Built a Streamlit app for user-friendly interaction

Conclusion

Gold price prediction using machine learning offers data-driven insight into market dynamics. By combining financial indicators, time-series techniques, and modern ML algorithms, we can forecast price trends and support better-informed decisions.

Deploying it with Streamlit makes it even more powerful by bringing interactivity to data science.

FAQs
What machine learning algorithm is best for gold prediction?
Random Forest and XGBoost often outperform linear models because they are robust and can capture non-linear relationships.

Can I use this project in my data science portfolio?
Absolutely! It demonstrates EDA, feature engineering, modeling, and deployment skills.

Do I need deep learning for this project?
Not necessarily. Tree-based models work well for structured financial data.

Can I predict gold prices in real-time?
Yes, with real-time API integration and model retraining, it's possible.

How do I host this Streamlit app online?
Use platforms like Streamlit Cloud, Heroku, or Render.

🚀 Ready to dive into data-driven finance? Start your Gold Price Prediction Project today and level up your data science and machine learning skills.


More Machine Learning Project Ideas to Improve Your Capabilities

If you're eager to grow your machine learning portfolio and dive into hands-on applications, here's a list of diverse and practical project ideas. These projects are especially valuable for data science learners, machine learning students, and AI enthusiasts looking to stand out.

⚖️ Legal Outcome Predictor: Personal Injury Cases

Create a classification model to forecast the verdict of personal injury lawsuits using past court rulings, case attributes, and structured legal datasets.

🚒 Titanic Data Survival Analysis

Leverage the iconic Titanic dataset to conduct a detailed exploratory data analysis (EDA). Build classification models to determine passenger survival based on demographics and ticket details.

🛒 Big Mart Sales Forecasting (2025 Edition)

Apply regression techniques to predict sales for various Big Mart stores. Use features like product type, store size, promotional discounts, and seasonal trends to improve accuracy.

🥔 Deep Learning for Potato Leaf Disease Detection

Develop a deep learning solution to identify diseases in potato leaves using image classification. Implement Convolutional Neural Networks (CNNs) to support precision agriculture.

✋ Real-Time Hand Gesture Recognition

Build an intelligent system that detects and classifies hand gestures using deep learning. Combine CNNs with RNNs or LSTMs for real-time gesture control applications.

💳 Smart Credit Card Fraud Detection

Design a fraud detection system that identifies suspicious transactions using anomaly detection and supervised learning. Emphasize real-time decision-making and financial security.

🛡️ Predicting Insurance Claim Severity

Model the expected severity of insurance claims using regression techniques. Analyze policyholder data, claim types, and incident history to estimate financial impact with high accuracy.
