Building and Optimizing a Trading Strategy using Python

Published: September 30, 2024

This is another post in the series on Mathematical Modeling and Python. The problem comes from the MCM (Mathematical Contest in Modeling) competition and asks us to build and optimize a trading strategy for gold and bitcoin using Python. The problem is as follows:

Problem Background

Market traders buy and sell volatile assets frequently, with a goal to maximize their total return. There is usually a commission for each purchase and sale. Two such assets are gold and bitcoin.

Requirements

You have been asked by a trader to develop a model that uses only the past stream of daily prices to date to determine each day if the trader should buy, hold, or sell their assets in their portfolio.

You will start with $\$1000$ on $9/11/2016$. You will use the five-year trading period, from $9/11/2016$ to $9/10/2021$. On each trading day, the trader will have a portfolio consisting of cash, gold, and bitcoin $[C, G, B]$ in U.S. dollars, troy ounces, and bitcoins, respectively. The initial state is $[1000, 0, 0]$. The commission for each transaction (purchase or sale) costs $\alpha\%$ of the amount traded. Assume $\alpha_{\text{gold}} = 1\%$ and $\alpha_{\text{bitcoin}} = 2\%$. There is no cost to hold an asset.

Note that bitcoin can be traded every day, but gold is only traded on days the market is open, as reflected in the pricing data files LBMA-GOLD.csv and BCHAIN-MKPRU.csv. Your model should account for this trading schedule.

Explanation

The task involves analyzing data from two popular assets: gold and bitcoin. These assets have distinct characteristics, with gold being a traditional safe-haven asset and bitcoin representing the volatile world of cryptocurrencies. The goal is to develop a strategy that balances these assets to achieve optimal portfolio performance.

Analysis

To solve this problem, we first need to understand its requirements and constraints. The trader starts with 1000 dollars in cash and no gold or bitcoin, and each day can buy, hold, or sell either asset based on the price data available up to that day. Every transaction incurs a commission, which reduces the overall return. The goal is to maximize the total return over the five-year trading period.

Understanding the Problem Requirements

The problem requires constructing a trading strategy that accounts for various market conditions. We need to analyze historical price data, identify patterns, and develop signals that indicate when to buy or sell each asset.

Mathematical Models

Return Calculation: We need to calculate the return on each asset based on the daily price data and transaction costs.

\[\text{Return} = \frac{\text{Current Value} - \text{Previous Value}}{\text{Previous Value}}\]

Volatility Calculation: We use standard deviation over a rolling window to measure the volatility of each asset, helping assess risk levels.
Correlation Analysis: We analyze the correlation between gold and bitcoin prices to understand their relationship and diversification benefits.
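
As a rough illustration of the three quantities above, the sketch below computes daily returns, a rolling standard deviation, and the return correlation on a small made-up price table. The prices, the window length of 3, and the variable names are assumptions chosen for readability, not values from the contest data.

```python
import pandas as pd

# Hypothetical prices, for illustration only
prices = pd.DataFrame({
    "bitcoin": [100.0, 110.0, 99.0, 108.9, 119.79],
    "gold": [50.0, 50.5, 50.0, 50.5, 51.005],
})

returns = prices.pct_change()            # (current - previous) / previous
volatility = returns.rolling(3).std()    # rolling standard deviation of returns
correlation = returns["bitcoin"].corr(returns["gold"])

print(returns.round(4))
print(volatility.round(4))
print(round(correlation, 4))
```

The first return is undefined because there is no previous price, so the first full 3-day volatility window only appears on the fourth row.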

Exploratory Data Analysis

We start by loading the price data for gold and bitcoin and visualizing the price trends over time. We calculate daily returns, volatility, and correlation to gain insights into the asset performance.

Trading Rules

Moving Average Crossover: Implement a strategy where buy/sell signals are generated based on short-term and long-term moving averages. A buy signal is triggered when the short-term average crosses above the long-term average, and a sell signal is triggered when it crosses below. The mathematical expression for the moving average crossover is given by:

\[\text{Signal} = \begin{cases} 1, & \text{if } \text{short}_{ \text{mavg}} > \text{long}_{\text{mavg}} \\ 0, & \text{otherwise} \end{cases}\]

def ma_crossover(prices, short_lookback, long_lookback):
    short_mavg = prices.rolling(window=short_lookback).mean()
    long_mavg = prices.rolling(window=long_lookback).mean()
    signal = np.where(short_mavg > long_mavg, 1, 0)
    return pd.Series(signal, index=prices.index).diff()
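
To see what the function returns, here is a small sketch on made-up prices with a 2-day and a 3-day window. The function body is copied from above; the price series and window lengths are assumptions for illustration. Because of the final `diff()`, the output is a series of events rather than positions: `1` on the day the short average crosses above the long one, `-1` on the day it crosses back, and `0` otherwise.

```python
import numpy as np
import pandas as pd

def ma_crossover(prices, short_lookback, long_lookback):
    short_mavg = prices.rolling(window=short_lookback).mean()
    long_mavg = prices.rolling(window=long_lookback).mean()
    signal = np.where(short_mavg > long_mavg, 1, 0)
    return pd.Series(signal, index=prices.index).diff()

# Hypothetical prices: flat, then a rise, then a fall
prices = pd.Series([10, 10, 10, 12, 14, 16, 13, 10, 7, 7], dtype=float)
trades = ma_crossover(prices, short_lookback=2, long_lookback=3)
print(trades)
```

On this series the buy event lands on the fourth day and the sell event on the eighth; the first entry is `NaN` because `diff()` has nothing to compare it with.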

Mean Reversion: Develop a strategy that assumes prices will revert to their mean over time. A buy signal is generated when the price falls more than one rolling standard deviation below its rolling mean, and a sell signal when it rises more than one rolling standard deviation above it. The signal is defined as:

\[\text{Signal} = \begin{cases} 1, & \text{if } \text{prices} < \text{mavg} - \text{std} \\ -1, & \text{if } \text{prices} > \text{mavg} + \text{std} \\ 0, & \text{otherwise} \end{cases}\]

def mean_reversion(prices, lookback):
    mavg = prices.rolling(window=lookback).mean()
    std = prices.rolling(window=lookback).std()
    signal = np.where(prices < mavg - std, 1, np.where(prices > mavg + std, -1, 0))
    return pd.Series(signal, index=prices.index).fillna(0).diff()
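
One detail worth checking is what the trailing `diff()` does to a three-valued signal. The sketch below applies the same `fillna(0).diff()` step to a hand-written band signal; the input values are hypothetical. This is an observation about how the code behaves as written, not a statement of the author's intent.

```python
import pandas as pd

# Hypothetical raw band signal:
#   1 = price below the lower band, -1 = above the upper band, 0 = inside the bands
raw = pd.Series([0, 1, 1, 0, -1, -1, 0, -1, 1])
events = raw.fillna(0).diff()

summary = pd.DataFrame({"raw": raw, "event": events})
print(summary)
```

Leaving the overbought zone (raw `-1` to `0`) produces an event of `+1`, and a direct jump from `-1` to `1` produces `2`. A backtest that only reacts to exactly `1` or `-1` would treat the former as a buy and ignore the latter.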

Backtesting: Use historical data to simulate trades based on the generated signals. Track cash flow, asset holdings, and total portfolio value over time. Then we incorporate transaction costs to simulate real-world trading conditions. The backtesting function is defined as follows:

\[\text{Portfolio Value} = \text{Cash} + \text{Bitcoin} \times \text{Bitcoin Price} + \text{Gold} \times \text{Gold Price}\]

where:

$\text{Cash}$ is the amount of cash in the portfolio.
$\text{Bitcoin}$ is the amount of bitcoin held.
$\text{Gold}$ is the amount of gold held.
$\text{Bitcoin Price}$ is the price of bitcoin.
$\text{Gold Price}$ is the price of gold.
The backtest function calculates the portfolio value and returns over time.
It also accounts for transaction costs when buying or selling assets.

def backtest(df, bitcoin_signal, gold_signal, bitcoin_tc=0.02, gold_tc=0.01, initial_cash=1000):
    cash = initial_cash
    bitcoin = 0
    gold = 0
    portfolio = pd.DataFrame(index=df.index, columns=['cash', 'bitcoin', 'gold', 'total', 'returns']).fillna(0.0)
    
    for date, row in df.iterrows():
        # Check if current prices are valid and not NaN
        if pd.isnull(row['bitcoin']) or pd.isnull(row['gold']):
            continue
        
        # Bitcoin trading logic
        if bitcoin_signal.loc[date] == 1:
            allowable_bitcoin = (cash * 0.5) / (1 + bitcoin_tc)
            bitcoin_trade = allowable_bitcoin / row['bitcoin']
            bitcoin += bitcoin_trade
            cash -= (bitcoin_trade * row['bitcoin']) * (1 + bitcoin_tc)
        
        elif bitcoin_signal.loc[date] == -1:
            cash += bitcoin * row['bitcoin'] * (1 - bitcoin_tc)
            bitcoin = 0
            
        # Gold trading logic
        if gold_signal.loc[date] == 1:
            allowable_gold = (cash * 0.5) / (1 + gold_tc)  
            gold_trade = allowable_gold / row['gold']
            gold += gold_trade
            cash -= (gold_trade * row['gold']) * (1 + gold_tc)
        
        elif gold_signal.loc[date] == -1:
            cash += gold * row['gold'] * (1 - gold_tc)
            gold = 0
        
        # Update portfolio values
        portfolio.loc[date, 'cash'] = max(cash, 0)  # Ensure cash is non-negative
        portfolio.loc[date, 'bitcoin'] = bitcoin * row['bitcoin']
        portfolio.loc[date, 'gold'] = gold * row['gold']
        portfolio.loc[date, 'total'] = max(cash + bitcoin * row['bitcoin'] + gold * row['gold'], 0)  # Ensure total is non-negative
    
    # Calculate returns; replace NaN and infinite values (e.g. from a zero previous total) with 0
    portfolio['returns'] = portfolio['total'].pct_change().fillna(0).replace([np.inf, -np.inf], 0)
    return portfolio
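
The commission arithmetic inside the loop is easy to misread, so here is a single buy and sell worked by hand with the same formulas. The starting cash and the 2% bitcoin commission come from the problem statement; the price of 500 is a made-up value.

```python
cash = 1000.0
price = 500.0   # hypothetical bitcoin price
tc = 0.02       # 2% bitcoin commission from the problem statement

# Buy: half of the cash leaves the account, commission included
budget = cash * 0.5
invested = budget / (1 + tc)     # amount that actually buys bitcoin
units = invested / price
cash -= invested * (1 + tc)      # equals the full budget of 500

# Sell everything at the same price: commission is taken from the proceeds
proceeds = units * price * (1 - tc)
cash += proceeds
units = 0.0

print(f"cash after round trip: {cash:.2f}")
```

With an unchanged price, the round trip loses a fraction 1 - 0.98/1.02 of the 500 committed, or about 3.9%, which is the hurdle a bitcoin signal has to clear before it makes money.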

Evaluate Strategy: Analyze the performance of the trading strategy by calculating total returns, Sharpe ratio, and drawdowns. Compare the strategy against a buy-and-hold approach to assess its effectiveness. The evaluation function is defined as follows:

\[\text{Sharpe Ratio} = \frac{\text{Mean Returns}}{\text{Standard Deviation of Returns}} \times \sqrt{252}\]

where:

$\text{Mean Returns}$ is the average daily return.
$\text{Standard Deviation of Returns}$ is the standard deviation of daily returns.
$252$ is the number of trading days in a year.
$\text{Cumulative Returns}$ is the running product of $(1 + \text{daily return})$ over time.
$\text{Max Drawdown}$ is the maximum loss from a peak to a trough in the cumulative returns.

def evaluate_strategy(returns):
    sharpe_ratio = returns.mean() / returns.std() * np.sqrt(252)
    cumulative_returns = (1 + returns).cumprod()
    max_drawdown = ((cumulative_returns.cummax() - cumulative_returns) / cumulative_returns.cummax()).max()
    return sharpe_ratio, max_drawdown
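
A short worked example may help in checking both metrics by hand. The function below uses the same formulas as above with a zero risk-free rate; the `periods_per_year` argument is a hypothetical addition, and the four returns are made up.

```python
import numpy as np
import pandas as pd

def evaluate_strategy(returns, periods_per_year=252):
    # Same formulas as above; periods_per_year is an added, hypothetical parameter
    sharpe_ratio = returns.mean() / returns.std() * np.sqrt(periods_per_year)
    cumulative = (1 + returns).cumprod()
    max_drawdown = ((cumulative.cummax() - cumulative) / cumulative.cummax()).max()
    return sharpe_ratio, max_drawdown

returns = pd.Series([0.10, -0.20, 0.05, 0.10])   # hypothetical daily returns
sharpe, max_dd = evaluate_strategy(returns)
print(f"Sharpe: {sharpe:.2f}, max drawdown: {max_dd:.0%}")
```

The cumulative value rises to 1.10, falls to 0.88, and never regains the peak, so the maximum drawdown is 20%. Since the merged price table has a row for every calendar day, 365 may be a more consistent annualization factor than 252; treat that as a judgment call rather than a correction.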

Optimize the Strategy: Experiment with different parameter values (e.g., lookback periods for moving averages) to find the configuration that yields the best performance metrics. The optimization function is defined as follows:

\[\text{Optimize Strategy} = \text{argmax}_{\text{short}, \text{long}} \text{Sharpe Ratio}\]

where $\text{short}$ and $\text{long}$ are the lookback periods for the moving averages.
The optimization function iterates over different combinations of short and long lookback periods to find the optimal strategy.
The best strategy is the one with the highest Sharpe ratio.

# Optimize Strategy
def optimize_strategy(df, short_lookbacks, long_lookbacks, bitcoin_tc=0.02, gold_tc=0.01):
    best_sharpe = -np.inf
    best_strategy = None
    
    for short in short_lookbacks:
        for long in long_lookbacks:
            if short >= long:
                continue
            bitcoin_signal = ma_crossover(df['bitcoin'], short, long)
            gold_signal = ma_crossover(df['gold'], short, long)
            portfolio = backtest(df, bitcoin_signal, gold_signal, bitcoin_tc, gold_tc)
            sharpe, drawdown = evaluate_strategy(portfolio['returns'])
            if sharpe > best_sharpe:
                best_sharpe = sharpe
                best_strategy = (short, long)
                
    return best_strategy, best_sharpe
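
The same search pattern can be sketched in a few self-contained lines. To keep it short, the example below swaps in a simplified, commission-free, long-only crossover on synthetic prices instead of the full `backtest`; the function `crossover_sharpe`, the price series, and the grid values are all assumptions for illustration.

```python
import itertools
import numpy as np
import pandas as pd

def crossover_sharpe(prices, short, long, periods_per_year=252):
    """Annualized Sharpe ratio of a long-only crossover; ignores commissions."""
    position = (prices.rolling(short).mean() > prices.rolling(long).mean()).astype(float)
    # shift(1): yesterday's signal earns today's return, which avoids look-ahead
    strategy_returns = prices.pct_change() * position.shift(1)
    std = strategy_returns.std()
    if not std > 0:   # no variation in returns (also catches NaN)
        return -np.inf
    return strategy_returns.mean() / std * np.sqrt(periods_per_year)

# Synthetic random-walk prices, for illustration only
rng = np.random.default_rng(0)
prices = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.001, 0.02, 500))))

grid = [(s, l) for s, l in itertools.product([5, 10, 20], [20, 50, 100]) if s < l]
scores = {pair: crossover_sharpe(prices, *pair) for pair in grid}
best = max(scores, key=scores.get)
print(best, round(scores[best], 2))
```

Keeping every score in a dictionary, rather than only the running best, makes it easy to inspect how flat or peaked the Sharpe surface is around the winning pair.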

Analyze Sensitivity: Conduct sensitivity analysis to understand how changes in transaction costs or asset allocation affect the strategy’s performance. The sensitivity analysis function is defined as follows:

# Analyze Sensitivity to Transaction Costs  
def transaction_cost_sensitivity(df, best_strategy, bitcoin_commissions, gold_commissions):
    results = []
    short, long = best_strategy
    for btc_comm in bitcoin_commissions:
        for gold_comm in gold_commissions:
            bitcoin_signal = ma_crossover(df['bitcoin'], short, long)
            gold_signal = ma_crossover(df['gold'], short, long)
            portfolio = backtest(df, bitcoin_signal, gold_signal, btc_comm, gold_comm)
            sharpe, drawdown = evaluate_strategy(portfolio['returns'])
            results.append((btc_comm, gold_comm, sharpe, drawdown))
    return pd.DataFrame(results, columns=['Bitcoin Commission', 'Gold Commission', 'Sharpe Ratio', 'Max Drawdown'])

Solution

Now it is time to implement the solution. We will start by loading the price data for gold and bitcoin and performing exploratory data analysis. We will then develop trading strategies based on moving averages and mean reversion. Finally, we will backtest the strategies, optimize the parameters, and evaluate the performance.

First, we import the necessary libraries and load the price data for gold and bitcoin.

Step 1: Import Libraries and Load Data

# Essential Libraries
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats as stats
import warnings

# Libraries for Time Series Analysis
from statsmodels.tsa.stattools import grangercausalitytests
import pywt
import talib
from hmmlearn import hmm

# Setting up the matplotlib style
plt.rcParams['figure.figsize'] = (16, 12)
plt.rcParams['font.size'] = 14
plt.rcParams['legend.fontsize'] = 14
plt.rcParams['legend.title_fontsize'] = 16
plt.rcParams['axes.labelsize'] = 16
plt.rcParams['axes.titlesize'] = 18
plt.rcParams['lines.linewidth'] = 2
plt.rcParams['lines.markersize'] = 12
plt.rcParams['axes.grid'] = True
plt.rcParams['axes.linewidth'] = 1.5
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False
plt.rcParams['xtick.major.size'] = 10
plt.rcParams['xtick.major.width'] = 1.5
plt.rcParams['ytick.major.size'] = 10
plt.rcParams['ytick.major.width'] = 1.5

# Ignore warnings
warnings.filterwarnings('ignore')

# Read in data
bitcoin_data = pd.read_csv('BCHAIN-MKPRU.csv', parse_dates=['Date'], index_col='Date') 
gold_data = pd.read_csv('LBMA-GOLD.csv', parse_dates=['Date'], index_col='Date')

Step 2: Data Preprocessing

We then merge the data together and fill any missing values.

# Merge data
df = pd.merge(bitcoin_data, gold_data, left_index=True, right_index=True, how='left')
df.columns = ['bitcoin', 'gold']
df = df.ffill() # Fill missing values (e.g. gold on weekends and holidays) with the previous value
df = df.dropna() # Drop remaining missing values
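
Forward-filling gives gold a price on days when its market is closed, which is convenient for valuation but hides the trading schedule the problem asks us to respect. A minimal sketch of one way to keep that information is shown below; the prices, dates, and the `gold_open` column are hypothetical and are not used elsewhere in this post.

```python
import pandas as pd

# Hypothetical prices: bitcoin quotes every day, gold skips the weekend
dates = pd.date_range("2021-09-02", periods=5, freq="D")   # Thu .. Mon
bitcoin = pd.DataFrame({"bitcoin": [100.0, 101.0, 102.0, 103.0, 104.0]}, index=dates)
gold = pd.DataFrame({"gold": [50.0, 51.0, 52.0]}, index=dates[[0, 1, 4]])

merged = bitcoin.join(gold, how="left")
merged["gold_open"] = merged["gold"].notna()   # record trading days before filling
merged["gold"] = merged["gold"].ffill()        # carry Friday's price over the weekend
print(merged)
```

A backtest could then skip gold orders on rows where `gold_open` is `False` while still valuing the gold position at the last known price.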

Step 3: Exploratory Data Analysis

Afterwards, we have to calculate the daily returns.

# Calculate daily returns
df['bitcoin_return'] = df['bitcoin'].pct_change()
df['gold_return'] = df['gold'].pct_change()

Next, we will visualize the price trends and daily returns for gold and bitcoin.

# Visualize price series
fig, ax = plt.subplots()
df[['bitcoin', 'gold']].plot(ax=ax, secondary_y='gold')
ax.set_title('Bitcoin vs Gold Price')
ax.set_xlabel('Date')
ax.set_ylabel('Bitcoin Price (USD)')
ax.right_ax.set_ylabel('Gold Price (USD/oz)')
plt.savefig('bitcoin_vs_gold_price.png')
plt.show()

The output is a plot showing the price trends for bitcoin and gold over time.

Bitcoin vs Gold Price

# Plot daily returns
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(16, 14))
fig.suptitle('Daily Returns')
df['bitcoin_return'].plot(ax=ax1, title='Bitcoin Daily Returns')
df['gold_return'].plot(ax=ax2, title='Gold Daily Returns')
plt.savefig('daily_returns.png')
plt.show()

The output is a plot showing the daily returns for bitcoin and gold.

Daily Returns

Next, we plot the distributions of daily returns for bitcoin and gold.

# Return distributions
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(18,10))
sns.histplot(df['bitcoin_return'], ax=ax1, kde=True, stat='density', bins=50)  
ax1.set_title('Distribution of Bitcoin Daily Returns')
sns.histplot(df['gold_return'], ax=ax2, kde=True, stat='density', bins=50)
ax2.set_title('Distribution of Gold Daily Returns')
plt.savefig('return_distributions.png')
plt.show()

print('Bitcoin Descriptive Stats:')
print(df['bitcoin_return'].describe())  
print('Gold Descriptive Stats:')
print(df['gold_return'].describe())

The output is a plot showing the distribution of daily returns for bitcoin and gold, along with descriptive statistics.

Return Distributions

Bitcoin Descriptive Stats:
count    1824.000000
mean        0.003247
std         0.041495
min        -0.391404
25%        -0.012492
50%         0.001464
75%         0.019092
max         0.218669
Name: bitcoin_return, dtype: float64
Gold Descriptive Stats:
count    1824.000000
mean        0.000193
std         0.007210
min        -0.051284
25%        -0.001756
50%         0.000000
75%         0.002234
max         0.052675
Name: gold_return, dtype: float64

The results show the descriptive statistics for daily returns of bitcoin and gold, including the mean, standard deviation, and percentiles.

Moreover, we run the Jarque-Bera test on the daily returns to check for normality. The Jarque-Bera test is a goodness-of-fit test that checks whether the skewness and kurtosis of the data match those of a normal distribution.

# Test for normality
print('\nBitcoin Normality Test:') 
print(stats.jarque_bera(df['bitcoin_return'].dropna()))
print('\nGold Normality Test:')
print(stats.jarque_bera(df['gold_return'].dropna()))

The output shows the Jarque-Bera test results for the daily returns of bitcoin and gold.

Bitcoin Normality Test:
SignificanceResult(statistic=4769.762560557226, pvalue=0.0)

Gold Normality Test:
SignificanceResult(statistic=5755.2708062375905, pvalue=0.0)

The results indicate that the daily returns for bitcoin and gold are not normally distributed, as the p-values are less than 0.05.
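
For readers who want to see where the statistic comes from, the sketch below computes it directly from the sample skewness S and kurtosis K as n/6 * (S^2 + (K - 3)^2 / 4). The helper name `jarque_bera_stat` and the five-point sample are assumptions for illustration; in practice `stats.jarque_bera` also returns the p-value.

```python
import numpy as np

def jarque_bera_stat(x):
    """Jarque-Bera statistic from the sample skewness and kurtosis."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d**2)
    skew = np.mean(d**3) / m2**1.5
    kurt = np.mean(d**4) / m2**2
    return n / 6 * (skew**2 + (kurt - 3) ** 2 / 4)

sample = [-2, -1, 0, 1, 2]   # hypothetical, symmetric so the skewness is 0
jb = jarque_bera_stat(sample)
print(round(jb, 4))
```

The statistic grows with the sample size, so with about 1800 observations even moderate excess kurtosis yields values in the thousands, as in the output above.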

Furthermore, we visualize the autocorrelation of daily returns for bitcoin and gold.

# Autocorrelation plots  
fig, (ax1, ax2) = plt.subplots(2,1,figsize=(16,14))
pd.plotting.autocorrelation_plot(df['bitcoin_return'], ax=ax1)
ax1.set_title('Bitcoin Autocorrelation')
pd.plotting.autocorrelation_plot(df['gold_return'], ax=ax2) 
ax2.set_title('Gold Autocorrelation')
plt.savefig('autocorrelation.png')
plt.show()

The output is a plot showing the autocorrelation of daily returns for bitcoin and gold.

Autocorrelation

We then calculate the correlation between bitcoin and gold returns.

# Correlation between assets
print(f"\nCorrelation between bitcoin and gold returns: {df[['bitcoin_return', 'gold_return']].corr().iloc[0,1]:.2f}")

The output shows the correlation between bitcoin and gold returns.

Correlation between bitcoin and gold returns: 0.05

The correlation between bitcoin and gold returns is 0.05, a very weak positive relationship. Because the two assets move almost independently, holding both can provide diversification benefits.

We also visualize the rolling correlation between bitcoin and gold returns.

# Rolling correlation
rolling_corr = df[['bitcoin_return', 'gold_return']].rolling(180).corr().unstack()['bitcoin_return']['gold_return'] 
rolling_corr.plot(title='180-Day Rolling Correlation of Bitcoin and Gold Returns')
plt.savefig('rolling_correlation.png')
plt.show()

The output is a plot showing the rolling correlation between bitcoin and gold returns over a 180-day window.

Rolling Correlation

We then analyze the rolling volatility of bitcoin and gold returns.

# Volatility analysis
df['bitcoin_vol'] = df['bitcoin_return'].rolling(30).std() * np.sqrt(30) 
df['gold_vol'] = df['gold_return'].rolling(30).std() * np.sqrt(30)

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(16,14))  
df['bitcoin_vol'].plot(ax=ax1, title='30-Day Rolling Volatility of Bitcoin Returns')
df['gold_vol'].plot(ax=ax2, title='30-Day Rolling Volatility of Gold Returns')
plt.savefig('rolling_volatility.png')
plt.show()

The output is a plot showing the rolling volatility of bitcoin and gold returns over a 30-day window.

Rolling Volatility

Step 4: Granger Causality Test

Equally important, we perform the Granger causality test on bitcoin and gold returns. Granger causality is a statistical concept that asks whether the past values of one time series help predict another time series beyond what that series' own past already explains.

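# Perform Granger causality tests
# Note: grangercausalitytests checks whether the SECOND column (gold_return)
# Granger-causes the FIRST column (bitcoin_return). Both print lines below read
# that same result; testing bitcoin -> gold needs a second call with the columns swapped.
df_clean = df[['bitcoin_return', 'gold_return']].replace([np.inf, -np.inf], np.nan).dropna()
gc_result = grangercausalitytests(df_clean, maxlag=5)
print("Granger causality test results:")
for lag, results in gc_result.items():
    print(f"Lag {lag}:")
    print(f"  Bitcoin -> Gold: p-value = {results[0]['ssr_ftest'][1]:.4f}")
    print(f"  Gold -> Bitcoin: p-value = {results[0]['ssr_ftest'][1]:.4f}")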

The output is as follows:

Granger Causality
number of lags (no zero) 1
ssr based F test:         F=13.0574 , p=0.0003  , df_denom=1820, df_num=1
ssr based chi2 test:   chi2=13.0789 , p=0.0003  , df=1
likelihood ratio test: chi2=13.0322 , p=0.0003  , df=1
parameter F test:         F=13.0574 , p=0.0003  , df_denom=1820, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=8.7358  , p=0.0002  , df_denom=1817, df_num=2
ssr based chi2 test:   chi2=17.5198 , p=0.0002  , df=2
likelihood ratio test: chi2=17.4361 , p=0.0002  , df=2
parameter F test:         F=8.7358  , p=0.0002  , df_denom=1817, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=6.6537  , p=0.0002  , df_denom=1814, df_num=3
ssr based chi2 test:   chi2=20.0381 , p=0.0002  , df=3
likelihood ratio test: chi2=19.9286 , p=0.0002  , df=3
parameter F test:         F=6.6537  , p=0.0002  , df_denom=1814, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=5.2568  , p=0.0003  , df_denom=1811, df_num=4
ssr based chi2 test:   chi2=21.1316 , p=0.0003  , df=4
likelihood ratio test: chi2=21.0098 , p=0.0003  , df=4
parameter F test:         F=5.2568  , p=0.0003  , df_denom=1811, df_num=4

Granger Causality
number of lags (no zero) 5
ssr based F test:         F=4.2630  , p=0.0007  , df_denom=1808, df_num=5
ssr based chi2 test:   chi2=21.4446 , p=0.0007  , df=5
likelihood ratio test: chi2=21.3192 , p=0.0007  , df=5
parameter F test:         F=4.2630  , p=0.0007  , df_denom=1808, df_num=5
Granger causality test results:
Lag 1:
  Bitcoin -> Gold: p-value = 0.0003
  Gold -> Bitcoin: p-value = 0.0003
Lag 2:
  Bitcoin -> Gold: p-value = 0.0002
  Gold -> Bitcoin: p-value = 0.0002
Lag 3:
  Bitcoin -> Gold: p-value = 0.0002
  Gold -> Bitcoin: p-value = 0.0002
Lag 4:
  Bitcoin -> Gold: p-value = 0.0003
  Gold -> Bitcoin: p-value = 0.0003
Lag 5:
  Bitcoin -> Gold: p-value = 0.0007
  Gold -> Bitcoin: p-value = 0.0007

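The Granger causality test is statistically significant at every lag from 1 to 5 (p < 0.001). Note that `grangercausalitytests` tests whether the second column Granger-causes the first, so these results say that past gold returns help predict bitcoin returns. The "Bitcoin -> Gold" lines repeat the same p-values; the reverse direction would need a separate test with the columns swapped.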
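
Because the direction of the test matters, it can help to see the lag-1 F test written out by hand. The sketch below compares a regression of the target on its own lag with one that also includes the lagged predictor. The function `granger_f_lag1` and the synthetic series `a` and `b` are assumptions for illustration; the sketch only covers one lag and does not compute a p-value.

```python
import numpy as np

def granger_f_lag1(target, predictor):
    """F statistic for 'predictor Granger-causes target' with one lag."""
    y, y_lag, x_lag = target[1:], target[:-1], predictor[:-1]
    ones = np.ones_like(y)
    restricted = np.column_stack([ones, y_lag])            # target's own past only
    unrestricted = np.column_stack([ones, y_lag, x_lag])   # plus the predictor's past

    def ssr(design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    # One restriction in the numerator, three fitted parameters in the denominator
    return (ssr_r - ssr_u) / (ssr_u / (len(y) - 3))

# Synthetic data: b follows a with a one-step delay, a ignores b
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = np.empty(500)
b[0] = 0.0
b[1:] = 0.8 * a[:-1] + 0.1 * rng.normal(size=499)

f_a_to_b = granger_f_lag1(b, a)
f_b_to_a = granger_f_lag1(a, b)
print(f"a -> b: F = {f_a_to_b:.1f}")
print(f"b -> a: F = {f_b_to_a:.1f}")
```

Swapping the arguments is what distinguishes the two directions, which is the manual counterpart of swapping the column order passed to the library function.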

Step 5: Wavelet Decomposition

We then perform wavelet decomposition to analyze the time-frequency characteristics of bitcoin and gold prices.

# Perform wavelet decomposition
bitcoin_coeffs = pywt.wavedec(df['bitcoin'], wavelet='db4', level=3)
gold_coeffs = pywt.wavedec(df['gold'], wavelet='db4', level=3)

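# Plot wavelet coefficients. wavedec returns [cA3, cD3, cD2, cD1], so index 0 is the
# level-3 approximation and indices 1-3 are the detail coefficients, coarsest to finest.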
fig, axs = plt.subplots(2, 4, figsize=(18, 10))
for i in range(4):
    axs[0,i].plot(bitcoin_coeffs[i], label=f"Level {i}")
    axs[1,i].plot(gold_coeffs[i], label=f"Level {i}")
    axs[0,i].legend()
    axs[1,i].legend()
axs[0,0].set_title("Bitcoin Wavelet Coefficients")    
axs[1,0].set_title("Gold Wavelet Coefficients")
plt.tight_layout()
plt.savefig('wavelet_decomposition.png')
plt.show()

The output is a plot showing the wavelet coefficients of the bitcoin and gold price series: the level-3 approximation followed by the detail coefficients at levels 3, 2, and 1.
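
To build some intuition for what approximation and detail coefficients are, the sketch below performs a single level of the Haar transform in plain NumPy. Haar is swapped in here for `db4` because it reduces to pairwise sums and differences; the helper `haar_step` and the four prices are hypothetical.

```python
import numpy as np

def haar_step(signal):
    """One level of the Haar transform: scaled pairwise sums and differences."""
    x = np.asarray(signal, dtype=float)
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)   # smooth trend
    detail = (even - odd) / np.sqrt(2)   # local fluctuation
    return approx, detail

prices = [4.0, 6.0, 10.0, 12.0]          # hypothetical, even length
approx, detail = haar_step(prices)
print(approx.round(3), detail.round(3))
```

Applying the same step again to `approx` gives the next, coarser level, which is how a multi-level decomposition ends up with one approximation array and one detail array per level.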

Wavelet Decomposition

Step 6: Technical Indicators

We then calculate technical indicators such as moving averages and relative strength index (RSI) for bitcoin and gold prices.

SMA (Simple Moving Average): A simple moving average is calculated over a specified time period to smooth out price fluctuations.
RSI (Relative Strength Index): RSI is a momentum oscillator that measures the speed and change of price movements.

# Calculate technical indicators
df['SMA_50'] = talib.SMA(df['bitcoin'], timeperiod=50)
df['SMA_200'] = talib.SMA(df['bitcoin'], timeperiod=200)
df['RSI'] = talib.RSI(df['bitcoin'])

# Define trading rules
df['SMA_Signal'] = np.where(df['SMA_50'] > df['SMA_200'], 1, 0)  
df['RSI_Signal'] = np.where(df['RSI'] < 30, 1, np.where(df['RSI'] > 70, -1, 0))

# Backtest each rule
sma_returns = df['bitcoin_return'] * df['SMA_Signal'].shift(1)
rsi_returns = df['bitcoin_return'] * df['RSI_Signal'].shift(1)

print(f"SMA strategy return: {sma_returns.sum():.2%}")
print(f"RSI strategy return: {rsi_returns.sum():.2%}")

The output shows the returns generated by the moving average (SMA) and RSI trading strategies for bitcoin prices.

SMA strategy return: 455.46%
RSI strategy return: -195.98%

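The SMA strategy produced a summed daily return of 455.46%, while the RSI strategy produced -195.98%. These figures are sums of daily simple returns rather than compounded returns, which is why the RSI figure can fall below -100%. They are best read as a rough ranking of the two rules, not as realized portfolio returns.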
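
Since the figures above come from calling `.sum()` on daily returns, a two-day toy example shows how far summing can drift from compounding. The return values are made up.

```python
import pandas as pd

daily = pd.Series([0.5, -0.5])         # hypothetical: +50% then -50%

summed = daily.sum()                   # looks like break-even
compounded = (1 + daily).prod() - 1    # 1.5 * 0.5 - 1

print(f"summed: {summed:.2%}, compounded: {compounded:.2%}")
```

The summed figure is 0% while the position has actually lost a quarter of its value, so compounding with `(1 + r).prod() - 1` is the safer way to report a total return.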

Step 7: Hidden Markov Model

In this step, we fit a two-state Gaussian Hidden Markov Model (HMM) to the daily returns of bitcoin and gold in order to identify market regimes.

# Train HMM on returns
clean_returns = df[['bitcoin_return', 'gold_return']].dropna()
model = hmm.GaussianHMM(n_components=2, covariance_type="full", n_iter=100) 
model.fit(clean_returns)

# Decode most likely sequence of regimes
regimes = model.predict(clean_returns)
df.loc[clean_returns.index, 'Regime'] = regimes

# Evaluate strategy performance by regime
for r in range(model.n_components):
    regime_returns = df[df['Regime']==r]['bitcoin_return']
    print(f"Regime {r} bitcoin returns: {regime_returns.mean():.2%}")

The output shows the average bitcoin returns for each regime identified by the Hidden Markov Model.

Regime 0 bitcoin returns: 0.21%
Regime 1 bitcoin returns: 0.89%

The results indicate that regime 1 has higher average bitcoin returns compared to regime 0. To further analyze the performance, we can backtest trading strategies based on the identified regimes.

Step 8: Stress Testing

Before we implement our trading strategy, we run a simple stress test: we take a 1000-dollar buy-and-hold bitcoin position and replace a random 1% of its daily returns with an extreme value to see how it behaves under different market conditions.

# Define stress scenarios  
scenarios = [
    ('Crash', df['bitcoin_return'].min()),  
    ('Rally', df['bitcoin_return'].max()),
    ('Spike', df['bitcoin_return'].mean() + 3*df['bitcoin_return'].std())
]

# Simulate portfolio performance under each scenario
for name, shock in scenarios:
    scenario_returns = df['bitcoin_return'].copy()
    scenario_returns[df.sample(frac=0.01).index] = shock  
    portfolio = 1000 * (1 + scenario_returns).cumprod()
    print(f"{name} scenario:")
    print(f"  Final portfolio value: ${portfolio.iloc[-1]:.2f}")
    print(f"  Max drawdown: {(portfolio / portfolio.cummax() - 1).min():.2%}")

The output shows the final portfolio value and maximum drawdown under different stress scenarios.

Crash scenario:
  Final portfolio value: $12.83
  Max drawdown: -99.72%
Rally scenario:
  Final portfolio value: $2828341.59
  Max drawdown: -65.05%
Spike scenario:
  Final portfolio value: $614031.33
  Max drawdown: -66.57%

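Because each scenario replaces a random 1% of days with the shock value, the crash scenario nearly wipes out the position (a final value of about 13 dollars and a drawdown of 99.72%), while the rally and spike scenarios end far higher but still pass through drawdowns of roughly 65%.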
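
The shocked days are drawn at random on every call, so two runs (or the printout and a later plot) will generally not hit the same days. One way to make the scenarios comparable is to draw the days once with a fixed seed, as in the sketch below. The synthetic returns, the seed, and the helper `shocked_path` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
returns = pd.Series(rng.normal(0.001, 0.02, 1000))   # synthetic daily returns

# Draw the shocked days once so every scenario and plot reuses the same days
shock_days = returns.sample(frac=0.01, random_state=42).index

def shocked_path(returns, shock, days, start=1000.0):
    """Portfolio value when the returns on the given days are replaced by shock."""
    scenario = returns.copy()
    scenario.loc[days] = shock
    return start * (1 + scenario).cumprod()

crash = shocked_path(returns, returns.min(), shock_days)
rally = shocked_path(returns, returns.max(), shock_days)
print(f"crash: {crash.iloc[-1]:.2f}, rally: {rally.iloc[-1]:.2f}")
```

With the days fixed, any difference between scenarios comes from the size of the shock alone rather than from which days happened to be sampled.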

We also visualize the portfolio performance under different stress scenarios.

# Plot portfolio performance under each scenario
fig, ax = plt.subplots()
for name, shock in scenarios:
    scenario_returns = df['bitcoin_return'].copy()
    scenario_returns[df.sample(frac=0.01).index] = shock
    portfolio = 1000 * (1 + scenario_returns).cumprod()
    portfolio.plot(ax=ax, label=name)
ax.legend()
ax.set_title('Portfolio Performance Under Stress Scenarios')
ax.set_xlabel('Date')
ax.set_ylabel('Portfolio Value ($)')
plt.savefig('portfolio_stress_scenarios.png')
plt.show()

The output is a plot showing the portfolio performance under crash, rally, and spike scenarios.

Portfolio Stress Scenarios

Step 9: Trading Strategy

Finally, we implement the trading strategy based on moving averages and backtest the strategy using historical data.

But before that, we do some feature engineering: lagged returns, rolling volatility, and the rolling correlation between gold and bitcoin returns.

# Feature Engineering
def create_features(df):
    df['gold_return'] = df['gold'].pct_change()
    df['bitcoin_return'] = df['bitcoin'].pct_change()
    
    for n in [1, 5, 20]:
        df[f'gold_return_lag{n}'] = df['gold_return'].shift(n)
        df[f'bitcoin_return_lag{n}'] = df['bitcoin_return'].shift(n)
        
    df['gold_volatility'] = df['gold_return'].rolling(20).std()
    df['bitcoin_volatility'] = df['bitcoin_return'].rolling(20).std()
    
    df['gold_bitcoin_corr'] = df[['gold_return', 'bitcoin_return']].rolling(20).corr().unstack()['gold_return']['bitcoin_return']
    
    return df

Then we define the universe of trading rules as mentioned earlier.

# Define Universe of Trading Rules
def ma_crossover(prices, short_lookback, long_lookback):
    short_mavg = prices.rolling(window=short_lookback, min_periods=1).mean()
    long_mavg = prices.rolling(window=long_lookback, min_periods=1).mean()
    signal = np.where(short_mavg > long_mavg, 1, 0)
    return pd.Series(signal, index=prices.index).diff()

def mean_reversion(prices, lookback):
    mavg = prices.rolling(window=lookback, min_periods=1).mean()
    std = prices.rolling(window=lookback, min_periods=1).std()
    signal = np.where(prices < mavg - std, 1, np.where(prices > mavg + std, -1, 0))
    return pd.Series(signal, index=prices.index).fillna(0).diff()

We also implement the backtesting function to simulate trades based on the generated signals.

# Backtest Trading Rules
def backtest(df, bitcoin_signal, gold_signal, bitcoin_tc=0.02, gold_tc=0.01, initial_cash=1000):
    cash = initial_cash
    bitcoin = 0
    gold = 0
    portfolio = pd.DataFrame(index=df.index, columns=['cash', 'bitcoin', 'gold', 'total', 'returns']).fillna(0.0)
    
    for date, row in df.iterrows():
        # Check if current prices are valid and not NaN
        if pd.isnull(row['bitcoin']) or pd.isnull(row['gold']):
            continue
        
        # Bitcoin trading logic
        if bitcoin_signal.loc[date] == 1:
            allowable_bitcoin = (cash * 0.5) / (1 + bitcoin_tc)
            bitcoin_trade = allowable_bitcoin / row['bitcoin']
            bitcoin += bitcoin_trade
            cash -= (bitcoin_trade * row['bitcoin']) * (1 + bitcoin_tc)
        
        elif bitcoin_signal.loc[date] == -1:
            cash += bitcoin * row['bitcoin'] * (1 - bitcoin_tc)
            bitcoin = 0
            
        # Gold trading logic
        if gold_signal.loc[date] == 1:
            allowable_gold = (cash * 0.5) / (1 + gold_tc)  
            gold_trade = allowable_gold / row['gold']
            gold += gold_trade
            cash -= (gold_trade * row['gold']) * (1 + gold_tc)
        
        elif gold_signal.loc[date] == -1:
            cash += gold * row['gold'] * (1 - gold_tc)
            gold = 0
        
        # Update portfolio values
        portfolio.loc[date, 'cash'] = max(cash, 0)  # Ensure cash is non-negative
        portfolio.loc[date, 'bitcoin'] = bitcoin * row['bitcoin']
        portfolio.loc[date, 'gold'] = gold * row['gold']
        portfolio.loc[date, 'total'] = max(cash + bitcoin * row['bitcoin'] + gold * row['gold'], 0)  # Ensure total is non-negative
    
    # Calculate returns; replace NaN and infinite values (e.g. from a zero previous total) with 0
    portfolio['returns'] = portfolio['total'].pct_change().fillna(0).replace([np.inf, -np.inf], 0)
    return portfolio

We evaluate the trading strategy by computing the annualized Sharpe ratio and the maximum drawdown.

def evaluate_strategy(returns):
    sharpe_ratio = returns.mean() / returns.std() * np.sqrt(252)
    cumulative_returns = (1 + returns).cumprod()
    max_drawdown = ((cumulative_returns.cummax() - cumulative_returns) / cumulative_returns.cummax()).max()
    return sharpe_ratio, max_drawdown

We then optimize the trading strategy by experimenting with different parameter values to find the best configuration.

# Optimize Strategy
def optimize_strategy(df, short_lookbacks, long_lookbacks, bitcoin_tc=0.02, gold_tc=0.01):
    best_sharpe = -np.inf
    best_strategy = None
    
    for short in short_lookbacks:
        for long in long_lookbacks:
            if short >= long:
                continue
            bitcoin_signal = ma_crossover(df['bitcoin'], short, long)
            gold_signal = ma_crossover(df['gold'], short, long)
            portfolio = backtest(df, bitcoin_signal, gold_signal, bitcoin_tc, gold_tc)
            sharpe, drawdown = evaluate_strategy(portfolio['returns'])
            if sharpe > best_sharpe:
                best_sharpe = sharpe
                best_strategy = (short, long)
                
    return best_strategy, best_sharpe

Lastly, we analyze the sensitivity of the trading strategy to transaction costs.

# Analyze Sensitivity to Transaction Costs  
def transaction_cost_sensitivity(df, best_strategy, bitcoin_commissions, gold_commissions):
    results = []
    short, long = best_strategy
    for btc_comm in bitcoin_commissions:
        for gold_comm in gold_commissions:
            bitcoin_signal = ma_crossover(df['bitcoin'], short, long)
            gold_signal = ma_crossover(df['gold'], short, long)
            portfolio = backtest(df, bitcoin_signal, gold_signal, btc_comm, gold_comm)
            sharpe, drawdown = evaluate_strategy(portfolio['returns'])
            results.append((btc_comm, gold_comm, sharpe, drawdown))
    return pd.DataFrame(results, columns=['Bitcoin Commission', 'Gold Commission', 'Sharpe Ratio', 'Max Drawdown'])

Step 10: Implement Trading Strategy

We then run the trading strategy by optimizing the parameters and evaluating the performance.

# Run the analysis
df = create_features(df)
short_lookbacks = [1, 5, 10, 15, 20, 25, 30, 35, 40]  # Define a range of lookback periods for the short moving average
long_lookbacks = [20, 40, 60, 80, 100, 120, 240] # Define a range of lookback periods for the long moving average
bitcoin_tc = 0.02 # Provided by the problem statement
gold_tc = 0.01 # Provided by the problem statement

best_strategy, best_sharpe = optimize_strategy(df, short_lookbacks, long_lookbacks, bitcoin_tc, gold_tc)
print(f"Best strategy: Short lookback {best_strategy[0]}, Long lookback {best_strategy[1]}, Sharpe Ratio {best_sharpe:.2f}")

bitcoin_commissions = [0.01, 0.02, 0.03, 0.04] 
gold_commissions = [0.005, 0.01, 0.015, 0.02]
sensitivity_results = transaction_cost_sensitivity(df, best_strategy, bitcoin_commissions, gold_commissions)
print(sensitivity_results)

The output shows the best strategy parameters and the sensitivity analysis results for different transaction costs.

Best strategy: Short lookback 15, Long lookback 80, Sharpe Ratio 1.27

And we have the following table for the sensitivity analysis results:

Bitcoin Commission  Gold Commission  Sharpe Ratio  Max Drawdown
              0.01            0.005      1.334810      0.541443
              0.01            0.010      1.316392      0.547868
              0.01            0.015      1.298116      0.554155
              0.01            0.020      1.279986      0.560306
              0.02            0.005      1.290132      0.555051
              0.02            0.010      1.271645      0.561316
              0.02            0.015      1.253305      0.567445
              0.02            0.020      1.235117      0.573442
              0.03            0.005      1.245208      0.568161
              0.03            0.010      1.226665      0.574272
              0.03            0.015      1.208274      0.580249
              0.03            0.020      1.190039      0.586096
              0.04            0.005      1.200122      0.580795
              0.04            0.010      1.181534      0.586757
              0.04            0.015      1.163103      0.592588
              0.04            0.020      1.144835      0.598291

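The results show the Sharpe ratio and maximum drawdown for different transaction costs for bitcoin and gold. Across all tested commission levels, the Sharpe Ratio stays above 1.1 and the maximum drawdown stays between roughly 54% and 60%. Performance therefore degrades gradually rather than collapsing as costs rise, which suggests the strategy is reasonably robust to the commission assumptions, although the drawdown is large in every scenario.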
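
One way to summarize the sensitivity table is to ask how much the Sharpe ratio changes per percentage point of commission. The sketch below re-enters the Sharpe ratios from the table above as a grid and averages the step-to-step differences. The variable names are introduced here for illustration and are not part of the original analysis code.

```python
import pandas as pd

# Sharpe ratios copied from the sensitivity table
# (rows: bitcoin commission, columns: gold commission)
sharpe = pd.DataFrame(
    [[1.334810, 1.316392, 1.298116, 1.279986],
     [1.290132, 1.271645, 1.253305, 1.235117],
     [1.245208, 1.226665, 1.208274, 1.190039],
     [1.200122, 1.181534, 1.163103, 1.144835]],
    index=pd.Index([0.01, 0.02, 0.03, 0.04], name="bitcoin_commission"),
    columns=pd.Index([0.005, 0.010, 0.015, 0.020], name="gold_commission"),
)

# Bitcoin commissions step by 0.01 (one percentage point), so the row-to-row
# difference is already the change per percentage point
btc_per_pp = sharpe.diff(axis=0).iloc[1:].to_numpy().mean()

# Gold commissions step by 0.005 (half a percentage point), so double the
# column-to-column difference to express it per percentage point
gold_per_pp = sharpe.diff(axis=1).iloc[:, 1:].to_numpy().mean() * 2

print(f"Sharpe change per 1pp bitcoin commission: {btc_per_pp:.3f}")
print(f"Sharpe change per 1pp gold commission:    {gold_per_pp:.3f}")
```

On these numbers, each extra percentage point of bitcoin commission costs roughly 0.045 of Sharpe ratio, and each extra percentage point of gold commission roughly 0.037, so over the tested range the effect is close to linear and of similar size for both assets.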

We then implement our optimized trading strategy and evaluate its performance.

# Optimal strategy
short, long = best_strategy
bitcoin_signal = ma_crossover(df['bitcoin'], short, long)
gold_signal = ma_crossover(df['gold'], short, long)
portfolio = backtest(df, bitcoin_signal, gold_signal, bitcoin_tc, gold_tc)
sharpe, drawdown = evaluate_strategy(portfolio['returns'])

# Plot portfolio performance
fig, ax = plt.subplots()
portfolio['total'].plot(ax=ax)
ax.set_title('Portfolio Performance')
ax.set_xlabel('Date')
ax.set_ylabel('Portfolio Value ($)')
ax.legend()
plt.savefig('portfolio_performance.png')
plt.show()

The output is a plot showing the portfolio performance of the optimized trading strategy.

[Figure: Portfolio Performance (portfolio value in dollars plotted against date)]

Step 11: Calculate the Final Results

Finally, we calculate the total returns, Sharpe ratio, and maximum drawdown of the optimized trading strategy.

# Print strategy performance
print(f"Sharpe Ratio: {sharpe:.2f}")
print(f"Max Drawdown: {drawdown:.2%}")
print(f"Final Portfolio Date: {portfolio.index[-1]}")
print(f"Final Portfolio Value: ${portfolio['total'].iloc[-1]:.2f}")

The output shows the Sharpe ratio, maximum drawdown, final portfolio date, and value of the optimized trading strategy.

Sharpe Ratio: 1.27
Max Drawdown: 56.13%
Final Portfolio Date: 2021-09-10 00:00:00
Final Portfolio Value: $15789.84

Therefore, based on our analysis, the optimized trading strategy achieved a Sharpe ratio of $1.27$ with a maximum drawdown of $56.13\%$ and a final portfolio value of $$15789.84$. Since our initial investment was $$1000$, the portfolio grew roughly 15.8-fold over the five-year period.
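
To put the final value in more familiar terms, the short calculation below converts it into a total return and a compound annual growth rate (CAGR). It assumes the trading window from 9/11/2016 to 9/10/2021 can be treated as exactly five years; the variable names are illustrative.

```python
initial_value = 1000.00   # starting cash from the problem statement
final_value = 15789.84    # final portfolio value reported above
years = 5                 # assumption: treat the trading window as five full years

# Total return over the whole period
total_return = final_value / initial_value - 1

# Compound annual growth rate: the constant yearly rate that would
# turn initial_value into final_value over `years` years
cagr = (final_value / initial_value) ** (1 / years) - 1

print(f"Total return: {total_return:.2%}")
print(f"CAGR: {cagr:.2%}")
```

Under this assumption the strategy's growth corresponds to a compound rate of roughly 74% per year, which should be weighed against the 56.13% maximum drawdown experienced along the way.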

Discussion

The results of our trading strategy highlight several important aspects of algorithmic trading using historical data:

Performance Metrics: The optimized strategy achieved a Sharpe Ratio of 1.27, indicating a good balance between risk and return. This metric suggests that the strategy is capable of generating returns that adequately compensate for the risk taken.
Drawdown Analysis: With a maximum drawdown of 56.13%, the strategy encountered significant periods of decline, which is a critical consideration for risk management. A drawdown of this size is not unusual for strategies exposed to high-volatility assets like bitcoin, but it means a trader must be able to tolerate losing more than half of the portfolio's peak value.
Transaction Costs Sensitivity: The sensitivity analysis demonstrated how varying transaction costs impact strategy performance. As transaction costs increase, the Sharpe Ratio falls (from 1.33 at the lowest tested commissions to 1.14 at the highest) and the maximum drawdown rises (from 54.1% to 59.8%), underscoring the importance of minimizing trading expenses to enhance profitability.
Asset Characteristics: The weak correlation between gold and bitcoin returns (0.05) suggests potential diversification benefits, allowing for risk reduction without sacrificing returns. However, their distinct volatility profiles require careful balancing in the portfolio.
Model Robustness: The strategy’s robustness was tested through stress scenarios, revealing its capacity to withstand extreme market conditions like crashes and rallies. This aspect is crucial for ensuring that the strategy can adapt to real-world market fluctuations.
Limitations and Further Research: Despite its success, the strategy relies heavily on historical data and assumptions about future market behavior. Future research could explore more sophisticated models, such as machine learning algorithms, to enhance predictive accuracy and adaptability.
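
The diversification point under Asset Characteristics can be illustrated with the standard two-asset portfolio volatility formula. In the sketch below, only the correlation of 0.05 comes from this post; the two volatility figures and the 50/50 weighting are hypothetical values chosen for illustration and are not estimated from the data or used by the strategy itself.

```python
import math

def two_asset_volatility(w_gold, vol_gold, vol_btc, rho):
    """Volatility of a portfolio holding w_gold in gold and the rest in bitcoin."""
    w_btc = 1.0 - w_gold
    variance = (
        (w_gold * vol_gold) ** 2
        + (w_btc * vol_btc) ** 2
        + 2 * w_gold * w_btc * rho * vol_gold * vol_btc
    )
    return math.sqrt(variance)

rho = 0.05        # gold/bitcoin return correlation quoted in the discussion
vol_gold = 0.15   # hypothetical annualized volatility for gold
vol_btc = 0.80    # hypothetical annualized volatility for bitcoin

mixed = two_asset_volatility(0.5, vol_gold, vol_btc, rho)
weighted_average = 0.5 * vol_gold + 0.5 * vol_btc

print(f"50/50 portfolio volatility: {mixed:.3f}")
print(f"Weighted average of volatilities: {weighted_average:.3f}")
```

With a correlation this close to zero, the mixed portfolio's volatility comes out below the simple weighted average of the two volatilities, which is the diversification benefit the discussion refers to. The size of the benefit depends entirely on the assumed inputs.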

Conclusion

In conclusion, this project illustrates the power of systematic trading strategies in managing risk and optimizing returns across volatile markets like gold and bitcoin. By leveraging historical data, mathematical models, and rigorous backtesting, we developed a robust trading strategy that significantly increased portfolio value from an initial $$1,000$ to $$15,789.84$ over five years.

The journey from data preprocessing to strategy optimization underscores the importance of thorough analysis and careful parameter tuning in achieving desirable outcomes. While our approach yielded positive results, continuous monitoring and adaptation are essential to maintain competitiveness in dynamic financial markets.

Overall, this exercise demonstrates how combining traditional financial principles with modern computational tools can lead to effective investment strategies that cater to diverse market conditions. As financial markets evolve, so too must our strategies. Embracing innovation while adhering to sound risk management practices will be key to future success in algorithmic trading.


Sok Kin Cheng