# ---
# jupyter:
#   jupytext:
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.3'
#       jupytext_version: 1.17.1
#   kernelspec:
#     display_name: Python 3
#     language: python
#     name: python3
# ---

# %% [markdown]
# _Backtesting.py_ Quick Start User Guide
# =======================
#
# This tutorial shows some of the features of *backtesting.py*, a Python framework for [backtesting](https://www.investopedia.com/terms/b/backtesting.asp) trading strategies.
#
# _Backtesting.py_ is a small, lightweight, and fast backtesting framework built on the standard Python data stack (Python 3.6+, Pandas, NumPy, Bokeh). Its API is small and simple, so it is easy to remember and quick to shape towards meaningful results. The library _doesn't_ really support stock picking or trading strategies that rely on arbitrage or multi-asset portfolio rebalancing; instead, it works with one tradeable asset at a time and is best suited for optimizing position entry and exit signals and decisions based on the values of technical indicators. It is also a versatile tool for interactive trade visualization and statistics.
#
#
# ## Data
#
# _You bring your own data._ Backtesting ingests _all kinds of
# [OHLC](https://en.wikipedia.org/wiki/Open-high-low-close_chart)
# data_ (stocks, forex, futures, crypto, ...) as a
# [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/10min.html)
# with columns `'Open'`, `'High'`, `'Low'`, `'Close'` and (optionally) `'Volume'`.
# Such data is widely obtainable, e.g. with packages:
# * [pandas-datareader](https://pandas-datareader.readthedocs.io/en/latest/),
# * [Quandl](https://www.quandl.com/tools/python),
# * [findatapy](https://github.com/cuemacro/findatapy),
# * [yFinance](https://github.com/ranaroussi/yfinance),
# * [investpy](https://investpy.readthedocs.io/),
#   etc.
#
# Besides these columns, **your data frames can have additional columns which are accessible in your strategies in a similar manner**.
#
# The DataFrame should ideally be indexed with a _datetime index_ (convert it with [`pd.to_datetime()`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_datetime.html));
# otherwise a simple range index will do.
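
# %% [markdown]
# As a minimal sketch of the expected input shape, a conforming data frame can be assembled from plain records and given a datetime index. The prices and the name `tiny_ohlc` are made up for illustration; real data would come from one of the providers listed above.

# %%
import pandas as pd

# Hypothetical hand-written records standing in for downloaded data
records = [
    {'Date': '2024-01-02', 'Open': 10.0, 'High': 10.5, 'Low': 9.8, 'Close': 10.2, 'Volume': 1200},
    {'Date': '2024-01-03', 'Open': 10.2, 'High': 10.9, 'Low': 10.1, 'Close': 10.8, 'Volume': 1500},
    {'Date': '2024-01-04', 'Open': 10.8, 'High': 11.0, 'Low': 10.4, 'Close': 10.5, 'Volume': 900},
]
tiny_ohlc = pd.DataFrame.from_records(records).set_index('Date')
# Parse the string labels into a proper DatetimeIndex
tiny_ohlc.index = pd.to_datetime(tiny_ohlc.index)
tiny_ohlc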

# %%
# Example OHLC daily data for Google Inc.
from backtesting.test import GOOG

GOOG.tail()

# %% [markdown]
# External data can be joined to the strategy data frame before it is passed to
# `Backtest`. For example, the snippet below adds a daily sentiment column. If
# `ADANOS_API_KEY` is set, it tries to load Reddit stock sentiment from the
# [Adanos Market Sentiment API](https://api.adanos.org/reddit/stocks/v1/);
# otherwise, it uses a tiny sample series so the tutorial stays runnable offline.
#
# The resulting `Sentiment` column is available in a strategy as
# `self.data.Sentiment[-1]`, just like `self.data.Close[-1]`.

# %%
import json
import os
from urllib.error import HTTPError, URLError
from urllib.parse import urlencode
from urllib.request import Request, urlopen

import pandas as pd


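def adanos_daily_sentiment(index, ticker):
    """
    Return one sentiment value per row of `index` as a NumPy array.

    Uses the Adanos API when `ADANOS_API_KEY` is set and the request
    succeeds; otherwise falls back to a tiny built-in sample series.
    """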
    daily_index = pd.DatetimeIndex(index).normalize()
    sample = pd.Series(
        [.15, -.05, .22, .08, .18],
        index=pd.to_datetime([
            '2013-02-25',
            '2013-02-26',
            '2013-02-27',
            '2013-02-28',
            '2013-03-01',
        ]),
        name='Sentiment',
    )

    api_key = os.environ.get('ADANOS_API_KEY')
    if api_key:
        params = urlencode({
            'from': daily_index.min().date().isoformat(),
            'to': daily_index.max().date().isoformat(),
        })
        request = Request(
            f'https://api.adanos.org/reddit/stocks/v1/stock/{ticker}?{params}',
            headers={'X-API-Key': api_key},
        )
        try:
            with urlopen(request, timeout=10) as response:
                payload = json.load(response)
            rows = payload.get('daily_trend') or []
            sentiment = pd.Series(
                {pd.Timestamp(row['date']): float(row.get('sentiment_score') or 0)
                 for row in rows},
                name='Sentiment',
            )
            if not sentiment.empty:
                sample = sentiment
        except (HTTPError, URLError, OSError, ValueError, KeyError,
                AttributeError, TypeError):
            # Network failure or unexpected payload shape: keep the offline sample
            pass

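    # Align to the dates in `index`, carry the last known value forward,
    # and use 0 for dates before the first observation
    return sample.reindex(daily_index).ffill().fillna(0).to_numpy()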


sentiment_data = GOOG.copy()
sentiment_data['Sentiment'] = adanos_daily_sentiment(sentiment_data.index, 'GOOG')
sentiment_data[['Close', 'Sentiment']].tail()


# %% [markdown]
# ## Strategy
#
# Let's create our first strategy to backtest on these Google data, a simple [moving average (MA) cross-over strategy](https://en.wikipedia.org/wiki/Moving_average_crossover).
#
# _Backtesting.py_ doesn't ship its own set of _technical analysis indicators_. Users favoring TA should probably refer to functions from proven indicator libraries, such as
# [TA-Lib](https://github.com/TA-Lib/ta-lib-python) or
# [Tulipy](https://tulipindicators.org),
# but for this example, we can define a simple helper moving average function ourselves:

# %%
def SMA(values, n):
    """
    Return simple moving average of `values`, at
    each step taking into account `n` previous values.
    """
    return pd.Series(values).rolling(n).mean()
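
# %% [markdown]
# To make the warm-up behavior of a moving average concrete, here is a dependency-free sketch of the same calculation. `sma_plain` is a name introduced only for this illustration and is not part of _backtesting.py_. The first `n - 1` outputs are `nan` because fewer than `n` values are available there, which matches the default behavior of pandas `rolling(n).mean()`.

# %%
import math


def sma_plain(values, n):
    """Plain-Python simple moving average; `nan` until `n` values are seen."""
    out = []
    for i in range(len(values)):
        if i + 1 < n:
            # Not enough history yet for a full window
            out.append(math.nan)
        else:
            window = values[i + 1 - n:i + 1]
            out.append(sum(window) / n)
    return out


sma_example = sma_plain([1, 2, 3, 4, 5], 3)
sma_example  # two leading nan values, then 2.0, 3.0, 4.0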


# %% [markdown]
# A new strategy needs to extend the
# [`Strategy`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Strategy)
# class and override its two abstract methods:
# [`init()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Strategy.init) and
# [`next()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Strategy.next).
#
# Method `init()` is invoked before the strategy is run. Within it, one ideally precomputes, in an efficient, vectorized manner, whatever indicators and signals the strategy depends on.
#
# Method `next()` is then iteratively called by the
# [`Backtest`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest)
# instance, once for each data point (data frame row), simulating the incremental availability of each new full candlestick bar.
#
# Note that _backtesting.py_ cannot make decisions or trades _within_ candlesticks: any new orders are executed on the next candle's _open_ (or the current candle's _close_ if
# [`trade_on_close=True`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.__init__)).
# If you find yourself wishing to trade within candlesticks (e.g. daytrading), you instead need to begin with more fine-grained (e.g. hourly) data.
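
# %% [markdown]
# The fill rule described above can be sketched in a few lines. This is an illustrative model only, not the library's implementation; the helper `fill_price` and the toy candles are hypothetical.

# %%
# Toy candles as (open, close) pairs; the values are made up
toy_candles = [(100.0, 102.0), (103.0, 101.0), (100.5, 104.0)]


def fill_price(candles, signal_bar, trade_on_close=False):
    """
    Price at which an order placed while processing bar `signal_bar`
    would be filled under the rule described above.
    Returns None if there is no later bar to fill on.
    """
    if trade_on_close:
        return candles[signal_bar][1]   # current candle's close
    if signal_bar + 1 >= len(candles):
        return None                     # no later candle available
    return candles[signal_bar + 1][0]   # next candle's open


fill_price(toy_candles, 0), fill_price(toy_candles, 0, trade_on_close=True)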

# %%
from backtesting import Strategy
from backtesting.lib import crossover


class SmaCross(Strategy):
    # Define the two MA lags as *class variables*
    # for later optimization
    n1 = 10
    n2 = 20

    def init(self):
        # Precompute the two moving averages
        self.sma1 = self.I(SMA, self.data.Close, self.n1)
        self.sma2 = self.I(SMA, self.data.Close, self.n2)

    def next(self):
        # If sma1 crosses above sma2, close any existing
        # short trades, and buy the asset
        if crossover(self.sma1, self.sma2):
            self.position.close()
            self.buy()

        # Else, if sma1 crosses below sma2, close any existing
        # long trades, and sell the asset
        elif crossover(self.sma2, self.sma1):
            self.position.close()
            self.sell()


# %% [markdown]
# In `init()` as well as in `next()`, the data the strategy is simulated on is available as an instance variable
# [`self.data`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Strategy.data).
#
# In `init()`, we declare and **compute indicators indirectly by wrapping them in
# [`self.I()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Strategy.I)**.
# The wrapper is passed a function (our `SMA` function) along with any arguments to call it with (our _close_ values and the MA lag). Indicators wrapped in this way will be automatically plotted, and their legend strings will be intelligently inferred.
#
# In `next()`, we simply check whether the faster moving average has just crossed the slower one. If it crossed upwards, we close any open short position and go long; if it crossed downwards, we close any open long position and go short. Note that we don't adjust order size, so _Backtesting.py_ assumes the _maximal possible position_. We use the
# [`backtesting.lib.crossover()`](https://kernc.github.io/backtesting.py/doc/backtesting/lib.html#backtesting.lib.crossover)
# function instead of writing more obscure and confusing conditions, such as:

# %% magic_args="echo" language="script"
#
#     def next(self):
#         if (self.sma1[-2] < self.sma2[-2] and
#                 self.sma1[-1] > self.sma2[-1]):
#             self.position.close()
#             self.buy()
#
#         elif (self.sma1[-2] > self.sma2[-2] and    # Ugh!
#               self.sma1[-1] < self.sma2[-1]):
#             self.position.close()
#             self.sell()

# %% [markdown]
# In `init()`, the whole series of points was available, whereas **in `next()`, the length of `self.data` and all declared indicators is adjusted** on each `next()` call so that `array[-1]` (e.g. `self.data.Close[-1]` or `self.sma1[-1]`) always contains the most recent value, `array[-2]` the previous value, etc. (ordinary Python indexing of ascending-sorted 1D arrays).
#
# **Note**: `self.data` and any indicators wrapped with `self.I` (e.g. `self.sma1`) are NumPy arrays for performance reasons. If you prefer pandas Series or DataFrame objects, use `Strategy.data.<column>.s` or `Strategy.data.df` accessors respectively. You could also construct the series manually, e.g. `pd.Series(self.data.Close, index=self.data.index)`.
#
# We might avoid `self.position.close()` calls if we primed the
# [`Backtest`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest)
# instance with `Backtest(..., exclusive_orders=True)`.
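
# %% [markdown]
# The `array[-1]` indexing convention described above can be mimicked without the library. In this sketch, indicator values are precomputed for the whole series, and each simulated step sees only the slice up to the current bar, so `[-1]` is always the latest value. The lists and the helper `crossed_above` are hypothetical and only illustrate the indexing; the library's own `crossover()` should be preferred in real strategies.

# %%
# Made-up "fast" and "slow" indicator values, already fully computed
fast = [1.0, 2.0, 3.0, 2.0, 1.0]
slow = [2.0, 2.5, 2.5, 2.5, 2.5]


def crossed_above(a, b):
    """True if `a` was below `b` on the previous bar and is above it now."""
    return len(a) >= 2 and a[-2] < b[-2] and a[-1] > b[-1]


events = []
for bar in range(1, len(fast) + 1):
    # What a strategy would "see" at this step: data up to the current bar
    fast_seen, slow_seen = fast[:bar], slow[:bar]
    if crossed_above(fast_seen, slow_seen):
        events.append((bar - 1, 'fast crossed above slow'))
    elif crossed_above(slow_seen, fast_seen):
        events.append((bar - 1, 'fast crossed below slow'))

events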

# %% [markdown]
# ## Backtesting
#
# Let's see how our strategy performs on historical Google data. The
# [`Backtest`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest)
# instance is initialized with OHLC data and a strategy _class_ (see API reference for additional options), and we begin with 10,000 units of cash and set the broker's commission to a realistic 0.2%.
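
# %% [markdown]
# As a rough, simplified illustration of what a 0.2% commission means for that starting cash, assume the commission is charged as a fraction of the traded value and that only whole units can be bought. Both are assumptions made for this sketch, the price of 100 is made up, and the library's exact accounting may differ.

# %%
import math

toy_cash = 10_000
toy_commission = 0.002      # 0.2% of traded value (assumed fee model)
toy_price = 100.0           # hypothetical entry price

# Under this model each unit effectively costs price * (1 + commission)
units = math.floor(toy_cash / (toy_price * (1 + toy_commission)))
trade_value = units * toy_price
fee = trade_value * toy_commission
cash_left = toy_cash - trade_value - fee

units, trade_value, round(fee, 2), round(cash_left, 2)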

# %%
from backtesting import Backtest

bt = Backtest(sentiment_data, SmaCross, cash=10_000, commission=.002)
stats = bt.run()
stats

# %% [markdown]
# [`Backtest.run()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.run)
# method returns a pandas Series of simulation results and statistics associated with our strategy. We see that this simple strategy makes almost 600% return over a period of 9 years, with a maximum drawdown of 33% and the longest drawdown period spanning almost two years ...
#
# [`Backtest.plot()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.plot)
# method provides the same insights in a more visual form.

# %%
bt.plot()
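
# %% [markdown]
# For readers unfamiliar with the drawdown figures reported above, this is a minimal sketch of how a maximum drawdown can be computed from an equity curve. The equity values and the helper `max_drawdown` are made up for illustration and are independent of the library's own statistics code.

# %%
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                     # highest equity seen so far
        worst = max(worst, (peak - value) / peak)   # decline from that peak
    return worst


toy_equity = [10_000, 12_000, 9_000, 11_000, 15_000, 13_500]
max_drawdown(toy_equity)  # the drop from 12,000 to 9,000 dominates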

# %% [markdown]
# ## Optimization
#
# We hard-coded the two lag parameters (`n1` and `n2`) into our strategy above. However, the strategy may work better with 15–30 or some other cross-over. **We declared the parameters as optimizable by making them [class variables](https://docs.python.org/3/tutorial/classes.html#class-and-instance-variables)**.
#
# We optimize the two parameters by calling
# [`Backtest.optimize()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.optimize)
# method with each parameter as a keyword argument pointing to its pool of candidate values. Parameter `n1` is tested for values from 5 up to (but not including) 30 in steps of 5, and parameter `n2` for values from 10 up to (but not including) 70 in steps of 5. Some combinations of the two parameters are invalid, i.e. `n1` should not be _larger than_ or equal to `n2`. We limit admissible parameter combinations with an _ad hoc_ constraint function, which takes in the parameters and returns `True` (i.e. admissible) whenever `n1` is less than `n2`. Additionally, we search for the parameter combination that maximizes final equity, and hence return, over the observed period. We could instead choose to optimize any other key from the returned `stats` series.
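
# %% [markdown]
# To get a feel for the size of this search, the parameter grid and constraint described above can be enumerated with the standard library alone. This sketch only counts candidate combinations; it does not reproduce how `Backtest.optimize()` evaluates them, and the variable names are introduced here for illustration.

# %%
from itertools import product

n1_values = range(5, 30, 5)    # 5, 10, 15, 20, 25 (30 is excluded)
n2_values = range(10, 70, 5)   # 10, 15, ..., 65 (70 is excluded)

all_combos = list(product(n1_values, n2_values))
# Keep only combinations where the fast lag is shorter than the slow lag
admissible = [(n1, n2) for n1, n2 in all_combos if n1 < n2]

len(all_combos), len(admissible)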

# %%
# %%time

stats = bt.optimize(n1=range(5, 30, 5),
                    n2=range(10, 70, 5),
                    maximize='Equity Final [$]',
                    constraint=lambda param: param.n1 < param.n2)
stats

# %% [markdown]
# We can look into `stats['_strategy']` to access the Strategy _instance_ and its optimal parameter values (10 and 15).

# %%
stats._strategy

# %%
bt.plot(plot_volume=False, plot_pl=False)

# %% [markdown]
# Optimization improved the strategy's initial performance _on in-sample data_ by almost 50% and even beat simple
# [buy & hold](https://en.wikipedia.org/wiki/Buy_and_hold).
# In real-life optimization, however, do **take steps to avoid
# [overfitting](https://en.wikipedia.org/wiki/Overfitting)**.
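
# %% [markdown]
# One common such step is to tune parameters on an earlier stretch of the data and then check them on a later stretch that the optimizer never saw. The sketch below shows only the chronological split itself, using a plain list in place of a data frame; `holdout_split` and the toy rows are hypothetical names for this illustration. With a pandas data frame, the same split can be written with `.iloc` slicing.

# %%
def holdout_split(rows, n_holdout):
    """Split time-ordered `rows` into (in_sample, out_of_sample), keeping order."""
    if not 0 < n_holdout < len(rows):
        raise ValueError('n_holdout must be between 1 and len(rows) - 1')
    return rows[:-n_holdout], rows[-n_holdout:]


toy_rows = ['2013-01', '2013-02', '2013-03', '2013-04', '2013-05']
in_sample, out_of_sample = holdout_split(toy_rows, n_holdout=2)
in_sample, out_of_sample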

# %% [markdown]
# ## Trade data
#
# In addition to backtest statistics returned by
# [`Backtest.run()`](https://kernc.github.io/backtesting.py/doc/backtesting/backtesting.html#backtesting.backtesting.Backtest.run)
# shown above, you can look into _individual trade returns_ and the changing _equity curve_ and _drawdown_ by inspecting the last few internal keys in the result series.

# %%
stats.tail()

# %% [markdown]
# The columns should be self-explanatory.

# %%
stats['_equity_curve']  # Contains equity/drawdown curves. DrawdownDuration is only defined at ends of DD periods.

# %%
stats['_trades']  # Contains individual trade data

# %% [markdown]
# Learn more by exploring further
# [examples](https://kernc.github.io/backtesting.py/doc/backtesting/index.html#tutorials)
# or find more framework options in the
# [full API reference](https://kernc.github.io/backtesting.py/doc/backtesting/index.html#header-submodules).