Using open source software for portfolio analysis

From Bogleheads

Using open source software for portfolio analysis is a compilation of open source software used to analyze portfolios. Topics covered include regression analysis, Monte Carlo simulation, and other statistical methods.

GnuCash

GnuCash is free and open-source personal and business accounting software that uses professional accounting principles (double-entry accounting) yet is simple to use.

R

R is a language and environment for statistical computing and graphics.

RStudio is the free and open source integrated development environment (IDE) for R.

R has many packages available at CRAN Task Views. There is a page for finance that lists the available packages.


Asset correlation

Create a correlation matrix for four funds. The script displays the matrix of pairwise correlations along with a set of plots.

The source code is available on finiki: Asset allocation
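
For readers who want to see the calculation itself, here is a minimal sketch in Python of a correlation matrix for four funds. It is not the finiki R script: the fund names and monthly returns are made up for illustration, and only the standard library is used (NumPy or pandas would be shorter).

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length return series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(returns):
    """Return a nested dict: matrix[a][b] is the correlation of funds a and b."""
    names = list(returns)
    return {a: {b: pearson(returns[a], returns[b]) for b in names} for a in names}

# Hypothetical monthly returns for four made-up funds (illustration only).
returns = {
    "FUND_A": [0.012, -0.020, 0.031, 0.008, -0.011, 0.025],
    "FUND_B": [0.010, -0.018, 0.028, 0.011, -0.009, 0.021],
    "FUND_C": [0.003, 0.004, -0.002, 0.005, 0.006, 0.001],
    "FUND_D": [-0.005, 0.015, -0.012, 0.002, 0.009, -0.010],
}

matrix = correlation_matrix(returns)
for name, row in matrix.items():
    print(name, " ".join(f"{row[other]:6.2f}" for other in returns))
```

Each fund correlates perfectly with itself, so the diagonal is 1, and the matrix is symmetric.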

Bond market simulation

Bogleheads forum member linuxizer has developed a preliminary bond market simulator in R.

Downloading fund data

Bogleheads forum member camontgo has written a script to import fund data from Yahoo! Finance.

See: Downloading a Batch of Returns Using Yahoo! Finance, from The Calculating Investor

Multifactor regression

R is the preferred tool for performing multifactor regression analysis; a number of scripts are available in the referenced article under "R".

Risk parity strategy using 3x leveraged ETFs

This script is a companion to Bogleheads forum topic: "HEDGEFUNDIE's excellent adventure [risk parity strategy using 3x leveraged ETFs]".

Steps to make it work:[1]

  1. Install R (and RStudio).
  2. Download all the files from this link to the same folder. Don't change any file names.
  3. Open the script, make sure your working directory is the folder with all the files, and install the required packages listed at the top of the script.

You should be able to reproduce the chart in this forum post and the table in the 2nd post down from there.

Statistical distribution moments, value at risk

Plot the 2nd (variance), 3rd (skewness), and 4th (kurtosis) statistical distribution moments of a fund. Examples of value at risk (VaR) are also shown. The plots can be seen in the following Financial Wisdom Forum posts: Re: Risk = ?? and Re: Risk = ??.

The source code is available on finiki: Statistical distribution moments, value at risk
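
As a rough illustration of what these statistics measure, the following Python sketch computes the moments and a simple historical VaR. It is not the finiki script. It assumes population moments (dividing by n), reports plain kurtosis (a normal distribution gives 3), and reads VaR directly from the sorted returns; the sample returns are made up.

```python
def moments(returns):
    """Return (mean, variance, skewness, kurtosis) using population moments."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # 2nd central moment
    m3 = sum((r - mean) ** 3 for r in returns) / n  # 3rd central moment
    m4 = sum((r - mean) ** 4 for r in returns) / n  # 4th central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # not excess kurtosis; normal distribution gives 3
    return mean, m2, skewness, kurtosis

def historical_var(returns, tail=0.05):
    """Loss at the `tail` quantile of the observed returns, as a positive number."""
    ordered = sorted(returns)
    index = int(tail * len(ordered))
    return -ordered[index]

# Hypothetical monthly returns (illustration only).
sample = [0.021, -0.034, 0.012, 0.045, -0.060, 0.008, 0.015, -0.012,
          0.030, -0.025, 0.018, 0.002, -0.041, 0.027, 0.009, -0.003,
          0.036, -0.018, 0.011, 0.004]

mean, variance, skewness, kurtosis = moments(sample)
print(f"mean={mean:.4f} variance={variance:.6f} "
      f"skewness={skewness:.3f} kurtosis={kurtosis:.3f}")
print(f"historical 95% VaR: {historical_var(sample):.3f}")
```

Negative skewness means the left tail (losses) is longer than the right, and kurtosis above 3 means the tails are fatter than those of a normal distribution.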

MATLAB clones

MATLAB® (MATrix LABoratory) is a high-level language and interactive environment for numerical computation, visualization, and programming. It is neither open source nor free. However, it's an industry standard and open source clones have been developed which strive to emulate MATLAB functionality.[note 1]

Below are examples which run in Octave or MATLAB. Bogleheads forum member camontgo developed and maintains this code at The Calculating Investor.

Efficient frontier (mean-variance optimization)

This is a 3-part series which calculates the efficient frontier for a set of securities.[note 2] These efficient frontier calculations are not very practical because the results are too sensitive to small changes in the inputs. They are educational, however, since a lot of financial theory builds on some of the ideas underlying Markowitz.[2]
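
To make the idea concrete, here is a minimal Python sketch of the two-asset case. It is not the Octave/MATLAB code from The Calculating Investor; the expected returns, standard deviations, and correlation are made-up inputs, and the closed-form minimum-variance weight applies to two assets only.

```python
import math

def portfolio_stats(w, mu_a, mu_b, sd_a, sd_b, rho):
    """Expected return and standard deviation with weight w in asset A, 1 - w in B."""
    mean = w * mu_a + (1 - w) * mu_b
    variance = ((w * sd_a) ** 2 + ((1 - w) * sd_b) ** 2
                + 2 * w * (1 - w) * rho * sd_a * sd_b)
    return mean, math.sqrt(variance)

def min_variance_weight(sd_a, sd_b, rho):
    """Weight in asset A that minimizes portfolio variance (may fall outside 0..1)."""
    cov = rho * sd_a * sd_b
    return (sd_b ** 2 - cov) / (sd_a ** 2 + sd_b ** 2 - 2 * cov)

# Hypothetical inputs: A is stock-like, B is bond-like (illustration only).
MU_A, MU_B = 0.08, 0.03
SD_A, SD_B = 0.18, 0.05
RHO = 0.2

# Trace the risk/return curve by stepping the weight in asset A from 0% to 100%.
for step in range(0, 11):
    w = step / 10
    mean, sd = portfolio_stats(w, MU_A, MU_B, SD_A, SD_B, RHO)
    print(f"weight A={w:.1f}  return={mean:.4f}  stdev={sd:.4f}")

w_star = min_variance_weight(SD_A, SD_B, RHO)
print(f"minimum-variance weight in A: {w_star:.4f}")
```

Changing `RHO` or the expected returns slightly and rerunning shows how much the curve moves, which is the input sensitivity mentioned above.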

Rebalancing bonus

This is a replication of an analysis by William Bernstein which performs a Monte Carlo analysis of two portfolios to determine the rebalancing bonus. This simulation adds some math which allows simulation of rebalancing with correlated assets. Bernstein's original analysis assumed no correlation.[2]
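
The simulation of correlated assets can be sketched in Python as follows. This is not camontgo's code. It assumes normally distributed annual returns, builds the correlation from two independent normal draws, and takes the rebalancing bonus to be the rebalanced portfolio's annualized return minus the weighted average of the two assets' annualized returns. All inputs are made up.

```python
import math
import random

def correlated_returns(rng, mu, sigma, rho, years):
    """Draw yearly returns for two assets whose returns have correlation rho."""
    pairs = []
    for _ in range(years):
        z1 = rng.gauss(0, 1)
        # Mix z1 with an independent draw so that corr(z1, z2) equals rho.
        z2 = rho * z1 + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        # Clip at -99% so a normal draw can never wipe out more than everything.
        pairs.append((max(-0.99, mu[0] + sigma[0] * z1),
                      max(-0.99, mu[1] + sigma[1] * z2)))
    return pairs

def annualized(growth, years):
    """Convert a total growth factor into an annualized return."""
    return growth ** (1 / years) - 1

def rebalancing_bonus(pairs, weight=0.5):
    """Annualized return of the yearly rebalanced mix minus the weighted asset returns."""
    years = len(pairs)
    rebalanced = growth_a = growth_b = 1.0
    for ra, rb in pairs:
        rebalanced *= 1 + weight * ra + (1 - weight) * rb
        growth_a *= 1 + ra
        growth_b *= 1 + rb
    blended = (weight * annualized(growth_a, years)
               + (1 - weight) * annualized(growth_b, years))
    return annualized(rebalanced, years) - blended

# Hypothetical inputs (illustration only).
rng = random.Random(42)
for rho in (-0.5, 0.0, 0.5):
    trials = [rebalancing_bonus(correlated_returns(rng, (0.08, 0.08), (0.20, 0.20), rho, 20))
              for _ in range(1000)]
    print(f"rho={rho:+.1f}  average bonus={sum(trials) / len(trials):.4f}")
```

Running it with several values of `rho` shows how the average bonus changes with the correlation between the two assets.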

Market timing

In 1975, Nobel laureate William Sharpe published a study titled “Likely Gains from Market Timing”. In this paper, Sharpe reportedly found that a market timer who switches between 100% stocks and 100% T-bills on an annual basis must be correct about 74% of the time (on average) to beat the market.[3]

This Monte Carlo simulation uses a simple market timing strategy to determine the market timing accuracy required to outperform buy-and-hold. A comparison of market timing to buy-and-hold in terms of both total returns and risk-adjusted returns (measured by the Sharpe Ratio) is performed. The analysis shows that a surprisingly high (and unlikely) degree of accuracy is necessary to beat the market return through market timing.[2]

The factor data is obtained from the Kenneth R. French - Data Library, under Fama/French Factors (direct link). Be sure to remove the file headers and the extra data sets (starting around row 1046). Save the file as F-F_Factors_annual.txt.

This simulation takes a long time to run. To start, reduce the number of iterations from 10,000 to a lower number (such as 100) and ensure that the analysis is functioning correctly.
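
The core of such a simulation can be sketched in a few lines of Python. This is not the code from The Calculating Investor: it assumes a timer who picks the better of stocks and T-bills each year with a fixed probability, and it uses randomly generated stock returns with a flat T-bill rate instead of the Fama/French data file.

```python
import random

def timed_growth(stock, bills, accuracy, rng):
    """Growth of 1 unit for a timer who picks the better asset with probability `accuracy`."""
    wealth = 1.0
    for s, b in zip(stock, bills):
        best, worst = max(s, b), min(s, b)
        wealth *= 1 + (best if rng.random() < accuracy else worst)
    return wealth

def buy_and_hold_growth(stock):
    """Growth of 1 unit held in stocks for the whole period."""
    wealth = 1.0
    for s in stock:
        wealth *= 1 + s
    return wealth

def win_rate(stock, bills, accuracy, trials, seed=0):
    """Fraction of simulated timers who end with more wealth than buy-and-hold."""
    rng = random.Random(seed)
    target = buy_and_hold_growth(stock)
    wins = sum(timed_growth(stock, bills, accuracy, rng) > target
               for _ in range(trials))
    return wins / trials

# Hypothetical 40-year history (illustration only, not historical data).
data_rng = random.Random(1)
stock_returns = [data_rng.gauss(0.10, 0.18) for _ in range(40)]
bill_returns = [0.03] * 40

# Start with a small number of trials, as suggested above, then increase it.
for accuracy in (0.5, 0.6, 0.7, 0.8, 0.9):
    rate = win_rate(stock_returns, bill_returns, accuracy, trials=100)
    print(f"accuracy={accuracy:.1f}  beats buy-and-hold in {rate:.0%} of trials")
```

This sketch compares total returns only; a risk-adjusted comparison would also need the volatility of each simulated path.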

Python

Python is a rapid-development scripting language that is suitable for many tasks. Add-in libraries such as NumPy and pandas make it easy to do financial analysis. Many IDEs are available.

The asset correlation analysis described above is available in Python. See finiki for the source code.

Forum member AlohaJoe has written several scripts:[4]

Additional Python code and techniques, such as extracting financial data from Yahoo! Finance, can be found in this Bogleheads forum topic: "Anyone use Python to analyze their finances?".

I savings bonds

Forum member AnonJohn, with modifications by several members, has developed a script which creates a spreadsheet containing helpful I-Bond information, such as keeping track of penalty amounts and interest rates.

The script inputs a spreadsheet containing the bonds' issue data and price. It scrapes rates directly from the Treasury website, so (until that changes format) there is no need for manual updating. This is more convenient than a complex spreadsheet and more complete than the Treasury tool.

Source code and support are available in Bogleheads forum topic: "Yet another I-bond calculator (python)"

Lakshmi

Lakshmi is an open source Python library, inspired by the Bogleheads investment philosophy, with a command-line interface (CLI).[note 3] The library, along with comprehensive documentation, is available at PyPI.

Features include:[5]

  • Specify and track asset allocation across accounts.
  • Ability to add/edit/delete accounts and assets (funds, stocks, ETFs, etc.) inside those accounts. The market value of these assets is automatically updated.
  • Support for running what-if scenarios to see how they affect the overall asset allocation.
  • Suggests which funds to allocate new money to (or withdraw money from) to keep the actual asset allocation close to the desired asset allocation.
  • Suggests how to rebalance the funds in a given account to bring the actual asset allocation close to the desired asset allocation.
  • Ability to track portfolio performance IRR (Internal Rate of Return) and cash flows.
  • Supports manual assets, assets with ticker, Vanguard funds (that don't have associated ticker symbols), EE Bonds and I Bonds.
  • Listing current values of assets, asset allocation and asset location.
  • Tracking of tax-lot information for assets.
  • Analysis of portfolio to identify if there is need to rebalance or if there are losses that can be tax loss harvested.

Source is available on GitHub.
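
To illustrate what the IRR feature measures, here is a small standalone Python sketch. It is not Lakshmi's implementation and does not use the Lakshmi API. The functions `npv` and `irr` are hypothetical helpers that find, by bisection, the rate at which the discounted cash flows sum to zero; the cash flows are made up.

```python
def npv(rate, cash_flows):
    """Net present value of (years_from_start, amount) pairs at the given annual rate."""
    return sum(amount / (1 + rate) ** years for years, amount in cash_flows)

def irr(cash_flows, low=-0.99, high=10.0, tolerance=1e-9):
    """Find the rate where NPV is zero by bisection between `low` and `high`."""
    npv_low = npv(low, cash_flows)
    if npv_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign between low and high")
    while high - low > tolerance:
        mid = (low + high) / 2
        npv_mid = npv(mid, cash_flows)
        if npv_low * npv_mid <= 0:
            high = mid  # the root lies in the lower half
        else:
            low, npv_low = mid, npv_mid  # the root lies in the upper half
    return (low + high) / 2

# Hypothetical cash flows: contributions are negative, the ending value is positive.
flows = [
    (0.0, -10000.0),  # initial investment
    (1.0, -2000.0),   # contribution after one year
    (2.0, 13500.0),   # portfolio value after two years
]
print(f"IRR: {irr(flows):.4%}")
```

Unlike a simple return calculation, the IRR accounts for when each contribution or withdrawal was made.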

Okama

Okama is an open source Python package with portfolio analysis and optimization tools. Unlike many other projects, okama comes with free historical data for many markets (NYSE, NASDAQ, LSE, European stock exchanges, etc.).[6]

Portfolios can include securities with different currencies. All portfolio properties are adjusted to the base currency.

The main features of the project:

  • Investment portfolio constrained Markowitz Mean-Variance Analysis (MVA) and optimization
  • Rebalanced portfolio optimization with constraints (multi-period Efficient Frontier)
  • Monte Carlo Simulations for financial assets and investment portfolios
  • Popular risk metrics: VaR, CVaR, semi-deviation, variance and drawdowns
  • Forecasting models according to normal and lognormal distribution
  • Testing distribution on historical data
  • Dividend yield and other dividend indicators for stocks
  • Backtesting and comparing historical performance of broad range of assets and indexes in multiple currencies
  • Methods to track the performance of index funds (ETF) and compare them with benchmarks
  • Main macroeconomic indicators: inflation, central bank rates
  • Matplotlib visualization scripts for the Efficient Frontier, Transition map and assets risk / return performance
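
Several of the risk metrics in the list above are simple to compute by hand. The Python sketch below is independent of okama and does not use its API. It assumes one common convention for each metric (semi-deviation measured below the mean and divided by the full sample size, CVaR as the average of the worst 5% of returns), so okama's own definitions may differ. The returns are made up.

```python
import math

def max_drawdown(returns):
    """Largest peak-to-trough decline of cumulative wealth (a negative number or zero)."""
    wealth = peak = 1.0
    worst = 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1)
    return worst

def semideviation(returns):
    """Root mean square of below-mean deviations, divided by the full sample size."""
    mean = sum(returns) / len(returns)
    downside = [min(0.0, r - mean) ** 2 for r in returns]
    return math.sqrt(sum(downside) / len(returns))

def cvar(returns, tail=0.05):
    """Average loss over the worst `tail` fraction of returns, as a positive number."""
    ordered = sorted(returns)
    count = max(1, int(tail * len(ordered)))
    return -sum(ordered[:count]) / count

# Hypothetical monthly returns (illustration only).
sample = [0.021, -0.034, 0.012, 0.045, -0.060, 0.008, 0.015, -0.012,
          0.030, -0.025, 0.018, 0.002, -0.041, 0.027, 0.009, -0.003,
          0.036, -0.018, 0.011, 0.004]

print(f"max drawdown:   {max_drawdown(sample):.4f}")
print(f"semi-deviation: {semideviation(sample):.4f}")
print(f"CVaR (5%):      {cvar(sample):.4f}")
```

CVaR is never smaller than VaR at the same level, because it averages the losses beyond the VaR threshold.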

TSP share prices for Quicken

Forum member Simbilis has written a Python script which will scrape the Thrift Savings Plan website to extract a CSV (Comma Separated Value) file which can be imported into Quicken.

On-going support and discussion can be found in this forum thread: TSP share prices for Quicken

#!/usr/bin/env python3
# Download new TSP share prices and append them to a CSV file for Quicken.
# Ported to Python 3; the URL and form fields are unchanged from the original script.

import csv
import sys
from datetime import date, datetime, timedelta
from urllib.parse import urlencode
from urllib.request import urlopen

# Map the fund names used in the TSP CSV header to Quicken security symbols.
fundTag = {
    'L Income' : 'TSPLINCOME',
    'L 2020' : 'TSPL2020',
    'L 2030' : 'TSPL2030',
    'L 2040' : 'TSPL2040',
    'L 2050' : 'TSPL2050',
    'G Fund' : 'TSPGFUND',
    'F Fund' : 'TSPFFUND',
    'C Fund' : 'TSPCFUND',
    'S Fund' : 'TSPSFUND',
    'I Fund' : 'TSPIFUND'}

priceHistoryFile = 'tspQuicken.csv'

# Read the date of the last saved price (third column of the last row).
# Fall back to a default start date if the file is missing or empty.
try:
    with open(priceHistoryFile, 'r', newline='') as f:
        lastDate = list(csv.reader(f))[-1][2]
except (OSError, IndexError):
    lastDate = '06/01/2003'

startDate = (datetime.strptime(lastDate, '%m/%d/%Y') + timedelta(1)).strftime('%m/%d/%Y')
endDate = date.today().strftime('%m/%d/%Y')
if lastDate == endDate:
    print('already have prices through', endDate)
    sys.exit()

print('checking for new prices starting on', startDate)
tspSharePricePageUrl = 'https://www.tsp.gov/investmentfunds/shareprice/sharePriceHistory.shtml'
# urlopen sends a POST request when data is supplied; the data must be bytes.
postData = urlencode({'startdate' : startDate, 'enddate' : endDate, 'whichButton' : 'CSV'}).encode('ascii')
with urlopen(tspSharePricePageUrl, postData) as page:
    lines = page.read().decode('utf-8').splitlines()

# Drop empty rows; the first remaining row holds the fund names.
rows = [row for row in csv.reader(lines) if len(row) > 0]
if not rows:
    sys.exit('no price data returned')
tagRow = rows[0]

with open(priceHistoryFile, 'a', newline='') as f:
    writer = csv.writer(f)
    # Process the data rows in reverse order, skipping the header row.
    for row in rows[:0:-1]:
        currDate = datetime.strptime(row[0], '%Y-%m-%d').strftime('%m/%d/%Y')
        newRows = []
        for i in range(1, len(row)):
            tag = tagRow[i].lstrip()
            if tag in fundTag:
                try:
                    price = float(row[i])
                except ValueError:
                    continue  # skip blank or non-numeric prices
                newRows.append([fundTag[tag], price, currDate])
                print('found', [fundTag[tag], price, currDate])

        writer.writerows(newRows)

LibreOffice

LibreOffice is a Microsoft Office replacement. Spreadsheets are probably the most used financial analysis software.

Notes

  1. See Matlab Clones for a detailed overview.
  2. The example covariance matrix could be replaced with a real covariance matrix. For example, the R script for downloading Yahoo! Finance returns could be used to get returns for a set of assets, and the covariance matrix could be calculated and used as the input to these scripts, though the scaling would probably need to be changed.
  3. Lakshmi (meaning "She who leads to one's goal") is one of the principal goddesses in Hinduism. She is the goddess of wealth, fortune, power, health, love, beauty, joy and prosperity. Source: "Lakshmi project description". July 4, 2023. Retrieved September 12, 2023.

See also

References

  1. Bogleheads forum post: "Re: HEDGEFUNDIE's excellent adventure [risk parity strategy using 3x leveraged ETFs]", NotTooDeepLearning. May 25, 2019
  2. camontgo PM to LadyGeek.
  3. Market Timing: How good is good enough?, from The Calculating Investor
  4. Bogleheads forum post: "Re: Anyone use Python to analyze their finances?"
  5. "Lakshmi project description". July 4, 2023. Retrieved September 12, 2023.
  6. Bogleheads forum post: "Re: [Wiki - Using open-source software for portfolio analysis]", chilango74. September 23, 2021

External links