Using open source software for portfolio analysis
Latest revision as of 11:16, 18 February 2021
This article is coordinated with our sister Canadian wiki, finiki, the Canadian financial Wiki. See: Using open source software for portfolio analysis (finiki)
Ongoing collaboration is in this Bogleheads' forum thread: [Wiki] - Using open-source software for portfolio analysis.
Using open source software for portfolio analysis is a compilation of open source software used to analyze portfolios. Topics covered include regression analysis, Monte Carlo simulation, and other statistical methods.
GnuCash
GnuCash is free and open source personal and business accounting software that uses professional accounting principles (double-entry accounting) yet is simple to use.
R
R is a language and environment for statistical computing and graphics.
RStudio is a free and open source integrated development environment (IDE) for R.
R has many packages available at CRAN Task Views. There is a page for finance that lists the available packages.
In the examples below, view the source code by clicking on the show/hide link of the title bar.
Asset correlation
Create a correlation matrix among 4 different funds. A matrix containing the correlations among the 4 funds, along with a set of plots is displayed.
The source code is available on finiki: Asset allocation
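As a rough illustration of the calculation (not the finiki source code), the following Python sketch builds a correlation matrix for four funds using only the standard library. The fund names and return figures are hypothetical placeholders, and the helper names `pearson` and `correlation_matrix` are invented for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length return series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(returns):
    """Return (names, matrix) for a dict mapping fund name -> list of returns."""
    names = sorted(returns)
    matrix = [[pearson(returns[a], returns[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical monthly returns for four made-up funds (not real data).
sample_returns = {
    "FundA": [0.010, -0.020, 0.030, 0.015, -0.005, 0.020],
    "FundB": [0.012, -0.018, 0.025, 0.010, -0.002, 0.018],
    "FundC": [-0.004, 0.006, -0.003, 0.002, 0.005, -0.001],
    "FundD": [0.020, -0.035, 0.040, 0.030, -0.015, 0.025],
}

names, matrix = correlation_matrix(sample_returns)
for name, row in zip(names, matrix):
    print(name, " ".join(f"{value:6.2f}" for value in row))
```

With real data, each list would hold the periodic returns of one fund over the same dates.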
Bond market simulation
Bogleheads forum member linuxizer has developed a preliminary bond market simulator in R.
 maRketSim - Bond market simulation in R, in the Bogleheads forum.
 Package info: CRAN - Package maRketSim, from CRAN, The Comprehensive R Archive Network. The reference manual can be found on this page.
 Working examples: maRketSim: maRketSim market simulator for R, from rdrr.io, the R documentation repository.
Downloading fund data
Bogleheads forum member camontgo has written a script to import fund data from Yahoo! Finance.
See: Downloading a Batch of Returns Using Yahoo! Finance, from The Calculating Investor
Multifactor regression
R is the preferred tool for performing multifactor regression analysis; a number of scripts are available in the referenced article under "R".
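For readers who want to see what such a regression computes, here is a minimal ordinary least squares sketch. It is written in Python rather than R to match the other code on this page, and it is not one of the scripts from the referenced article. The function names and the sample factor values are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Move the row with the largest pivot into place for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def factor_regression(excess_returns, factors):
    """OLS fit of excess_returns[t] = alpha + sum(beta[k] * factors[t][k])."""
    design = [[1.0] + list(row) for row in factors]  # leading 1.0 is the intercept
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, excess_returns)) for i in range(k)]
    coefficients = solve_linear_system(xtx, xty)
    return coefficients[0], coefficients[1:]


# Hypothetical factor returns (two factors per period) and fund excess returns.
factors = [(0.02, 0.01), (-0.01, 0.03), (0.03, -0.02),
           (0.00, 0.01), (-0.02, -0.01), (0.015, 0.005)]
fund = [0.001 + 1.0 * f1 + 0.5 * f2 for f1, f2 in factors]

alpha, betas = factor_regression(fund, factors)
print("alpha:", round(alpha, 6), "betas:", [round(b, 6) for b in betas])
```

The betas are the estimated factor loadings and alpha is the intercept. A real analysis would also report standard errors and R-squared, which this sketch omits.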
Risk parity strategy using 3x leveraged ETFs
This script is intended for investing enthusiasts who intentionally deviate from a total market approach. The ability, willingness, and need to take this additional risk is understood. Please ask in the forum for advice. 
This script is a companion to Bogleheads® forum topic: HEDGEFUNDIE's excellent adventure [risk parity strategy using 3x leveraged ETFs].
Steps to make it work:^{[1]}
 Install R (and RStudio).
 Download all the files from this link to the same folder; don't change any file names.
 Open the script, make sure your working directory is the folder with all the files, and install the required packages listed at the top of the script.
You should be able to reproduce the chart in this forum post and the table in the 2nd post down from there.
Statistical distribution moments, value at risk
Plot the 2nd (variance), 3rd (skewness), and 4th (kurtosis) statistical distribution moments of a fund. Additionally, examples of Value At Risk (VaR) are shown. The plots can be seen in the following forum post: Re: Risk = ?? and Re: Risk = ?? in the Financial Wisdom Forum
The source code is available on finiki: Statistical distribution moments, value at risk
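The quantities involved can be sketched in a few lines of Python. This is an illustration only, not the finiki source code. It uses population formulas for the moments and one simple convention for historical Value at Risk (the loss at the empirical lower quantile). The function names and the sample returns are hypothetical.

```python
def moments(returns):
    """Return (variance, skewness, kurtosis) using population formulas."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # a normal distribution has kurtosis 3
    return m2, skewness, kurtosis


def historical_var(returns, confidence=0.95):
    """Historical VaR: the loss at the (1 - confidence) empirical quantile."""
    ordered = sorted(returns)
    index = min(len(ordered) - 1, int(round((1.0 - confidence) * len(ordered))))
    return -ordered[index]


# Hypothetical periodic returns (not real fund data).
sample = [0.021, -0.034, 0.012, 0.005, -0.051, 0.030, 0.008, -0.012,
          0.017, -0.002, 0.026, -0.019, 0.011, 0.004, -0.027, 0.015,
          0.009, -0.008, 0.022, -0.041]

variance, skewness, kurtosis = moments(sample)
print("variance:", variance)
print("skewness:", skewness)
print("kurtosis:", kurtosis)
print("95% historical VaR:", historical_var(sample))
```

Other VaR conventions (parametric, or interpolated quantiles) give somewhat different numbers for the same data.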
MATLAB clones
MATLAB® (MATrix LABoratory) is a high-level language and interactive environment for numerical computation, visualization, and programming. It is neither open source nor free. However, it is an industry standard, and open source clones have been developed that strive to emulate MATLAB functionality.^{[note 1]}
Below are examples which run in Octave or MATLAB. Bogleheads forum member camontgo developed and maintains this code at The Calculating Investor.
Efficient frontier (mean-variance optimization)
This is a 3-part series that calculates the efficient frontier for a set of securities.^{[note 2]} These efficient frontier calculations are not very practical because the results are too sensitive to small changes in the inputs, but they are educational, since a lot of financial theory builds on some of the ideas underlying Markowitz.^{[2]}
 Calculating the Efficient Frontier: Part 1, Part 2, Part 3, from The Calculating Investor
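The idea can be shown with the simplest case, two assets. The sketch below is in Python, not the Octave/MATLAB code from the series. It traces the risk and return of each stock/bond mix and finds the minimum-variance weight. All input figures and function names are hypothetical.

```python
import math


def portfolio_stats(weight_a, mean_a, mean_b, sd_a, sd_b, corr):
    """Expected return and standard deviation of a two-asset portfolio."""
    weight_b = 1.0 - weight_a
    expected = weight_a * mean_a + weight_b * mean_b
    variance = ((weight_a * sd_a) ** 2 + (weight_b * sd_b) ** 2
                + 2.0 * weight_a * weight_b * sd_a * sd_b * corr)
    return expected, math.sqrt(variance)


def min_variance_weight(sd_a, sd_b, corr):
    """Weight in asset A that minimizes portfolio variance."""
    cov = sd_a * sd_b * corr
    return (sd_b ** 2 - cov) / (sd_a ** 2 + sd_b ** 2 - 2.0 * cov)


# Hypothetical inputs: asset A is a stock fund, asset B is a bond fund.
MEAN_A, MEAN_B = 0.08, 0.03
SD_A, SD_B = 0.18, 0.06
CORR = 0.1

for step in range(0, 11, 2):
    w = step / 10
    expected, sd = portfolio_stats(w, MEAN_A, MEAN_B, SD_A, SD_B, CORR)
    print(f"stock weight {w:.1f}: return {expected:.4f}, risk {sd:.4f}")

w_min = min_variance_weight(SD_A, SD_B, CORR)
print("minimum-variance stock weight:", round(w_min, 4))
```

Mixes with more of asset A than the minimum-variance weight form the efficient part of the curve in this two-asset case. Re-running with slightly different means or correlations shows how much the result depends on the inputs.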
Rebalancing bonus
This is a replication of an analysis by William Bernstein which performs a Monte Carlo analysis of two portfolios to determine the rebalancing bonus. This simulation adds some math which allows simulation of rebalancing with correlated assets. Bernstein's original analysis assumed no correlation.^{[2]}
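A minimal Monte Carlo sketch of this kind of analysis follows. It is written in Python, not the Octave/MATLAB code from The Calculating Investor. It assumes normally distributed annual returns and builds the correlation by mixing two independent normal draws. The bonus is measured here, as one possible definition, as the annualized return of an annually rebalanced portfolio minus that of a never-rebalanced portfolio with the same starting weights. All parameter values are hypothetical.

```python
import math
import random


def simulate_rebalancing_bonus(mean_a, mean_b, sd_a, sd_b, corr,
                               years=30, trials=2000, weight_a=0.5, seed=1):
    """Average annualized return gap: rebalanced minus never-rebalanced."""
    rng = random.Random(seed)
    total_bonus = 0.0
    for _ in range(trials):
        rebalanced = 1.0
        hold_a, hold_b = weight_a, 1.0 - weight_a
        for _ in range(years):
            g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            # Mix two independent draws so z_a and z_b have correlation corr.
            z_a = g1
            z_b = corr * g1 + math.sqrt(1.0 - corr ** 2) * g2
            # Floor returns at -99% so wealth stays positive (sketch simplification).
            r_a = max(mean_a + sd_a * z_a, -0.99)
            r_b = max(mean_b + sd_b * z_b, -0.99)
            rebalanced *= 1.0 + weight_a * r_a + (1.0 - weight_a) * r_b
            hold_a *= 1.0 + r_a
            hold_b *= 1.0 + r_b
        cagr_rebalanced = rebalanced ** (1.0 / years) - 1.0
        cagr_hold = (hold_a + hold_b) ** (1.0 / years) - 1.0
        total_bonus += cagr_rebalanced - cagr_hold
    return total_bonus / trials


for corr in (0.0, 0.5, 0.9):
    bonus = simulate_rebalancing_bonus(0.08, 0.08, 0.20, 0.20, corr)
    print(f"correlation {corr:.1f}: average bonus {bonus:.5f}")
```

Setting `corr` to 0 corresponds to the no-correlation assumption mentioned above.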
Market timing
In 1975, Nobel laureate William Sharpe published a study titled “Likely Gains from Market Timing”. In this paper, Sharpe reportedly found that a market timer who switches between 100% stocks and 100% T-bills on an annual basis must be correct about 74% of the time (on average) to beat the market.^{[3]}
This Monte Carlo simulation uses a simple market timing strategy to determine the market timing accuracy required to outperform buy-and-hold. A comparison of market timing to buy-and-hold in terms of both total returns and risk-adjusted returns (measured by the Sharpe Ratio) is performed. The analysis shows that a surprisingly high (and unlikely) degree of accuracy is necessary to beat the market return through market timing.^{[2]}
The factor data is obtained from the Kenneth R. French - Data Library, under Fama/French Factors (direct link). Be sure to remove the file headers and the extra data sets (starting around row 1046). Save the file as FF_Factors_annual.txt.
This simulation takes a long time to run. To start, reduce the number of iterations from 10,000 to a lower number (such as 100) and ensure that the analysis is functioning correctly.
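The structure of such a simulation can be sketched in Python as follows. This is not the Octave/MATLAB code from The Calculating Investor, and it does not read FF_Factors_annual.txt. Instead, it assumes normally distributed stock returns and a fixed T-bill rate, with hypothetical parameter values. Each year the timer holds the better asset with probability `accuracy` and the worse one otherwise. The `trials` parameter plays the role of the iteration count.

```python
import random


def simulate_market_timing(accuracy, years=30, trials=2000,
                           stock_mean=0.10, stock_sd=0.20,
                           bill_rate=0.03, seed=1):
    """Fraction of trials in which the timer ends ahead of buy-and-hold stocks."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        timer_wealth = 1.0
        hold_wealth = 1.0
        for _ in range(years):
            # Synthetic stock return, floored at -99% (sketch simplification).
            stock = max(rng.gauss(stock_mean, stock_sd), -0.99)
            best = max(stock, bill_rate)
            worst = min(stock, bill_rate)
            # A correct call picks the better asset for the year.
            chosen = best if rng.random() < accuracy else worst
            timer_wealth *= 1.0 + chosen
            hold_wealth *= 1.0 + stock
        if timer_wealth > hold_wealth:
            wins += 1
    return wins / trials


for accuracy in (0.5, 0.6, 0.7, 0.8, 0.9):
    share = simulate_market_timing(accuracy, trials=500)
    print(f"accuracy {accuracy:.0%}: timer beats buy-and-hold in {share:.0%} of trials")
```

Starting with a small `trials` value and increasing it once the output looks sensible follows the same advice given for the original simulation.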
Python
Python is a rapid-development scripting language that is suitable for many tasks. Add-in libraries such as NumPy and pandas make it easy to do financial analysis. There are many IDEs available for Python.
The asset correlation analysis described above is available in Python. See finiki for the source code.
Forum member AlohaJoe has written several scripts:^{[4]}
 See if using CAPE10 to market time rebalancing makes a difference when using monthly data
 Build a bond fund simulator that allows me to simulate historical performance of bonds
 Extend that to do the same for Japanese government bonds, to see how bonds outside the US have performed
 Compare 18 different withdrawal strategies in retirement
 Compare 8 different "harvesting" strategies during retirement
 Compare the relative importance of early "bad returns" and "high inflation" on retirement
 Calculate safe withdrawal rates in Japan for various portfolio allocations
 Compare various income smoothing strategies in retirement
 Calculate how many months of gains are wiped out by various market corrections
Additional Python code and techniques, such as extracting financial data from Yahoo!, can be found in this Bogleheads® forum topic: Anyone use Python to analyze their finances?
Forum member Simbilis has written a Python script that scrapes the Thrift Savings Plan website to produce a CSV (comma-separated values) file that can be imported into Quicken.
Ongoing support and discussion can be found in this forum thread: TSP share prices for Quicken
TSP share prices for Quicken 

#!/usr/bin/python2

from urllib2 import urlopen
from urllib import urlencode
import csv
from datetime import datetime, timedelta, date
from string import lstrip

# Map TSP fund names (CSV column headers) to the ticker symbols used in Quicken.
fundTag = {
    'L Income' : 'TSPLINCOME',
    'L 2020' : 'TSPL2020',
    'L 2030' : 'TSPL2030',
    'L 2040' : 'TSPL2040',
    'L 2050' : 'TSPL2050',
    'G Fund' : 'TSPGFUND',
    'F Fund' : 'TSPFFUND',
    'C Fund' : 'TSPCFUND',
    'S Fund' : 'TSPSFUND',
    'I Fund' : 'TSPIFUND'}

priceHistoryFile = 'tspQuicken.csv'

# Read the date of the last saved price (last row, third column).
# Fall back to a default start date if the file is missing or empty.
lastDate = ''
try:
    quickenReader = csv.reader(open(priceHistoryFile, 'r'))
    lastDate = [row for row in quickenReader][-1][2]
except (IOError, IndexError):
    lastDate = '06/01/2003'

# Request prices from the day after the last saved date through today.
startDate = (datetime.strptime(lastDate, '%m/%d/%Y') + timedelta(1)).strftime('%m/%d/%Y')
endDate = date.today().strftime('%m/%d/%Y')
if lastDate == endDate:
    print 'already have prices through', endDate
    exit()

print 'checking for new prices starting on', startDate
tspSharePricePageUrl = 'https://www.tsp.gov/investmentfunds/shareprice/sharePriceHistory.shtml'
postData = urlencode({'startdate' : startDate, 'enddate' : endDate, 'whichButton' : 'CSV'})
page = urlopen(tspSharePricePageUrl, postData)

# Parse the returned CSV; the first non-empty row holds the fund names.
reader = csv.reader(page)
rows = [row for row in reader if len(row) > 0]
tagRow = rows[0]

writer = csv.writer(open(priceHistoryFile, 'a'))
# Walk the data rows from last to first, skipping the header row.
for row in rows[:0:-1]:
    # The input date format must match the dates in the downloaded CSV.
    currDate = datetime.strptime(row[0], '%Y%m%d').strftime('%m/%d/%Y')
    newRows = []
    for i in range(1, len(row)):
        tag = lstrip(tagRow[i])
        if tag in fundTag:
            try:
                price = float(row[i])
            except ValueError:
                # Skip blank or non-numeric price cells.
                continue
            newRows.append([fundTag[tag], price, currDate])
            print 'found', [fundTag[tag], price, currDate]

    # Append this date's prices as ticker, price, date.
    writer.writerows(newRows)
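The listing above targets Python 2. As a rough sketch of how its conversion step might look in Python 3, the function below turns share-price CSV text into Quicken-style rows without any network access. The function name `to_quicken_rows`, the shortened fund mapping, and the sample input are hypothetical. The input layout and the default date format are assumptions taken from the listing and may need to be changed to match the real download.

```python
import csv
import io
from datetime import datetime

# Shortened, illustrative subset of the fund-name-to-ticker mapping.
FUND_TAG = {"G Fund": "TSPGFUND", "C Fund": "TSPCFUND"}


def to_quicken_rows(csv_text, fund_tag=FUND_TAG, date_format="%Y%m%d"):
    """Convert share-price CSV text into [ticker, price, MM/DD/YYYY] rows."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header = [name.strip() for name in rows[0]]
    output = []
    # Walk the data rows from last to first (assumes newest-first input).
    for row in reversed(rows[1:]):
        parsed = datetime.strptime(row[0].strip(), date_format)
        date_text = parsed.strftime("%m/%d/%Y")
        for name, value in zip(header[1:], row[1:]):
            if name not in fund_tag:
                continue
            try:
                price = float(value)
            except ValueError:
                continue  # skip blank or non-numeric cells
            output.append([fund_tag[name], price, date_text])
    return output


# Hypothetical sample input; the layout is assumed, not captured from tsp.gov.
sample = "date, G Fund, C Fund\n20210217, 16.5, 58.25\n20210216, 16.4,\n"
for line in to_quicken_rows(sample):
    print(line)
```

Keeping the parsing separate from the download makes the conversion easy to check against a saved CSV file before pointing the script at the live site.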

LibreOffice
LibreOffice is a free and open source office suite that can serve as a Microsoft Office replacement. Spreadsheets are probably the most widely used financial analysis software.
Notes
 ↑ See Matlab Clones for a detailed overview.
 ↑ The example covariance matrix could be replaced with a real covariance matrix. For example, the R script for downloading Yahoo! Finance returns could be used to get returns for a set of assets, and the covariance matrix could then be calculated and used as the input to these scripts, though the scaling would probably need to be changed.
See also
References
 ↑ Bogleheads® forum post: Re: HEDGEFUNDIE's excellent adventure [risk parity strategy using 3x leveraged ETFs], NotTooDeepLearning. May 25, 2019
 ↑ ^{2.0} ^{2.1} ^{2.2} camontgo PM to LadyGeek.
 ↑ Market Timing: How good is good enough?, from The Calculating Investor
 ↑ Bogleheads® forum post: Re: Anyone use Python to analyze their finances?
External links
 The R Project for Statistical Computing
 RStudio
 MATLAB - The Language of Technical Computing
 MATLAB (on Wikipedia)
 Matlab Clones
 GNU Octave
 [Wiki] - Using open-source software for portfolio analysis, forum discussion.
