I’m a big fan of coding in Python for finance. However, I have to admit (perhaps grudgingly) that many of the Python tools I find useful for finance are based on ideas borrowed from R. If you’re using R for finance, or for data science more broadly, which R tools and packages should you know about?
tidyverse
The tidyverse isn’t actually a single package. Instead, it’s a collection of packages that are useful for finance and for data science more broadly, a bit like the SciPy stack in Python. Of these, ggplot2 is probably the best known. It lets you visualise data in a declarative way, and it creates great-looking plots without you having to fiddle around with all the parameters. There’s also dplyr, which helps you slice and dice data, and tidyr, which (as the name suggests) tidies your data into a consistent form. To ingest data from CSVs and similar files, there’s readr. Finally, purrr lets you use functional-programming-style constructs.
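To make the division of labour between these packages a bit more concrete, here is a minimal sketch. The tickers AAA and BBB and all the prices are made up for illustration, and the variable names are my own rather than anything prescribed by the packages.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)

# Hypothetical wide table of daily closing prices (made-up numbers)
prices_wide <- tibble::tibble(
  date = as.Date("2024-01-01") + 0:4,
  AAA  = c(100, 101, 99, 102, 103),
  BBB  = c(50, 50.5, 51, 50, 52)
)

# tidyr: reshape to one row per (date, ticker) observation
prices_long <- prices_wide %>%
  pivot_longer(cols = -date, names_to = "ticker", values_to = "close")

# dplyr: simple daily returns, computed separately for each ticker
returns <- prices_long %>%
  group_by(ticker) %>%
  arrange(date, .by_group = TRUE) %>%
  mutate(ret = close / dplyr::lag(close) - 1) %>%
  ungroup()

# purrr: apply a function to each price column, returning a named numeric vector
mean_close <- map_dbl(prices_wide[c("AAA", "BBB")], mean)

# ggplot2: declare how columns map to aesthetics, then add a geometry
# (call print(p) to render the chart)
p <- ggplot(prices_long, aes(x = date, y = close, colour = ticker)) +
  geom_line()
```

With real data, the made-up tibble would typically be replaced by a call to readr's read_csv() on your own file.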
xts (and zoo)
In finance, we often deal with time series, which usually consist of price data. Hence, having an easy-to-use way to manipulate and process time series is key for most financial analysis. The xts package extends the zoo time series package. It helps you create xts objects, which are like normal R matrices but also have a time index. We can then do time series operations, such as joining different time series together, adding and subtracting them, filling missing values, lagging them and so on.
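The operations listed above can be sketched in a few lines. The two series here are made-up numbers on made-up dates, purely to show the mechanics of joining, filling, lagging and subtracting.

```r
library(xts)

dates <- as.Date("2024-01-01") + 0:4

# Two hypothetical series: 'a' has a missing value, 'b' is only observed every other day
a <- xts(c(100, 101, NA, 103, 104), order.by = dates)
b <- xts(c(50, 51, 52), order.by = dates[c(1, 3, 5)])

# Join on the time index; an outer join keeps every date seen in either series
joined <- merge(a, b, join = "outer")
colnames(joined) <- c("a", "b")

# Fill missing values by carrying the last observation forward
filled <- na.locf(joined)

# Lag by one period, so each date sees the previous observation.
# stats::lag is called explicitly so dispatch still reaches the xts method
# even if another package (such as dplyr) masks lag().
lagged <- stats::lag(filled$a, k = 1)

# Arithmetic lines the two series up on their time index
spread <- filled$a - filled$b
```

Note that the xts method for lag() shifts values forward in time for positive k, which is the opposite of the convention used for plain zoo objects.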
quantmod
Let’s say you want to backtest a trading strategy. Typically, this involves a lot of steps, including loading market data, constructing a signal and then calculating returns. Lastly, we want to display the results to the user. Most of these steps are pretty repetitive. The main difference between backtests is how you generate the signal, and that’s the part you want to research. quantmod does a lot of the repetitive parts of the backtest, letting you concentrate on the “fun” bit, namely generating the actual trading signals. You can use quantmod with TTR, which implements a lot of technical indicators and associated trading rules.
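The steps described above can be sketched as follows. To keep the example runnable offline, it uses a simulated price series in place of a quantmod download (in practice you would probably load data with something like quantmod's getSymbols), and it builds the signal directly from TTR's SMA and ROC functions. The moving average crossover rule and the 10/50 day windows are my own illustrative choices, not a recommendation.

```r
library(xts)
library(TTR)

# Step 1: market data. A hypothetical simulated random walk, not real prices.
set.seed(42)
n <- 250
dates <- seq(as.Date("2023-01-02"), by = "day", length.out = n)
prices <- xts(100 * cumprod(1 + rnorm(n, mean = 0.0003, sd = 0.01)),
              order.by = dates)
colnames(prices) <- "close"

# Step 2: signal. Long (1) when the fast average is above the slow one, else flat (0).
fast <- SMA(prices, n = 10)
slow <- SMA(prices, n = 50)
raw_signal <- as.numeric(fast > slow)
raw_signal[is.na(raw_signal)] <- 0   # no position until both averages exist

# Step 3: returns. Trade on yesterday's signal to avoid look-ahead bias.
position <- c(0, head(raw_signal, -1))
asset_ret <- as.numeric(ROC(prices, n = 1, type = "discrete"))
asset_ret[is.na(asset_ret)] <- 0     # the first day has no prior price
strategy_ret <- position * asset_ret

# Step 4: results. Cumulative growth of one unit of capital.
equity <- xts(cumprod(1 + strategy_ret), order.by = index(prices))
cat(sprintf("Final equity: %.3f\n", as.numeric(tail(equity, 1))))
```

Calling plot(equity) would chart the equity curve. Only Step 2 changes when you research a different strategy, which is exactly the repetitive structure a backtesting package can take off your hands.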
tidyquant
Under the hood, tidyquant sits on top of several different packages: zoo and xts for time series, plus quantmod, TTR and PerformanceAnalytics, which it wraps. tidyquant makes it easier to use these packages with the tidyverse.
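As a rough sketch of what this looks like in practice, the example below applies a TTR indicator and a quantmod return calculation to an ordinary grouped tibble using tidyquant's tq_mutate and tq_transmute. The symbols and prices are made up, and the column names passed to col_rename are my own choices.

```r
library(tidyquant)
library(dplyr)

# Hypothetical tidy price table: two made-up symbols, 60 daily observations each
dates <- as.Date("2024-01-01") + 0:59
prices <- tibble::tibble(
  symbol = rep(c("AAA", "BBB"), each = 60),
  date   = rep(dates, times = 2),
  close  = c(100 + 0:59, 50 + 0.5 * (0:59))
)

# tq_mutate adds the output of an xts-based function as a new column.
# Here TTR's SMA is applied to 'close' separately for each symbol.
with_sma <- prices %>%
  group_by(symbol) %>%
  tq_mutate(select = close, mutate_fun = SMA, n = 5, col_rename = "sma_5") %>%
  ungroup()

# tq_transmute returns only the new data, which suits functions that change
# the periodicity, such as quantmod's periodReturn.
monthly <- prices %>%
  group_by(symbol) %>%
  tq_transmute(select = close, mutate_fun = periodReturn,
               period = "monthly", col_rename = "monthly_ret") %>%
  ungroup()
```

The data stays in a tibble throughout, so the results can be passed straight on to dplyr or ggplot2 without converting to and from xts by hand.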
R is still pretty popular in finance, and definitely has a lot of use cases. If you’re interested in getting a nice introduction to using R in finance, I’d recommend the new book, Reproducible Finance with R by Jonathan Regenstein. The book discusses the above packages in much more detail, with some specific financial examples.
Saeed Amen is the founder of Cuemacro. Over the past decade and a half, Saeed Amen has developed systematic trading strategies at major investment banks including Lehman Brothers and Nomura. He is the author of Trading Thalesians: What the ancient world can teach us about trading today (Palgrave Macmillan) and is currently co-authoring The Book of Alternative Data (Wiley) with Alex Denev. Through Cuemacro, he now consults and publishes research for clients in the area of systematic trading. He has developed many Python libraries including finmarketpy and tcapy for transaction cost analysis. His clients have included major quant funds and data companies such as Bloomberg. He has presented his research at many conferences and institutions including the ECB and the Fed. He is also a co-founder of the Thalesians.