traims' blog

AI, machine learning, data analysis; complex networks, natural language processing. #DataMining #MachineLearning #recsys #python #ruby #rstats

My friends who do research in economics and finance sometimes ask me which Python libraries they should look into. The most obvious choices are NumPy and SciPy. In this post, I would like to describe two other libraries that are less well known in the finance community: pandas and QSTK.

pandas: Python Data Analysis Library

Pandas is a powerful tool for data analysis in Python. Some of its features are specifically tailored for finance applications.

Let’s look at a simple example. Suppose we would like to download stock price data for Apple (AAPL) from Yahoo Finance. This can be done in a single line of Python code, after which we can print what we have downloaded.
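To give a feel for what working with such price data looks like, here is a minimal sketch. It is not the notebook code from this post: the prices are synthetic stand-ins generated with NumPy (an assumption, so that the example runs offline), and the names `prices`, `daily_returns` and `monthly_close` are made up for illustration. The remote data reader that used to ship with pandas has since moved to the separate pandas-datareader package, so the download step is left out here.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for downloaded daily closing prices (NOT real AAPL data).
rng = np.random.default_rng(seed=0)
dates = pd.bdate_range("2012-01-02", periods=250)  # business days only
daily_returns = rng.normal(loc=0.0005, scale=0.02, size=len(dates))
prices = pd.DataFrame(
    {"Close": 100.0 * np.cumprod(1.0 + daily_returns)},
    index=dates,
)
prices.index.name = "Date"

# Inspect the first rows, as one would after a real download.
print(prices.head())

# Typical finance-flavoured operations on a date-indexed frame:
# daily percentage returns and a 20-day moving average of the close.
prices["Return"] = prices["Close"].pct_change()
prices["MA20"] = prices["Close"].rolling(window=20).mean()

# Last closing price of each calendar month, grouped on the date index.
monthly_close = prices["Close"].groupby(prices.index.to_period("M")).last()
print(monthly_close.head())
```

With real data the only change would be how `prices` is created; the rest relies on the frame having a date index and a closing-price column.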

Let’s take a look at the data that we have downloaded (the output of line 7):

And here is our plot, which shows how the price changes over time (the output of line 10):

View and download the code as an IPython Notebook

For a detailed review of the features, you can take a look at the library documentation, or at the following materials:

QSTK: QuantSoftware Toolkit

QSTK is an open-source library for portfolio construction and management. It seems to be a mostly educational tool, which is also useful for rapid prototyping.

“We are building the QSToolKit primarily for finance students, computing students, and quantitative analysts with programming experience. You should not expect to use it as a desktop app trading platform. Instead, think of it as a software infrastructure to support a workflow of modeling, testing and trading.” (from the QSTK Wiki)

The only reason I mention this library is that it was required for the programming assignments in the Computational Investing MOOC on Coursera. The course was prepared by Dr. Tucker Balch of Georgia Tech, who is a lead developer of QSTK.
