Five things you need to know before learning Python for a finance job


Python programmers are in hot demand in banks and hedge funds. Fortunately, the language is easy to learn – it’s often used in UK primary schools to teach the basics of programming. However, before your first encounter with Python there are a few things you should know, particularly if you want to use it in a finance context.

Python 3 or Python 2?

When new versions of languages come out they usually involve incremental updates, and are backward compatible with earlier versions. This means that all your existing code will still work. Not so with Python version 3, which included substantial changes and doesn’t play at all well with Python 2. Weirdly, Python 2 continued to be supported for years afterwards, with over 30 updates released since Python 3 came out nearly ten years ago. New releases of Python 2 have now stopped, but it’s still widely used in the finance industry.

Given the choice, Python 3 is definitely the vintage to use on any new projects, and the newer the better – all the cool new features of later versions are worth having. However, you could find yourself working with legacy Python 2 code, so it’s important to be fluent in both variants if you want to apply Python in a finance job.
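To give a flavour of the incompatibilities, here is a minimal sketch of three well-known differences, written for Python 3 with comments noting what Python 2 did instead. The variable names and values are purely illustrative.

```python
# Python 3 code; the comments describe the Python 2 behaviour you may
# meet in legacy code.

total = 7
count = 2

# In Python 3, / is always true division.
# In Python 2, 7 / 2 between two ints silently truncated to 3.
average = total / count

# Floor division is spelled // in both versions.
whole = total // count

# In Python 3, str holds Unicode text and bytes are a separate type.
# In Python 2, plain str was a byte string and text needed a u"" prefix.
name = "caf\u00e9"
encoded = name.encode("utf-8")

# print is a function in Python 3; the Python 2 statement form
# 'print average' is a syntax error here.
print(average, whole, len(name), len(encoded))
```

Differences like these are why Python 2 code rarely runs unmodified under Python 3.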

Batteries not included, but easy to get

You also need to know that the core Python language is pretty lightweight. You’ll need to import modules from the standard library that ships with Python if you want to do anything interesting. These modules incorporate functions to perform most mathematical operations, deal with calendars, import and handle data, and do common system tasks.
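As a rough illustration of how far imports from the standard library alone can take you, the sketch below reads some prices, computes log returns, and does a little calendar arithmetic. The prices and dates are made up for the example, and the weekday logic deliberately ignores market holidays.

```python
import csv
import io
import math
import statistics
from datetime import date, timedelta

# Hypothetical closing prices, inlined as CSV text so the sketch is
# self-contained; in practice this would come from a file or a feed.
raw = """date,close
2024-01-02,100.0
2024-01-03,101.0
2024-01-04,99.5
2024-01-05,102.0
"""

rows = list(csv.DictReader(io.StringIO(raw)))
closes = [float(row["close"]) for row in rows]

# Daily log returns: ln(P_t / P_{t-1})
log_returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]

mean_return = statistics.mean(log_returns)
volatility = statistics.stdev(log_returns)  # sample standard deviation

# Calendar arithmetic: step forward to the next weekday after the last
# observation (weekends only; holidays are not handled).
last_day = date.fromisoformat(rows[-1]["date"])
next_day = last_day + timedelta(days=1)
while next_day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
    next_day += timedelta(days=1)

print(f"mean={mean_return:.5f} vol={volatility:.5f} next={next_day}")
```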

However, the real power of Python comes when you start downloading some of the numerous, and freely available, third-party libraries. For serious finance work you’ll need numpy (to handle operations on large arrays), scipy (advanced statistical and mathematical functions), and matplotlib (data visualisation). Data scientists interested in machine learning will probably want to look at tensorflow. Pandas is a must for data manipulation, and has a solid finance pedigree – it was originally developed at giant hedge fund AQR Capital Management.

Lazier Python users might want to check out the Anaconda distribution, which bundles most of the above packages and more in a neat pre-packaged environment.

Python is slow. But it’s easy to blend it with C

Programmers who are used to the lightning speed of C or C++, or the relatively fast pace of Julia or Java, will find Python somewhat sluggish (although it’s still a bit quicker than R and Matlab, both popular languages in quantitative finance).

Programmers love to brag about how fast and efficient their code is, but most code doesn’t have to run that quickly. However, Python will definitely be too slow when it comes to functions that are run repeatedly over large datasets, or latency-sensitive trading algorithms.

Fortunately it’s extremely easy to write speedy C or C++ functions, and then embed these into your Python modules. Learn how to do this too.
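One way to get a feel for the mechanics is the standard library's ctypes module, which can call functions in a compiled C library directly. The sketch below is only an illustration, not the one true approach: it assumes a POSIX system (Linux or macOS) and borrows the existing `erf` function from the C maths library rather than compiling new C code. The `normal_cdf` helper is a name invented for the example.

```python
import ctypes
import ctypes.util

# Locate and load the C maths library.
# Assumption: a POSIX system; on Windows a different DLL would be needed.
libm_path = ctypes.util.find_library("m")
# Fall back to the symbols already loaded into the process if no path is found.
libm = ctypes.CDLL(libm_path) if libm_path else ctypes.CDLL(None)

# Declare the C signature so ctypes converts values correctly:
#   double erf(double x);
libm.erf.argtypes = [ctypes.c_double]
libm.erf.restype = ctypes.c_double


def normal_cdf(x):
    """Standard normal CDF, with erf evaluated by the C library."""
    return 0.5 * (1.0 + libm.erf(x / 2 ** 0.5))


print(normal_cdf(0.0), normal_cdf(1.96))
```

A C function of your own would be compiled into a shared library and loaded the same way. Calling C one value at a time like this carries overhead on every call, so the speed-up comes from moving a whole loop into C, not from wrapping a single arithmetic operation.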

Python likes Big Data

Finance firms looking for an edge in today's markets are focusing on new sources of data. These alternative data sources have one thing in common – they’re big. Using Twitter feed data to predict market sentiment is a cool idea, but there are about 500 million new tweets every day. That’s an awful lot of data to store, process, and analyse.

Fortunately, Python fits neatly into the big data ecosystem with packages you can use to talk to Spark and Hadoop. Python also has APIs for NoSQL databases like MongoDB, and for all major providers of cloud storage.

Don’t be scared of the GIL

The Global Interpreter Lock – or GIL to its many enemies – is the infamous Achilles heel of Python. Only one thread can be executed by the interpreter at any one time, creating a bottleneck that slows down execution and doesn’t take advantage of modern multi-core CPUs. However, the GIL rarely causes problems in practice. Most real-world programs spend more time waiting for input or output than computing, and a thread that is blocked waiting on I/O releases the GIL so that other threads can run.
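The sketch below illustrates the point about waiting, under some loud assumptions: `fetch_quote` is a made-up stand-in for a network call, the tickers are hypothetical, and `time.sleep` plays the part of blocking I/O (it releases the GIL while it waits, as blocking I/O does).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_quote(ticker):
    """Hypothetical stand-in for a slow network request."""
    time.sleep(0.2)       # the GIL is released while this thread waits
    return ticker, 100.0  # dummy price


tickers = ["AAA", "BBB", "CCC", "DDD"]  # made-up symbols

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=len(tickers)) as pool:
    quotes = dict(pool.map(fetch_quote, tickers))
elapsed = time.perf_counter() - start

# The four waits overlap, so the total is close to one wait
# rather than the sum of all four.
print(f"{len(quotes)} quotes in {elapsed:.2f}s")
```

Because the threads spend their time waiting, not computing, the GIL does not stop them from overlapping. The same pattern would not speed up four CPU-bound calculations.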

Large, computationally intensive operations can be affected by the GIL, but only a masochist would try to run these on a desktop machine or laptop. It makes more sense to parallelize your code, and then farm it out to your local cluster or friendly provider of cloud computing.

Robert Carver is a former head of fixed income at quantitative hedge fund AHL, and the author of 'Systematic Trading' and 'Smart Portfolios'. Since learning to code at the age of seven he has learned, and mostly forgotten, over 30 programming languages. He uses Python every day.

