Wangari Digest

Excel spreadsheets are dead for Big Data. Companies need more Python instead


Data is getting too complicated for Excel to keep up

Ari Joury
Nov 15, 2024

TLDR: Excel spreadsheets are holding companies back in today’s data-driven world. As corporate data becomes increasingly complex, traditional tools like Excel — even the AI-enhanced versions — struggle with issues of scalability, accuracy, and collaboration. Python, on the other hand, offers automation, powerful data handling, and advanced analytics capabilities that are essential for modern corporate reporting, data collection, and data analysis. This article explores why it’s time to embrace Python and leave outdated tools behind.

Python is taking over what used to be Excel-dominated terrains. Image generated with Leonardo AI

Spreadsheets are bogging us down. For all the reliance on Excel in the corporate world, clinging to it is like trying to run a Formula 1 race in a broken-down car. Sure, it is familiar and widespread. And, to be fair, it does work for many tasks ranging from simple data extraction in investment banking to fairly complex insurance pricing models.

Nevertheless, once data runs to many thousands of entries, with interrelated tables and complex clusters, using Excel can get downright dangerous. Take this: Excel's row limit is infamous, leading to high-profile disasters like the UK's COVID-19 data mishap in 2020, where thousands of test results were missed due to Excel’s constraints.

Or consider the untold hours wasted double-checking manual entries, only to end up with reports that can still fall victim to human error. The truth is that Excel is dead weight when you are dealing with data analysis from a certain level of complexity onwards.

In an age where even simple consumer apps handle complex data faster and with more precision, why are we still dealing with the limitations of the world’s most infamous spreadsheet tool?

Enter Python, a tool that actually fits the complexity and volume of today’s corporate data. With Python and its many useful libraries, data cleaning, transformation, and analysis are streamlined, automated, and accurate. Python might seem intimidating to people unfamiliar with it, but it is not just for coders. It is really worth the learning curve for anyone tired of bending over backward for a spreadsheet.

In this article, we explore the cases in which Excel cannot keep up, why using Python within Excel is not enough, how Python deals with what Excel cannot do, and exactly how you can start transforming your data analysis game with Python.

Data in Excel does not scale

Broadly speaking, there are two scenarios in which Excel cannot win: when dealing with large data volumes, and when the data reaches a certain level of complexity.

Excel's capacity to work with lots of data is limited. Modern versions support up to 1,048,576 rows and 16,384 columns per worksheet. 
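To make these numbers concrete, here is a minimal sketch of a check that tells you whether a table of a given shape would fit on a single worksheet. The function name `fits_in_worksheet` and the `header_rows` parameter are made up for illustration; only the two limits come from the figures above.

```python
# Worksheet limits of modern Excel versions, as quoted in the text above.
EXCEL_MAX_ROWS = 1_048_576
EXCEL_MAX_COLS = 16_384


def fits_in_worksheet(n_rows: int, n_cols: int, header_rows: int = 1) -> bool:
    """Return True if a table of this shape fits on one worksheet.

    header_rows is a hypothetical convenience: a header line also
    occupies a worksheet row, so it counts against the limit.
    """
    return (n_rows + header_rows) <= EXCEL_MAX_ROWS and n_cols <= EXCEL_MAX_COLS


print(fits_in_worksheet(500_000, 40))    # fits comfortably
print(fits_in_worksheet(1_048_576, 40))  # data alone fits, but the header row does not
print(fits_in_worksheet(2_000_000, 40))  # too many rows
```

A check like this is cheap to run before exporting anything to a spreadsheet, so that rows are never dropped silently.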

This might seem ample for some readers, but many organizations now handle datasets that far exceed these limits. In practice, you should stay well below these limits: long before you reach them, Excel gets slow, unresponsive, and crashes more often (ask me, I’ve tried it).

In contrast, Python can easily handle tens, even hundreds of millions of rows (yes, I’ve done that in the past). Packages like Pandas, a Python library for tabular data, make this rather efficient. The main bottleneck is the capacity of your computer, memory in particular; performance issues with Python and Pandas themselves are rarely a concern.
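As a rough illustration, the sketch below builds a synthetic table with about twice as many rows as an Excel worksheet can hold and summarizes it by group. It assumes Pandas and NumPy are installed; the column names `region` and `amount` and the random data are invented for the example, and how fast it runs depends on your machine.

```python
import numpy as np
import pandas as pd

N_ROWS = 2_000_000  # roughly twice Excel's per-worksheet row limit

rng = np.random.default_rng(seed=42)
regions = ["north", "south", "east", "west"]

# Synthetic data: a categorical column keeps memory usage low.
df = pd.DataFrame({
    "region": pd.Categorical.from_codes(
        rng.integers(0, len(regions), size=N_ROWS), categories=regions
    ),
    "amount": rng.uniform(0, 1_000, size=N_ROWS),
})

# One line replaces what would be a pivot table in a spreadsheet.
summary = df.groupby("region", observed=True)["amount"].agg(["count", "mean", "sum"])

print(len(df))
print(summary)
```

The same pattern applies to real data loaded with `pd.read_csv`; for files that do not fit in memory, reading in chunks is one option, though that goes beyond this sketch.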

Data complexity is a bit harder to quantify. As a rule of thumb, if your formulas start exceeding the page width of your screen, then it is probably too much for Excel. You would be surprised how quickly one might run into such complex formulas: Even with simple exploratory data analysis you can quickly run into more lengthy code, and once you need to perform Bayesian inference or advanced methods you really should not use Excel anymore.

From a practical point of view, complex data should always be handled with caution, and Excel cells can be a tad too easy to edit (and mess up). Python code, on the other hand, is easier to track, version-control (for example using Git or similar technologies), and re-run when needed.
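To illustrate what "easier to track and re-run" can look like, here is a small sketch of a cleaning step written as a plain function, using only the standard library. The sample data, the column names, and the rule of skipping non-numeric values are assumptions made for this example, not a recommendation for every dataset.

```python
import csv
import io

# Hypothetical raw export with the kind of mess manual entry produces.
RAW = """customer,revenue
Acme, 1200
Beta,n/a
acme ,300
"""


def clean_rows(text: str) -> list[dict]:
    """Parse CSV text, normalise customer names, drop non-numeric revenue."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        name = row["customer"].strip().title()
        try:
            revenue = float(row["revenue"])
        except ValueError:
            continue  # skip values that cannot be parsed as a number
        cleaned.append({"customer": name, "revenue": revenue})
    return cleaned


rows = clean_rows(RAW)
total = sum(r["revenue"] for r in rows)
print(rows)
print(total)
```

Because every rule lives in code rather than in hand-edited cells, the file can be committed to Git, reviewed line by line, and re-run on next month's export with the same result for the same input.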

Python can be used within Excel, but that is not enough

Python can actually be used from within Excel — this has been a feature since 2023. Aside from Python, VBA is also available and widely used by Excel wizards.

To be fair, this does make things a little easier than working with plain Excel. Nevertheless, even if you use Python-within-Excel, the row and column limits persist. If you are building more complex algorithms, you will still run into performance issues long before those limits are reached.

On top of that, many of the Python packages that make data analysis so delightful are currently not available in Excel. Also, interactive data analysis is not very easy to perform; there is just too much Ctrl+Shift+F5 and other finger gymnastics going on to make this worth the investment.

If these were not enough downsides, it turns out that embedding Python scripts within Excel files can lead to maintenance difficulties. Collaborators must ensure they have compatible environments, and version control becomes more complex. (This is something that Python-without-Excel is really good for, and now we’ve made it harder than basic Excel!) The obvious result, apart from a huge waste of time, is inconsistencies and errors, particularly in collaborative settings.

In short, Python-within-Excel is hated for a reason. Perhaps it will get better in the future, but for now it is really not worth the pain.

Python is much easier to automate than Excel

Let’s go back to comparing regular Python scripts with Excel-without-Python sheets. Not only does Excel run into scalability issues; existing data analysis workflows are also harder to automate.
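As one small, hedged example of what automation can mean here, the sketch below sums a column across every CSV file in a folder, a task that in a spreadsheet typically means opening each file by hand. The function name `summarize_folder`, the `amount` column, and the file names are invented for the demo, which writes two throwaway files to a temporary directory.

```python
import csv
import tempfile
from pathlib import Path


def summarize_folder(folder: Path) -> dict[str, float]:
    """Sum the 'amount' column of every CSV file in a folder."""
    totals = {}
    for path in sorted(folder.glob("*.csv")):
        with path.open(newline="") as handle:
            totals[path.stem] = sum(
                float(row["amount"]) for row in csv.DictReader(handle)
            )
    return totals


# Demo with two throwaway files standing in for monthly exports.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "january.csv").write_text("amount\n10\n20.5\n")
    (folder / "february.csv").write_text("amount\n5\n")
    totals = summarize_folder(folder)

print(totals)
```

Once a step like this exists as a script, it can be scheduled or re-run whenever new files arrive, without anyone touching a cell.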

© 2025 Ari Joury