What’s the coolest thing you’ve programmed with Python?

A Python library for data science that works like a pandas dataframe but can handle arbitrary data sizes. It is available as open-source software (https://github.com/omegaml/omegaml).

Problem: In-memory data analysis quickly becomes complex and has significant drawbacks when data > RAM

Pandas is a library for statistical data processing. It only works “in memory”, i.e. the data must be fully loaded into RAM. Advantage: very fast. Disadvantage: it only works if the data fits in RAM. Beyond that, things quickly become complicated and require complex infrastructure.

Example – Calculate sales per product:

• df is a dataframe of all sales
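As a minimal sketch of this kind of calculation in plain pandas: the data and the column names `product` and `sales` are assumptions made up for illustration, not taken from the original example.

```python
import pandas as pd

# Hypothetical sales data; column names and values are illustrative assumptions.
df = pd.DataFrame({
    "product": ["apple", "pear", "apple", "plum", "pear"],
    "sales": [10.0, 4.0, 5.0, 7.0, 6.0],
})

# Total sales per product, computed entirely in memory.
sales_per_product = df.groupby("product")["sales"].sum()
print(sales_per_product)
```

This works well as long as `df` fits into RAM, which is exactly the limitation described above.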

Solution “hybrid in-memory” – combination of pandas with a horizontally scalable database (MongoDB)

With my library omega|ml [1] you can execute exactly the same statement; the only difference is that it is translated into MongoDB syntax and executed directly in the MongoDB cluster.
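To give a rough idea of what such a translation can look like, here is a sketch that builds a MongoDB aggregation pipeline for a "sum per group" statement. The helper `groupby_sum_pipeline` and the field names are hypothetical and only illustrate the idea; this is not omega|ml's actual implementation.

```python
def groupby_sum_pipeline(group_col, value_col):
    """Build a MongoDB aggregation pipeline that sums value_col per group_col.

    Hypothetical helper for illustration; not part of omega|ml.
    """
    return [
        # Group documents by the key field and sum the value field per group.
        {"$group": {"_id": f"${group_col}", value_col: {"$sum": f"${value_col}"}}},
        # Sort by group key, mirroring pandas' default sorted groupby output.
        {"$sort": {"_id": 1}},
    ]


pipeline = groupby_sum_pipeline("product", "sales")
print(pipeline)

# With a MongoDB driver such as pymongo, a pipeline like this could be passed
# to collection.aggregate(pipeline), so the work happens inside the database.
```

The point of the design is that only the aggregated result travels back to the client, so the raw data never has to fit into local memory.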

Data size is thus no longer a problem, and the whole thing is still very fast. Sometimes omega|ml is even faster because the data doesn’t have to be loaded into memory first. As a side effect, working in a team on the same data becomes easier, because you no longer have to laboriously pass CSV files around.

It goes on – omega|ml is also the fastest way to turn a machine learning model into a REST API (takes less than 1 second)

Data retrieval is only a small but essential part of the library, which also makes such dataframes as well as machine learning models very easy to access via a REST API.

I described how it works in an article in Towards Data Science [2].

Open source & available free of charge

omega|ml is available free of charge as open source [3]. The part described above is called MDataFrame (the M stands for either “Massive” or “MongoDB”, which I use as the database).

Footnotes

[1] omega|ml – enable machine learning in production

[2] omega|ml – deploying dat…

[3] omegaml
