
Commit

Made README nicer
hosseinmoein committed Oct 14, 2024
1 parent 476e6c6 commit 8b7180e
Showing 1 changed file with 1 addition and 5 deletions.
6 changes: 1 addition & 5 deletions README.md
@@ -64,17 +64,13 @@ Each program has three identical parts. First it generates and populates 3 colum
The largest dataset I could load into Polars was 300 million rows per column. Anything larger exhausted memory and caused the OS to kill the process. I ran C++ DataFrame with 10 billion rows per column, and I am sure it would have run with bigger datasets too. So, I was forced to run both with 300 million rows to compare.
I ran each test 4 times and took the best time. Polars numbers varied a lot from one run to another, especially calculation and selection times. C++ DataFrame numbers were significantly more consistent.

-| | <B>C++ DataFrame</B> | <B>Polars</B> | <B>Pandas</B> |
+| | [<B>C++ DataFrame</B>](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/dataframe_performance.cc) | [<B>Polars</B>](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/polars_performance.py) | [<B>Pandas</B>](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/pandas_performance.py) |
| :-- | :---: | :--: | :--: |
| Data generation/load time | 26.945900 secs | 28.468640 secs | 36.678976 secs |
| Calculation time | 1.260150 secs | 4.876561 secs | 40.326350 secs |
| Selection time | 0.742493 secs | 3.876561 secs | 8.326350 secs |
| Overall time: | 28.948600 secs | 36.876345 secs | 85.845114 secs |

-[C++ DataFrame source file](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/dataframe_performance.cc) <BR>
-[Polars source file](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/polars_performance.py) <BR>
-[Pandas source file](https://github.com/hosseinmoein/DataFrame/blob/master/benchmarks/pandas_performance.py)

---

[**Please consider sponsoring DataFrame, especially if you are using it in production capacity. It is the strongest form of appreciation**](https://github.com/sponsors/hosseinmoein)
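
The README text in this hunk says each test was run 4 times and the best time was kept. A minimal sketch of that best-of-N approach follows. The helper `best_of` and the stand-in `workload` are hypothetical names for illustration, not taken from the repository's benchmark files. The README does not say how the runs were repeated, so looping in-process is an assumption here.

```python
import time


def best_of(fn, runs=4):
    """Call fn `runs` times and return the smallest wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best


def workload():
    # Hypothetical stand-in for one benchmark phase (e.g. a calculation step).
    sum(i * i for i in range(100_000))


print(f"Best of 4 runs: {best_of(workload):.6f} secs")
```

Taking the minimum rather than the mean filters out runs slowed by unrelated system activity. That matters when timings vary a lot between runs, as the README reports for Polars.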
