Metadata-Version: 2.4
Name: polars
Version: 1.35.2
Summary: Blazingly fast DataFrame library
Author-email: Ritchie Vink <ritchie46@gmail.com>
License: Copyright (c) 2025 Ritchie Vink
Some portions Copyright (c) 2024 NVIDIA CORPORATION & AFFILIATES. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Project-URL: Homepage, https://www.pola.rs/
Project-URL: Documentation, https://docs.pola.rs/api/python/stable/reference/index.html
Project-URL: Repository, https://github.com/pola-rs/polars
Project-URL: Changelog, https://github.com/pola-rs/polars/releases
Keywords: dataframe,arrow,out-of-core
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Rust
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: polars-runtime-32==1.35.2
Provides-Extra: rt64
Requires-Dist: polars-runtime-64==1.35.2; extra == "rt64"
Provides-Extra: rtcompat
Requires-Dist: polars-runtime-compat==1.35.2; extra == "rtcompat"
Provides-Extra: polars-cloud
Requires-Dist: polars_cloud>=0.0.1a1; extra == "polars-cloud"
Provides-Extra: numpy
Requires-Dist: numpy>=1.16.0; extra == "numpy"
Provides-Extra: pandas
Requires-Dist: pandas; extra == "pandas"
Requires-Dist: polars[pyarrow]; extra == "pandas"
Provides-Extra: pyarrow
Requires-Dist: pyarrow>=7.0.0; extra == "pyarrow"
Provides-Extra: pydantic
Requires-Dist: pydantic; extra == "pydantic"
Provides-Extra: calamine
Requires-Dist: fastexcel>=0.9; extra == "calamine"
Provides-Extra: openpyxl
Requires-Dist: openpyxl>=3.0.0; extra == "openpyxl"
Provides-Extra: xlsx2csv
Requires-Dist: xlsx2csv>=0.8.0; extra == "xlsx2csv"
Provides-Extra: xlsxwriter
Requires-Dist: xlsxwriter; extra == "xlsxwriter"
Provides-Extra: excel
Requires-Dist: polars[calamine,openpyxl,xlsx2csv,xlsxwriter]; extra == "excel"
Provides-Extra: adbc
Requires-Dist: adbc-driver-manager[dbapi]; extra == "adbc"
Requires-Dist: adbc-driver-sqlite[dbapi]; extra == "adbc"
Provides-Extra: connectorx
Requires-Dist: connectorx>=0.3.2; extra == "connectorx"
Provides-Extra: sqlalchemy
Requires-Dist: sqlalchemy; extra == "sqlalchemy"
Requires-Dist: polars[pandas]; extra == "sqlalchemy"
Provides-Extra: database
Requires-Dist: polars[adbc,connectorx,sqlalchemy]; extra == "database"
Provides-Extra: fsspec
Requires-Dist: fsspec; extra == "fsspec"
Provides-Extra: deltalake
Requires-Dist: deltalake>=1.0.0; extra == "deltalake"
Provides-Extra: iceberg
Requires-Dist: pyiceberg>=0.7.1; extra == "iceberg"
Provides-Extra: async
Requires-Dist: gevent; extra == "async"
Provides-Extra: cloudpickle
Requires-Dist: cloudpickle; extra == "cloudpickle"
Provides-Extra: graph
Requires-Dist: matplotlib; extra == "graph"
Provides-Extra: plot
Requires-Dist: altair>=5.4.0; extra == "plot"
Provides-Extra: style
Requires-Dist: great-tables>=0.8.0; extra == "style"
Provides-Extra: timezone
Requires-Dist: tzdata; platform_system == "Windows" and extra == "timezone"
Provides-Extra: gpu
Requires-Dist: cudf-polars-cu12; extra == "gpu"
Provides-Extra: all
Requires-Dist: polars[async,cloudpickle,database,deltalake,excel,fsspec,graph,iceberg,numpy,pandas,plot,pyarrow,pydantic,style,timezone]; extra == "all"
Dynamic: license-file

<h1 align="center">
<a href="https://pola.rs">
<img src="https://raw.githubusercontent.com/pola-rs/polars-static/master/banner/polars_github_banner.svg" alt="Polars logo">
</a>
</h1>

<div align="center">
<a href="https://crates.io/crates/polars">
<img src="https://img.shields.io/crates/v/polars.svg" alt="crates.io Latest Release"/>
</a>
<a href="https://pypi.org/project/polars/">
<img src="https://img.shields.io/pypi/v/polars.svg" alt="PyPi Latest Release"/>
</a>
<a href="https://www.npmjs.com/package/nodejs-polars">
<img src="https://img.shields.io/npm/v/nodejs-polars.svg" alt="NPM Latest Release"/>
</a>
<a href="https://community.r-multiverse.org/polars">
<img src="https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fcommunity.r-multiverse.org%2Fapi%2Fpackages%2Fpolars&query=%24.Version&label=r-multiverse" alt="R-multiverse Latest Release"/>
</a>
<a href="https://doi.org/10.5281/zenodo.7697217">
<img src="https://zenodo.org/badge/DOI/10.5281/zenodo.7697217.svg" alt="DOI Latest Release"/>
</a>
</div>

<p align="center">
<b>Documentation</b>:
<a href="https://docs.pola.rs/api/python/stable/reference/index.html">Python</a>
-
<a href="https://docs.rs/polars/latest/polars/">Rust</a>
-
<a href="https://pola-rs.github.io/nodejs-polars/index.html">Node.js</a>
-
<a href="https://pola-rs.github.io/r-polars/index.html">R</a>
|
<b>StackOverflow</b>:
<a href="https://stackoverflow.com/questions/tagged/python-polars">Python</a>
-
<a href="https://stackoverflow.com/questions/tagged/rust-polars">Rust</a>
-
<a href="https://stackoverflow.com/questions/tagged/nodejs-polars">Node.js</a>
-
<a href="https://stackoverflow.com/questions/tagged/r-polars">R</a>
|
<a href="https://docs.pola.rs/">User guide</a>
|
<a href="https://discord.gg/4UfP5cfBE7">Discord</a>
</p>

## Polars: Extremely fast Query Engine for DataFrames, written in Rust

Polars is an analytical query engine written for DataFrames. It is designed to be fast, easy to use
and expressive. Key features are:

- Lazy | Eager execution
- Streaming (larger-than-RAM datasets)
- Query optimization
- Multi-threaded
- Written in Rust
- SIMD
- Powerful expression API
- Front end in Python | Rust | NodeJS | R | SQL
- [Apache Arrow Columnar Format](https://arrow.apache.org/docs/format/Columnar.html)

To learn more, read the [user guide](https://docs.pola.rs/).

## Performance 🚀🚀

### Blazingly fast

Polars is very fast. In fact, it is one of the best-performing solutions available. See the
results of the [PDS-H benchmarks](https://www.pola.rs/benchmarks.html).

### Lightweight

Polars is also very lightweight. It comes with zero required dependencies, and this shows in the
import times:

- polars: 70ms
- numpy: 104ms
- pandas: 520ms
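
The figures above do not say how they were measured. As a rough sketch of one way to get comparable numbers, the snippet below times a fresh interpreter importing a module and subtracts the bare interpreter start-up time. The helper names `time_cold_import` and `import_cost` are hypothetical, and results will vary by machine and installation.

```python
import subprocess
import sys
import time


def time_cold_import(module_name: str) -> float:
    """Wall-clock seconds for a fresh interpreter to start and import the module."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module_name}"], check=True)
    return time.perf_counter() - start


def import_cost(module_name: str) -> float:
    """Approximate import time: cold import minus interpreter start-up."""
    baseline = time_cold_import("sys")  # "sys" is always preloaded, so this is start-up only
    return max(time_cold_import(module_name) - baseline, 0.0)


# Demonstrated with a standard-library module; swap in "polars", "numpy" or
# "pandas" if they are installed in the current environment.
print(f"json: {import_cost('json') * 1000:.0f} ms")
```

A single run is noisy, so repeating the measurement and taking the minimum or median gives steadier numbers.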

### Handles larger-than-RAM data

If you have data that does not fit into memory, Polars' query engine is able to process your query
(or parts of your query) in a streaming fashion. This drastically reduces memory requirements, so
you might be able to process your 250GB dataset on your laptop. Call
`collect(engine='streaming')` on a lazy query to run it in streaming mode.

## Setup

### Python

Install the latest Polars version with:

```sh
pip install polars
```

See the [User Guide](https://docs.pola.rs/user-guide/installation/#feature-flags) for more details
on optional dependencies.

To see the current Polars version and a full list of its optional dependencies, run:

```python
import polars as pl
pl.show_versions()
```
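
The optional dependencies are declared as extras in the package metadata (`Provides-Extra` and `Requires-Dist` fields). As a hedged sketch, the snippet below groups a short excerpt of those `Requires-Dist` lines by extra using only the standard library. The excerpt is a hand-copied subset, and `group_by_extra` is a hypothetical helper rather than part of Polars.

```python
import re
from collections import defaultdict
from email.parser import Parser

# Hand-copied subset of the polars 1.35.2 core metadata (not the full file).
METADATA_EXCERPT = """\
Metadata-Version: 2.4
Name: polars
Version: 1.35.2
Requires-Dist: polars-runtime-32==1.35.2
Requires-Dist: numpy>=1.16.0; extra == "numpy"
Requires-Dist: pandas; extra == "pandas"
Requires-Dist: polars[pyarrow]; extra == "pandas"
Requires-Dist: tzdata; platform_system == "Windows" and extra == "timezone"
"""

EXTRA_RE = re.compile(r'extra\s*==\s*"([^"]+)"')


def group_by_extra(metadata_text: str) -> dict:
    """Map each extra name (None for unconditional) to its requirement strings."""
    headers = Parser().parsestr(metadata_text)
    grouped = defaultdict(list)
    for entry in headers.get_all("Requires-Dist", []):
        requirement, _, marker = entry.partition(";")
        match = EXTRA_RE.search(marker)
        grouped[match.group(1) if match else None].append(requirement.strip())
    return dict(grouped)


extras = group_by_extra(METADATA_EXCERPT)
for name, requirements in extras.items():
    print(name, requirements)
```

The regular expression only reads the `extra == "..."` part of a marker. Other marker conditions, such as `platform_system`, are ignored here, so full marker evaluation needs a proper parser.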

## Contributing

Want to contribute? Read our [contributing guide](https://docs.pola.rs/development/contributing/).

## Managed/Distributed Polars

Do you want a managed solution, or to scale out to distributed clusters? Consider our
[offering](https://cloud.pola.rs/) and help the project!

## Python: compile Polars from source

If you want a bleeding-edge release or maximal performance, you should compile Polars from source.

This can be done by going through the following steps in sequence:

1. Install the latest [Rust compiler](https://www.rust-lang.org/tools/install)
2. Install [maturin](https://maturin.rs/): `pip install maturin`
3. `cd py-polars` and choose one of the following:
   - `make build`, slow binary with debug assertions and symbols, fast compile times
   - `make build-release`, fast binary without debug assertions, minimal debug symbols, long compile
     times
   - `make build-nodebug-release`, same as build-release but without any debug symbols, slightly
     faster to compile
   - `make build-debug-release`, same as build-release but with full debug symbols, slightly slower
     to compile
   - `make build-dist-release`, fastest binary, extreme compile times

By default the binary is compiled with optimizations turned on for a modern CPU. Specify `LTS_CPU=1`
with the command if your CPU is older and does not support e.g. AVX2.

Note that the Rust crate implementing the Python bindings is called `py-polars` to distinguish from
the wrapped Rust crate `polars` itself. However, both the Python package and the Python module are
named `polars`, so you can `pip install polars` and `import polars`.

## Using custom Rust functions in Python

Extending Polars with UDFs compiled in Rust is easy. We expose PyO3 extensions for `DataFrame` and
`Series` data structures. See more in https://github.com/pola-rs/polars/tree/main/pyo3-polars.

## Going big...

Do you expect more than 2^32 (~4.3 billion) rows? Compile Polars with the `bigidx` feature flag or,
for Python users, run `pip install polars[rt64]`.

Don't use this unless you hit the row boundary as the default build of Polars is faster and consumes
less memory.
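
The arithmetic behind that boundary can be sketched in a few lines. This is an illustration under the assumption that the cutoff is exactly 2^32 rows, as the text above states. `DEFAULT_ROW_LIMIT`, `needs_rt64` and `pip_requirement` are hypothetical names, not part of the Polars API.

```python
# Assumption: the default build handles up to 2**32 rows, per the text above.
DEFAULT_ROW_LIMIT = 2**32  # 4_294_967_296


def needs_rt64(expected_rows: int) -> bool:
    """True when the expected row count exceeds the default row boundary."""
    return expected_rows > DEFAULT_ROW_LIMIT


def pip_requirement(expected_rows: int) -> str:
    """Requirement string to install for the given expected row count."""
    return "polars[rt64]" if needs_rt64(expected_rows) else "polars"


print(pip_requirement(1_000_000))      # well under the boundary
print(pip_requirement(5_000_000_000))  # over the boundary
```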

## Legacy

Do you want Polars to run on an old CPU (e.g. dating from before 2011), or on an `x86-64` build of
Python on Apple Silicon under Rosetta? Run `pip install polars[rtcompat]`. This version of
Polars is compiled without [AVX](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) target
features.