ports/misc/py-datasets/pkg-descr

Datasets is a library for easily accessing and sharing datasets for Audio,
Computer Vision, and Natural Language Processing (NLP) tasks.

Load a dataset in a single line of code, then use the built-in data processing
methods to prepare it for training a deep learning model.
Because Datasets is backed by the Apache Arrow format, large datasets can be
processed with zero-copy reads instead of being loaded entirely into memory.
The library also integrates closely with the Hugging Face Hub, making it easy
to load datasets and share them with the wider machine learning community.