Using Surrogates to Improve Datasets with Missing Values
Results for “datasets”
16 tools found
Two large-scale and complementary visio-linguistic datasets (aka Nr3D and Sr3D) for identifying fine-grained 3D objects in ScanNet scenes. Nr3D contains 41.5K natural, free-form utterances, and Sr3D contains 83.5K template-based utterances.
SANAD Dataset is a large collection of Arabic news articles that can be used in different Arabic NLP tasks such as Text Classification and Word Embedding. The articles were collected using Python scripts written specifically for three popular news websites: AlKhaleej, AlArabiya and Akhbarona.
Several image datasets of faces and gestures, annotated with ground truth for benchmarking.
Library and tools for storing, editing, and rendering volumetric datasets. [MPL2]
How to Remove Duplicates in Large Datasets
CLARIN is a European repository for scientific datasets.
Google Dataset Search (beta)
Institute for Health Metrics and Evaluation - a catalog of health and demographic datasets from around the world, including IHME results.
grouplens.org
Nasdaq Data Link: a premier source for financial, economic, and alternative datasets.
Free e-book complemented by an online course
Ducks Unlimited
Enables developers to trace, evaluate, manage prompts and datasets, and debug issues related to an LLM application’s performance. It creates OpenTelemetry-standard traces for any LLM, which helps with observability, and works with any observability client. The free plan offers 50K traces/month.
Python datasets package on PyPI
Resource list: Awesome Public Datasets