Releases: NVIDIA-NeMo/Curator

NVIDIA NeMo Curator 0.8.0

09 May 01:11
cf12d34
  • Llama-based PII Redaction
  • Trafilatura Text Extractor
  • Chinese & Japanese Stopwords for Text Extractors
  • Writing gzip-compressed JSONL datasets
  • Training dataset curation for retriever customization using hard-negative mining
  • Memory-efficient pairwise similarity in Semantic Deduplication
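
To illustrate what "gzip-compressed JSONL" means as an on-disk format, here is a minimal sketch using only the Python standard library. It does not use NeMo Curator's own writer; the function names `write_jsonl_gz` and `read_jsonl_gz` are hypothetical helpers introduced for this example.

```python
import gzip
import json


def write_jsonl_gz(records, path):
    """Write an iterable of dicts as gzip-compressed JSON Lines (one object per line)."""
    # "wt" opens the gzip stream in text mode so we can write str directly.
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl_gz(path):
    """Read a gzip-compressed JSON Lines file back into a list of dicts."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Because each line is an independent JSON object, a reader can stream such a file record by record without loading the whole dataset into memory.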

NVIDIA NeMo Curator 0.8.0rc3.dev0

15 Apr 19:44
cff3cb6
Pre-release

Prerelease: NVIDIA NeMo Curator 0.8.0rc3.dev0 (2025-04-15)

NVIDIA NeMo Curator 0.8.0rc2.dev0

07 Apr 20:15
8cbd68f
Pre-release

Prerelease: NVIDIA NeMo Curator 0.8.0rc2.dev0 (2025-04-07)

NVIDIA NeMo Curator 0.7.1

31 Mar 22:52
d0cc62d
  • Fix Transformers + CUDA Context bug
  • Fix rate limit in SDG Retriever Eval Tutorial

NVIDIA NeMo Curator 0.7.0

12 Mar 21:22
f207c99
  • Python 3.12 Support
  • Curator on Blackwell
  • Nemotron-CC Dataset Recipe
  • Performant S3 for Fuzzy Deduplication

NVIDIA NeMo Curator 0.7.0rc2.dev0

25 Feb 13:12
6a05d29
Pre-release

Prerelease: NVIDIA NeMo Curator 0.7.0rc2.dev0 (2025-02-25)

NVIDIA NeMo Curator 0.7.0rc1.dev1

19 Feb 18:21
c3ebcb5
Pre-release

Prerelease: NVIDIA NeMo Curator 0.7.0rc1.dev1 (2025-02-19)

NVIDIA NeMo Curator 0.7.0rc0.dev1

04 Feb 21:41
7ab04ce
Pre-release

Prerelease: NVIDIA NeMo Curator 0.7.0rc0.dev1 (2025-02-04)

NVIDIA NeMo Curator 0.6.0

07 Jan 15:41
4f25a91

What's changed

  • Synthetic Data Generation for Text Retrieval
    • LLM-based Filters
      • Easiness
      • Answerability
    • Q&A Retrieval Generation Pipeline
  • Parallel Dataset Curation for Machine Translation
    • Load/Write Bitext Files
    • Heuristic filtering (Histogram, Length Ratio)
    • Classifier filtering (Comet, Cometoid)
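
As a rough illustration of the length-ratio heuristic named above, the sketch below drops sentence pairs whose word counts differ too much. It is not NeMo Curator's implementation: the function names, the whitespace tokenization, and the `max_ratio=3.0` threshold are all assumptions made for this example.

```python
def length_ratio(src, tgt):
    """Ratio of the longer side to the shorter side, measured in whitespace tokens."""
    src_len = len(src.split())
    tgt_len = len(tgt.split())
    if min(src_len, tgt_len) == 0:
        # An empty side can never be a valid translation; treat it as infinitely skewed.
        return float("inf")
    return max(src_len, tgt_len) / min(src_len, tgt_len)


def filter_by_length_ratio(pairs, max_ratio=3.0):
    """Keep only (source, target) pairs whose length ratio is at most max_ratio."""
    return [(s, t) for s, t in pairs if length_ratio(s, t) <= max_ratio]
```

Whitespace splitting is a poor length measure for languages written without spaces, so a practical filter would likely count characters or use a language-aware tokenizer instead.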

NVIDIA NeMo Curator 0.6.0rc2.dev1

03 Jan 21:38
5a3c047
Pre-release

Prerelease: NVIDIA NeMo Curator 0.6.0rc2.dev1 (2025-01-03)