chonkie-inc/chonkiejs

🦛 chonkiejs ✨

🦛 CHONK your texts in TypeScript with Chonkie! ✨ The no-nonsense, lightweight, and efficient chunking library.

Installation • Usage • Chunkers • Acknowledgements • Citation

We built chonkiejs while developing a TypeScript web app that needed fast, on-the-fly text chunking for RAG applications. After trying several existing libraries, we found them either too heavy or not flexible enough for our needs. chonkiejs is a port of the original chonkie library, with added type safety and a few extra features that make it more useful for TypeScript developers!

🚀 Feature-rich: All the CHONKs you'd ever need
✨ Easy to use: Install, Import, CHONK
⚡ Fast: CHONK at the max speed of TypeScript! tssssooooooom
🪶 Light-weight: No bloat, just CHONK
🦛 Cute CHONK mascot: psst it's a pygmy hippo btw
❤️ Moto Moto's favorite TypeScript library

Chonkie is a chunking library that "just works" ✨

Note

This library is a TypeScript port of the original chonkie library, which is written in Python. It is not a binding. It is still under active development and has not yet reached feature parity with the original chonkie library. Please bear with us! 🫂

📦 Installation

npm install @chonkiejs/core

📚 Usage

import { RecursiveChunker } from '@chonkiejs/core';

// Create a chunker
const chunker = await RecursiveChunker.create({
  chunkSize: 512
});

// Chunk your text
const chunks = await chunker.chunk('Your text here...');

// Use the chunks
for (const chunk of chunks) {
  console.log(chunk.text);
  console.log(`Tokens: ${chunk.tokenCount}`);
}
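
For a rough intuition of what a recursive chunker does with `chunkSize`, here is a minimal, self-contained sketch. It is not the chonkiejs implementation: the names `recursiveSplit` and `sketchChunk`, the separator list, and the one-character-per-token counting are all assumptions made for illustration.

```typescript
// Illustrative sketch only: NOT the chonkiejs implementation.
// Assumption: one character (UTF-16 code unit) counts as one token.
interface SketchChunk {
  text: string;
  tokenCount: number;
}

// Hypothetical separator hierarchy: paragraphs, lines, sentences, words.
const SEPARATORS: string[] = ["\n\n", "\n", ". ", " "];

function recursiveSplit(text: string, chunkSize: number, level = 0): string[] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  // Small enough already: keep the text as a single chunk.
  if (text.length <= chunkSize) {
    return text.length > 0 ? [text] : [];
  }
  const sep = SEPARATORS[level];
  if (sep === undefined) {
    // No separators left: fall back to hard cuts of chunkSize characters.
    const pieces: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      pieces.push(text.slice(i, i + chunkSize));
    }
    return pieces;
  }
  const parts = text.split(sep);
  const out: string[] = [];
  let buffer = "";
  parts.forEach((part, i) => {
    // Re-attach the separator so that joining all chunks reproduces the input.
    const piece = i < parts.length - 1 ? part + sep : part;
    if (buffer.length + piece.length <= chunkSize) {
      // Greedily pack pieces into the current chunk while they fit.
      buffer += piece;
    } else {
      if (buffer.length > 0) {
        out.push(buffer);
        buffer = "";
      }
      if (piece.length <= chunkSize) {
        buffer = piece;
      } else {
        // Still too large: retry with the next, finer separator.
        out.push(...recursiveSplit(piece, chunkSize, level + 1));
      }
    }
  });
  if (buffer.length > 0) out.push(buffer);
  return out;
}

function sketchChunk(text: string, chunkSize: number): SketchChunk[] {
  return recursiveSplit(text, chunkSize).map((t) => ({
    text: t,
    tokenCount: t.length,
  }));
}
```

Calling `sketchChunk(someText, 512)` returns pieces that are each at most 512 characters long and that concatenate back to the original text. The sketch makes no attempt to merge tiny leftover pieces, which a production chunker would likely handle.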

📦 Packages

| Package | Description | Dependencies |
| --- | --- | --- |
| @chonkiejs/core | Local chunking (Recursive, Token) with character-based tokenization | Zero |
| @chonkiejs/cloud | Cloud-based chunkers (Semantic, Neural, Code, etc.) via api.chonkie.ai | @chonkiejs/core |
| @chonkiejs/token | HuggingFace tokenizer support for core chunkers | @huggingface/transformers |
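
The "character-based tokenization" listed for `@chonkiejs/core` can be pictured as treating each character as one token. The sketch below illustrates fixed-size token chunking under that assumption. The function `charTokenChunks`, its `overlap` parameter, and the `CharChunk` shape are hypothetical names for this example, not part of the chonkiejs API.

```typescript
// Illustrative sketch only: NOT the chonkiejs API.
// Assumption: one "token" is one UTF-16 code unit of the input string.
interface CharChunk {
  text: string;
  startIndex: number; // inclusive
  endIndex: number; // exclusive
  tokenCount: number;
}

function charTokenChunks(text: string, chunkSize: number, overlap = 0): CharChunk[] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be an integer in [0, chunkSize)");
  }
  // Each new chunk starts `step` characters after the previous one,
  // so consecutive chunks share `overlap` characters.
  const step = chunkSize - overlap;
  const chunks: CharChunk[] = [];
  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push({
      text: text.slice(start, end),
      startIndex: start,
      endIndex: end,
      tokenCount: end - start,
    });
    // Stop once a chunk reaches the end, to avoid a redundant trailing chunk.
    if (end === text.length) break;
  }
  return chunks;
}

// "abcdefghij" with chunkSize 4 and overlap 1 yields "abcd", "defg", "ghij".
console.log(charTokenChunks("abcdefghij", 4, 1).map((c) => c.text));
```

With a real tokenizer, such as the HuggingFace support described for `@chonkiejs/token`, token counts would generally differ from character counts, so sizes chosen for one scheme may not carry over to the other.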

Contributing

Want to help grow Chonkie? Check out CONTRIBUTING.md to get started! Whether you're fixing bugs, adding features, improving docs, or simply leaving a ⭐️ on the repo, every contribution helps make Chonkie a better CHONK for everyone.

Remember: No contribution is too small for this tiny hippo!

Acknowledgements

Chonkie would like to CHONK its way through a special thanks to all the users and contributors who have helped make this library what it is today! Your feedback, issue reports, and improvements have helped make Chonkie the CHONKIEST it can be.

And of course, special thanks to Moto Moto for endorsing Chonkie with his famous quote:

"I like them big, I like them chonkie in TypeScript" ~ Moto Moto... definitely did not say this

Citation

If you use Chonkie in your research, please cite it as follows:

@software{chonkie2025,
  author = {Bhavnick Minhas and Shreyash Nigam},
  title = {Chonkie: A no-nonsense fast, lightweight, and efficient text chunking library},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/chonkie-inc/chonkie}},
}