
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

Python 30,405 2,084 Updated Jul 8, 2025

ETL, Analytics, Versioning for Unstructured Data

Python 2,609 116 Updated Jul 18, 2025

Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.

Jupyter Notebook 31,949 6,778 Updated Jul 17, 2025

Terraform module to create VPC resource on AWS.

HCL 37 21 Updated Jun 26, 2025

The Data Engineering Cookbook

Python 14,397 2,618 Updated Jun 11, 2025

A curated list of engineering blogs

Ruby 34,875 1,832 Updated Aug 21, 2024

A curated list of data engineering tools for software developers

7,578 1,343 Updated Jul 7, 2025

Convert PDF to markdown + JSON quickly with high accuracy

Python 26,620 1,738 Updated Jul 17, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 52,632 8,793 Updated Jul 19, 2025

🔥Highlighting the top ML papers every week.

11,637 708 Updated Jun 8, 2025

LLM UI with advanced features, easy setup, and multiple backend support.

Python 44,386 5,714 Updated Jul 15, 2025

PyTorch native post-training library

Python 5,352 654 Updated Jul 18, 2025

PostgreSQL Index Advisor

PLpgSQL 1,679 15 Updated Apr 14, 2024

Let's Study.

650 76 Updated Aug 27, 2024

Supercharge Your LLM Application Evaluations 🚀

Python 9,976 984 Updated Jul 17, 2025

ARX is a comprehensive open source data anonymization tool aiming to provide scalability and usability. It supports various anonymization techniques, methods for analyzing data quality and re-ident…

Java 666 223 Updated Jan 3, 2025

This is a repo with links to everything you'd ever want to learn about data engineering

Jupyter Notebook 35,670 6,857 Updated Jul 17, 2025

Examples and guides for using the OpenAI API

MDX 65,428 10,850 Updated Jul 18, 2025

PyGWalker: Turn your dataframe into an interactive UI for visual analysis

Python 15,023 800 Updated Jul 3, 2025

A curated list of awesome pipeline toolkits inspired by Awesome Sysadmin

6,412 639 Updated Mar 10, 2025

📄 CLI that generates beautiful README.md files

JavaScript 11,050 1,376 Updated Sep 20, 2022

Roadmap to becoming a data engineer in 2021

12,667 1,351 Updated Jan 25, 2022

✏️ Technical interview study Cheat Sheet

222 18 Updated Nov 24, 2023

Kaggle-Knowhow (Korean Ver): a collection of Kaggle resources for Korean speakers

378 112 Updated Nov 7, 2019

A summary of frequently asked developer interview topics

926 56 Updated Oct 9, 2023

Example files for "40 Algorithms Every Programmer Should Know".

Jupyter Notebook 5 2 Updated Mar 7, 2022

Fundamentals of Spark with Python (using PySpark), code examples

Jupyter Notebook 350 273 Updated Oct 29, 2022

[Hanbit Media] Full source code repository for "이것이 취업을 위한 코딩 테스트다 with 파이썬" ("This Is the Coding Test for Getting a Job, with Python").

Python 2,378 843 Updated May 8, 2022