gkiril
  • NEC Laboratories Europe
  • Heidelberg, Germany
  • X @kgashteo


A library for making RepE control vectors

Jupyter Notebook 651 49 Updated Sep 24, 2025

The official repo for the Dialz Python library - a toolkit for steering vector research.

Jupyter Notebook 18 1 Updated Jul 9, 2025

Delphi was the home of a temple to Phoebus Apollo, which famously had the inscription, 'Know Thyself.' This library lets language models know themselves through automated interpretability.

Python 219 50 Updated Oct 20, 2025

[ICLR 2025] This is the official implementation for the paper: "Large Language Models Meet Symbolic Provers for Logical Reasoning Evaluation"

Python 34 4 Updated Jun 11, 2025

Large language model and dataset for natural language to first-order logic translation

Jupyter Notebook 66 6 Updated Oct 25, 2023

Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...

2,139 175 Updated Apr 30, 2025

This repo is for the bachelor's project VU - AI.

Python 1 Updated Jun 30, 2025

ACL2023 - AlignScore, a metric for factual consistency evaluation.

Python 138 27 Updated Mar 11, 2024
Jupyter Notebook 5 1 Updated Apr 14, 2025

Prefix-Tuning: Optimizing Continuous Prompts for Generation

Python 949 162 Updated Apr 26, 2024

implementation of EMU for KG link prediction

Python 2 Updated May 5, 2025

Repository for "Scaling Evaluation-time Compute with Reasoning Models as Process Evaluators"

12 Updated Mar 25, 2025

This is the repository of our EMNLP 2024 Main conference paper "Revealing Personality Traits: A New Benchmark Dataset for Explainable Personality Recognition on Dialogues".

Python 7 3 Updated Dec 5, 2024

Official repository for ACL 2025 paper "ProcessBench: Identifying Process Errors in Mathematical Reasoning"

Python 174 14 Updated May 20, 2025

Apache Wayang (incubating) is the first cross-platform data processing system.

Java 234 104 Updated Oct 17, 2025

Simple language-driven navigation tasks for studying compositional learning

198 27 Updated Nov 5, 2020

From Chain-of-Thought prompting to OpenAI o1 and DeepSeek-R1 🍓

3,387 200 Updated May 7, 2025

Codebase for "DAVE: Diagnostic benchmark for Audio Visual Evaluation" (NeurIPS 2025 Datasets & Benchmarks)

Python 4 1 Updated May 15, 2025
Jupyter Notebook 11 1 Updated Jun 26, 2025

Systematic evaluation framework that automatically rates overthinking behavior in large language models.

Shell 93 13 Updated May 16, 2025

Toolkit for linearizing PDFs for LLM datasets/training

Python 14,407 1,081 Updated Oct 20, 2025

Parsers for clinical trials data from clinicaltrials.gov

Python 7 Updated Jun 28, 2023

Fully open reproduction of DeepSeek-R1

Python 25,561 2,396 Updated Sep 8, 2025

A blueprint for AI development, focusing on applied examples of RAG, information extraction, analysis and fine-tuning in the age of LLMs and agents.

Jupyter Notebook 61 5 Updated Feb 6, 2025

A course on aligning smol models.

Jupyter Notebook 6,458 2,289 Updated Oct 1, 2025

Benchmarking Benchmark Leakage in Large Language Models

JavaScript 55 3 Updated May 20, 2024
Python 58 11 Updated May 10, 2021