- Israel
- @ddofer
Stars
ddofer / InterFeat
Forked from LinialLab/InterFeat
Automatically uncover Interesting (novel, plausible, and useful) features in biomedical data by combining statistical filtering, literature mining, knowledge graphs and LLMs on UK BioBank tabular f…
Open-source offline translation library written in Python
Awesome RSS feeds - A curated list of RSS feeds (and OPML files) used in the Recommended Feeds and local news sections of Plenary, an RSS reader, article downloader and podcast player app for Android
Vocabulary Trimming (VT) is a model compression technique, which reduces a multilingual LM vocabulary to a target language by deleting irrelevant tokens from its vocabulary. This repository contain…
Tsururu is a Python-based library that provides a wide range of multi-series and multi-point-ahead prediction strategies, compatible with any underlying model, including neural networks.
Utility for behavioral and representational analyses of Language Models
Official Repository of Toward Reliable Scientific Hypothesis Generation: Evaluating Truthfulness and Hallucination in Large Language Models (IJCAI 2025)
Evaluation of multiple genomic language models for clinical VEP
A lightweight, low-dependency, unified API to use all common reranking and cross-encoder models.
Visualize and compare datasets, target values and associations, with one line of code.
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
A world wines dataset with user ratings for recommendation systems and general use.
Pysam is a Python package for reading, manipulating, and writing genomics data such as SAM/BAM/CRAM and VCF/BCF files. It's a lightweight wrapper of the HTSlib API, the same one that powers samtool…
Library to extract embeddings for DNA sequences using BioFM genomics foundation model
Implementation of the BatchTopK activation function for training sparse autoencoders (SAEs)
A fast, parallelized, memory efficient, and cache-optimized Python implementation of node2vec
Ekstra Bladet Recommender System repository for benchmarking the EBNeRD dataset.
A question answering dataset in Modern Hebrew, containing 30,147 questions.
Supporting code for cGen: Contrastive pre-training for sequence based genomics models
pathoscore evaluates variant pathogenicity tools and scores.
This API provides programmatic access to the AlphaGenome model developed by Google DeepMind.
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.