👓
brrrrr
math, post-training LLMs, RL, (practical) mech interp, and AI safety | other interests: distributed systems and databases
- Find me in the attic
Pinned
- MedARC-AI/med-lm-envs (Public): Automated LLM evaluation suite for medical tasks
- Information-Theory-Group/Adaptive-Sampling-Networks (Public, Python): Learned Logit Transforms for Adaptive Sampling
- mt-bench-101 (Public, Python), forked from mtbench101/mt-bench-101: [ACL 2024] MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
