Code for Balancing the Flow: An Information-Theoretic Study of RLHF-Induced Uniformity in Language Model Outputs
All code is available in the Scripts folder. As an alternative to running the scripts, a few Python Jupyter notebooks are provided for exploring our results. Automatically generated graphs of our results are also available in UID_Analysis and UID_Comparison, organized by model and by model-comparison pair.
For base models, we generate article completions by providing only the title and first sentence of each article. Each model uses a different sampling configuration, based on the default values recommended by its respective team:
Llama 2 7B:
temperature=0.9
top_p=0.6
Mistral 7B:
temperature=0.7
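If these model-specific defaults are not already applied inside prompter.py, they can be passed explicitly through the -t/--temperature and -p/--top-p flags documented under "Other prompter flags" below; for example (illustrative command using the Llama 2 7B values):
uv run prompter.py llama-2-7b@q8_0 -g 2000 -t 0.9 -p 0.6 -v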
Our environment and project are managed with the Astral uv project manager, with all dependencies declared in pyproject.toml.
You can install uv through
pip install uv
followed by
cd [scripts directory]
uv sync
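uv sync reads the standard [project] table in pyproject.toml. A minimal sketch of what that file looks like is below; the package name and dependency list are placeholders, not the repository's actual contents:
[project]
name = "uid-rlhf-analysis"   # placeholder project name
version = "0.1.0"
requires-python = ">=3.10"
dependencies = [
    # placeholders; see the repository's pyproject.toml for the real dependency list
    "torch",
    "transformers",
    "matplotlib",
]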
to install all dependencies. We used the following command to run our experiments:
uv run prompter.py llama-2-7b@q8_0 -g 2000 -v
If any dependencies are missing, please let us know! You can also run uv add [package] to update the dependencies and open a PR.
Each batch of 2000 generations took approximately 4 hours.
Calculating Surprisal (--analyze-only computes surprisals without generating new text):
uv run prompter.py gpt2 --analyze-human -v
uv run prompter.py llama-2-7b-32k-instruct --analyze-only -g 300 -v
uv run prompter.py llama-2-7b@q8_0 --analyze-only -g 300 -v
uv run prompter.py mistral-7b-instruct-v0.3 --analyze-only -g 300 -v
uv run prompter.py mistral-7b-v0.1 --analyze-only -g 300 -v
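For reference, per-token surprisal is the negative log-probability a language model assigns to each token given its preceding context. The sketch below computes it with GPT-2 through Hugging Face transformers; it illustrates the idea only and is not the exact implementation in prompter.py.
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# Illustrative scoring model; prompter.py may use a different backend.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def token_surprisals(text):
    """Return (token, surprisal in bits) for every token after the first."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits             # (1, seq_len, vocab)
    log_probs = torch.log_softmax(logits, dim=-1)
    ids = enc["input_ids"][0]
    out = []
    for i in range(1, len(ids)):
        lp = log_probs[0, i - 1, ids[i]].item()  # log P(token_i | tokens_<i)
        out.append((tokenizer.decode(ids[i]), -lp / math.log(2)))
    return out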
To generate from scratch and compute surprisals:
uv run prompter.py gpt2 --analyze-human -v
uv run prompter.py llama-2-7b-32k-instruct -g 300 -s "Provided only the following article title and first sentence, complete the rest of the article from this moment onwards:" -v
uv run prompter.py llama-2-7b@q8_0 --analyze-only -g 300 -v
uv run prompter.py mistral-7b-instruct-v0.3 -g 300 -t 0.7 -p 0.95 -s "Provided only the following article title and first sentence, complete the rest of the article from this moment onwards:" -v
uv run prompter.py mistral-7b-v0.1 --analyze-only -g 300 -v
Other prompter flags:
parser.add_argument('model', type=str, help='Model name to use for generation')
parser.add_argument('-g', '--generate', type=int, default=300, help='Number of examples to generate')
parser.add_argument('-t', '--temperature', type=float, default=0.9, help='Temperature for generation')
parser.add_argument('-p', '--top-p', type=float, default=1.0, help='Top-p (nucleus sampling) parameter')
parser.add_argument('-s', '--system-prompt', type=str, help='System prompt to prepend to each generation')
parser.add_argument('-r', '--regenerate', action='store_true', help='Regenerate existing outputs')
parser.add_argument('-v', '--verbose', action='store_true', help='Print verbose information')
parser.add_argument('--max-tokens', type=int, default=2048, help='Maximum tokens for generation')
parser.add_argument('--max-retries', type=int, default=3, help='Maximum retries for failed generations')
parser.add_argument('--analyze-human', action='store_true', help='Analyze human texts instead of generating new ones')
parser.add_argument('--human-dir', type=str, default='../Sources', help='Directory containing human texts to analyze')
parser.add_argument('--analyze-only', action='store_true', help='Only analyze surprisals without generating new texts')
UID Analysis
uv run analyze_uid.py --input-dir "../Surprisals/human_texts" --output-dir "../UID_Analysis/human_texts"
uv run analyze_uid.py --input-dir "../Surprisals/llama-2-7b-32k-instruct" --output-dir "../UID_Analysis/llama-2-7b-32k-instruct"
uv run analyze_uid.py --input-dir "../Surprisals/llama-2-7b@q8_0" --output-dir "../UID_Analysis/llama-2-7b@q8_0"
uv run analyze_uid.py --input-dir "../Surprisals/mistral-7b-instruct-v0.3" --output-dir "../UID_Analysis/mistral-7b-instruct-v0.3"
uv run analyze_uid.py --input-dir "../Surprisals/mistral-7b-v0.1" --output-dir "../UID_Analysis/mistral-7b-v0.1"
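analyze_uid.py computes UID statistics over the saved surprisal files; the exact metrics are defined in that script. As a rough illustration, two standard UID measures, global surprisal variance and the mean squared difference between consecutive surprisals, can be computed as sketched below. The function names and the assumption that each file holds a flat list of surprisal values are illustrative, not the script's actual interface.
import json
from pathlib import Path

def uid_variance(surprisals):
    """Global UID measure: variance of per-token surprisal around the text mean."""
    mean = sum(surprisals) / len(surprisals)
    return sum((s - mean) ** 2 for s in surprisals) / len(surprisals)

def uid_local(surprisals):
    """Local UID measure: mean squared difference between consecutive surprisals."""
    diffs = [(surprisals[i] - surprisals[i - 1]) ** 2 for i in range(1, len(surprisals))]
    return sum(diffs) / len(diffs)

# Illustrative loading loop; the on-disk format written by prompter.py may differ.
for path in Path("../Surprisals/human_texts").glob("*.json"):
    surprisals = json.loads(path.read_text())  # assumed: a flat list of floats
    print(path.name, uid_variance(surprisals), uid_local(surprisals))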
All models
uv run compare_uid.py --directories "../UID_Analysis/human_texts" "../UID_Analysis/llama-2-7b-32k-instruct" "../UID_Analysis/llama-2-7b@q8_0" "../UID_Analysis/mistral-7b-instruct-v0.3" "../UID_Analysis/mistral-7b-v0.1" --output-dir "../UID_Comparison/all_models"
Human vs all
uv run compare_uid.py --directories "../UID_Analysis/human_texts" "../UID_Analysis/llama-2-7b-32k-instruct" "../UID_Analysis/llama-2-7b@q8_0" "../UID_Analysis/mistral-7b-instruct-v0.3" "../UID_Analysis/mistral-7b-v0.1" --output-dir "../UID_Comparison/human_vs_all"
Llama models
uv run compare_uid.py --directories "../UID_Analysis/llama-2-7b-32k-instruct" "../UID_Analysis/llama-2-7b@q8_0" --output-dir "../UID_Comparison/llama_models"
Mistral models
uv run compare_uid.py --directories "../UID_Analysis/mistral-7b-instruct-v0.3" "../UID_Analysis/mistral-7b-v0.1" --output-dir "../UID_Comparison/mistral_models"
Human vs each
uv run compare_uid.py --directories "../UID_Analysis/human_texts" "../UID_Analysis/llama-2-7b-32k-instruct" --output-dir "../UID_Comparison/human_vs_llama_32k"
uv run compare_uid.py --directories "../UID_Analysis/human_texts" "../UID_Analysis/llama-2-7b@q8_0" --output-dir "../UID_Comparison/human_vs_llama_q8"
uv run compare_uid.py --directories "../UID_Analysis/human_texts" "../UID_Analysis/mistral-7b-instruct-v0.3" --output-dir "../UID_Comparison/human_vs_mistral_instruct"
uv run compare_uid.py --directories "../UID_Analysis/human_texts" "../UID_Analysis/mistral-7b-v0.1" --output-dir "../UID_Comparison/human_vs_mistral_v0.1"
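compare_uid.py aggregates the per-text UID metrics from each directory and writes comparison plots and statistics to the output directory. As a rough illustration of the underlying comparison, the sketch below contrasts two sets of scores by their means and a Mann-Whitney U test; the file layout and the metric key are assumptions, not the script's actual interface.
import json
from pathlib import Path
from scipy.stats import mannwhitneyu

def load_scores(directory, key="uid_variance"):
    """Collect one UID metric per analyzed text; the JSON layout is assumed."""
    return [json.loads(p.read_text())[key] for p in Path(directory).glob("*.json")]

human = load_scores("../UID_Analysis/human_texts")
model = load_scores("../UID_Analysis/mistral-7b-instruct-v0.3")
stat, p = mannwhitneyu(human, model)
print(f"human mean={sum(human)/len(human):.3f}, "
      f"model mean={sum(model)/len(model):.3f}, p={p:.4g}")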