uber/Moirai-SOSP25-code

Moirai

Prerequisites

Obtain a Gurobi license for gurobipy (the commercial trial license works for 30 days).

Install the required dependencies:

pip3 install pandas gurobipy numpy matplotlib

Usage

Placing traces

First, download and decompress the traces from Moirai-SOSP25-logs.

Second, organize the traces into the following folders:

  1. Table size files (report-table-size-*.csv): place under Moirai/
  2. Workload files (from 2024/10 onward, report-abFP-volume-table-*.csv): place under Moirai/newTraces/
  3. Job traces (%Y%m%d-Presto.csv or %Y%m%d-Spark.csv): place under Moirai/jobTraces/
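The folder layout above can be applied with a small helper script. The following is only a sketch, not part of the repository: the function name organize_traces, the RULES table, and the idea of moving files out of a single download directory are assumptions, and the regular expressions are inferred from the filename patterns listed above (with %Y%m%d read as eight digits).

```python
import re
import shutil
from pathlib import Path

# Destination sub-folder (relative to the Moirai/ checkout) for each trace type.
# Patterns follow the list above; what the "*" parts contain is an assumption.
RULES = [
    (re.compile(r"^report-table-size-.*\.csv$"), "."),
    (re.compile(r"^report-abFP-volume-table-.*\.csv$"), "newTraces"),
    (re.compile(r"^\d{8}-(Presto|Spark)\.csv$"), "jobTraces"),
]


def organize_traces(src_dir, moirai_dir):
    """Move recognized trace files from src_dir into the Moirai layout.

    Returns a list of (filename, sub-folder) pairs for the files that were
    moved. Files that match no pattern are left where they are.
    """
    src, root = Path(src_dir), Path(moirai_dir)
    moved = []
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        for pattern, subdir in RULES:
            if pattern.match(path.name):
                dest = root / subdir
                dest.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), str(dest / path.name))
                moved.append((path.name, subdir))
                break
    return moved
```

A call such as organize_traces("downloads/", "Moirai/") would then leave unrelated files in downloads/ untouched; both paths here are placeholders.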

Running Optimization

Run the following command, then check that the outputs appear under sample_0.050:

python3 tests.py --test=samplek --c=30 --k=0.05 --num_week=2 --rep_rate=0.002 --Spark

Running Scheduler

Job traces in jobTraces/ contain per-job information rather than aggregated optimizer traces. Run the scheduler after optimization:

python3 scheduler.py --c=30 --num_week=2 --opt_path="sample_0.050"

Note: This process takes ~30 minutes per week of job traces. Since the example runs for two weeks, expect ~1 hour.

Pass --simple to run the scheduler without simulating the per-minute traffic rate. This saves time if you do not need the traffic rate.

Complete experiments

Note: Re-running these commands will not overwrite existing results.

python3 tests.py --test=samplek --k=1 --num_week=13 --rep_rate=0.002 --Spark --c=30
python3 scheduler.py --num_week=13 --opt_path="sample_1.000" --c=30

python3 tests.py --test=long_term --Spark
python3 tests.py --test=reorg_unaware --Spark

Other useful flags (see more in --help):

  • tests.py
    • --view: displays parameters without running the optimization.
    • --opt_start_date: specifies the start date for optimization (default: 2024-10-22).
  • scheduler.py
    • --debug: runs a smaller subset of traces for debugging.

Traces

The cputime column in Spark traces (both those under jobTraces/ and those under newTraces/) already holds the total CPU time, in seconds, for the whole job. Do not sum cputime values across rows of the same job to get that job's total CPU time; doing so overcounts.
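To illustrate the pitfall, here is a minimal sketch that counts cputime once per job. It assumes a job can appear in more than one row; the job_id and table column names and the sample rows are made up for the example and are not taken from the real traces. Only the cputime column name comes from the text above.

```python
import csv
import io


def total_cpu_seconds(rows, job_key="job_id"):
    """Sum cputime once per job, ignoring repeated rows for the same job."""
    per_job = {}
    for row in rows:
        # cputime is already the job-level total, so keep one value per job.
        per_job.setdefault(row[job_key], float(row["cputime"]))
    return sum(per_job.values())


# Made-up sample: job "a" appears twice, each row carrying the same job total.
SAMPLE = "job_id,table,cputime\na,t1,120\na,t2,120\nb,t1,30\n"
rows = list(csv.DictReader(io.StringIO(SAMPLE)))

naive = sum(float(r["cputime"]) for r in rows)  # counts job "a" twice
print(naive, total_cpu_seconds(rows))  # prints: 270.0 150.0
```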

Other notes

  • The Yugong-related code implements our baseline, Yugong (VLDB 2019):
python3 tests.py --test=yugong --num_week=13 --rep_rate=0.004 --c=30
python3 scheduler.py --yugong --num_week=13 --opt_path="yugong_results" --c=30

Other baselines:

  • Without pre-selecting replication, can we achieve enough speedup with sampling? Try k=0.001, 0.01, 0.05, 0.1:
python3 tests.py --test=samplek --k=0.001 --num_week=13 --rep_rate=0 --Spark --c=30
python3 tests.py --test=samplek --k=0.01 --num_week=13 --rep_rate=0 --Spark --c=30
python3 tests.py --test=samplek --k=0.05 --num_week=13 --rep_rate=0 --Spark --c=30
python3 tests.py --test=samplek --k=0.1 --num_week=13 --rep_rate=0 --Spark --c=30
python3 scheduler.py --num_week=13 --opt_path="sample_0.050" --c=30
  • How do other scheduling policies perform?
python3 scheduler.py --num_week=13 --opt_path="sample_1.000_rep0.002" --policy="size-aware" --c=30
python3 scheduler.py --num_week=13 --opt_path="sample_1.000_rep0.002" --policy="size-unaware" --simple --c=30
  • How do other replication strategies perform?
python3 tests.py --test=samplek --k=1 --num_week=1 --rep_rate=0.002 --Spark --c=50 --rep_strategy="read_traffic_density"
  • Baselines
python3 baselines.py --baseline="rep_x_month" --rep_rate=0.21 --c=30 # Rep 3 month
python3 baselines.py --baseline="rep_rtd" --rep_rate=0 --c=30 # No rep
  • Customized test
python3 tests.py --test=samplek --k=1 --num_week=1 --rep_rate=0.001 --c=10 --opt_start_date="2025-03-04" --table_size_file="report-table-size-20250310.csv"
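The sampling sweep listed above (k = 0.001, 0.01, 0.05, 0.1) repeats the same two commands with a different k, so it can be scripted. The sketch below only builds the argument lists; samplek_commands is a hypothetical helper, and the sample_{k:.3f} folder name is inferred from the sample_0.050 and sample_1.000 examples in this README rather than from the code (other paths shown above, such as sample_1.000_rep0.002, do not follow it). The README runs the scheduler for sample_0.050 only; generating a scheduler command for every k is a choice made here.

```python
import shlex


def samplek_commands(k, num_week=13, rep_rate=0, c=30):
    """Return (optimizer argv, scheduler argv) for one sampling rate k."""
    opt = [
        "python3", "tests.py", "--test=samplek",
        f"--k={k:g}", f"--num_week={num_week}",
        f"--rep_rate={rep_rate:g}", "--Spark", f"--c={c}",
    ]
    # Assumed naming: the optimizer writes to sample_<k with 3 decimals>.
    sched = [
        "python3", "scheduler.py", f"--num_week={num_week}",
        f"--opt_path=sample_{k:.3f}", f"--c={c}",
    ]
    return opt, sched


for k in (0.001, 0.01, 0.05, 0.1):
    opt, sched = samplek_commands(k)
    print(shlex.join(opt))
    print(shlex.join(sched))
```

Each list can be passed to subprocess.run(cmd, check=True) from the repository root to actually run the step.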
