Description
TL;DR: We should work to remove dependencies on system utilities where possible (e.g. Perl) and allow Cylc to pick up the remaining dependencies (e.g. Bash) from the Cylc environment.
System Dependencies
Before Cylc 8, cylc-flow was not packaged and relied on a number of dependencies being provided at the system level. This made Cylc difficult to install and resulted in a number of issues.
Cylc has been packaged since 8.0a0, and access to modern libraries has allowed us to remove many of the system dependencies. However, cylc-flow still has a handful of system dependencies left, including:
- Bash (acceptable as long as Bash is the default Cylc shell; however, we should have control over the version range)
  cylc-flow/cylc/flow/job_file.py
  Line 120 in 442e505
  handle.write("#!/bin/bash -l\n")
- Perl (not justifiable)
  r"exec perl -e 'setpgrp(0,0);exec(@ARGV)'" +
- GNU Coreutils
  " timeout --signal=XCPU %d" % execution_time_limit)
  cylc-flow/cylc/flow/job_file.py
  Line 85 in 442e505
  ['/usr/bin/env', 'bash', '-n', tmp_name],
- GNU Sed
  cylc-flow/cylc/flow/etc/job.sh
  Lines 101 to 102 in 442e505
  CYLC_WORKFLOW_HOST="$(sed -n 's/^CYLC_WORKFLOW_HOST=//p' "${contact}")"
  CYLC_WORKFLOW_OWNER="$(sed -n 's/^CYLC_WORKFLOW_OWNER=//p' "${contact}")"
- GNU ps (Cylc Flow ps command syntax not compatible with Alpine Linux busybox #4416)
  cylc-flow/cylc/flow/workflow_files.py
  Line 378 in 4aee200
  cmd = ["timeout", "10", "ps", PS_OPTS, str(old_pid_str)]
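To illustrate how some of these could go, here is a minimal sketch of doing the jobs of the `timeout` binary and the Perl `setpgrp` one-liner from Python's standard library, for commands that cylc-flow launches from Python. The function names `run_with_timeout` and `spawn_in_new_group` are hypothetical and not part of cylc-flow; the sketch assumes a POSIX host and does not cover commands run from inside the Bash job script.

```python
import subprocess
import sys


def run_with_timeout(cmd, seconds):
    """Run cmd and return its stdout, or None if it exceeds `seconds`.

    subprocess enforces the limit itself, so no external `timeout`
    binary is needed.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=seconds
        )
    except subprocess.TimeoutExpired:
        return None
    return proc.stdout


def spawn_in_new_group(cmd):
    """Start cmd as the leader of a new session and process group.

    Assumption: start_new_session=True calls setsid(), which is slightly
    stronger than the setpgrp(0, 0) call in the Perl one-liner.
    """
    return subprocess.Popen(
        cmd, stdout=subprocess.PIPE, text=True, start_new_session=True
    )
```

For example, `run_with_timeout(["ps", "-p", "1"], 10)` would cover the `timeout 10 ps ...` case without Coreutils, although the `ps` option syntax itself would still need to be portable.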
Note: Whilst these dependencies could be included with the Conda recipe, that wouldn't necessarily help as cylc-flow actually requires them to be installed in the system environment rather than the Cylc environment (e.g. the job script is invoked with the system Bash not the cylc-flow environment Bash).
The Problem
This means that even Conda installations aren't necessarily complete. Relying on a "standard set" of system libs is increasingly unsustainable: not all Linux distributions come with Bash pre-installed, some OSes ship really ancient versions, and Unix OSes aren't guaranteed to have the GNU variants of utilities. For example, to install Cylc on macOS the user must separately replace the BSD utils with GNU ones in the default environment.
This is why Cylc makes such extensive use of login shells. It also forces us to maintain compatibility with older versions of these tools. Some assorted examples:
cylc-flow/cylc/flow/job_file.py
Lines 147 to 150 in 442e505
# NOTE we cannot do /usr/bin/env bash because we need to use the -l
# option and GNU env doesn't support additional arguments (recent
# versions permit this with the -S option similar to BSD env but we
# cannot make the jump to this until is it more widely adopted)
cylc-flow/cylc/flow/etc/job.sh
Line 59 in 442e505
# On AIX the hostname command has no '-f' option
brew install bash coreutils gnu-sed
These libs are dependencies of cylc-flow itself, so they should be managed in the same way as its other dependencies so that we can:
- Specify and test against compatible version ranges (e.g. there are known bad versions of Bash we don't want to use).
- Ensure the Cylc stack is not exposed to known security vulnerabilities introduced by these libs.
- Provide a one-click full-stack installation via Conda/Docker/whatever-the-future-holds.
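As a rough illustration of the first point, a version-range check for Bash might look like the sketch below. The names `parse_bash_version`, `bash_version` and `is_supported` are hypothetical, and the `MIN_BASH` bound is a placeholder for illustration, not an actual Cylc requirement or a known bad version.

```python
import re
import subprocess

# Placeholder bound for illustration only; not a real Cylc requirement.
MIN_BASH = (4, 2)


def parse_bash_version(text):
    """Turn a $BASH_VERSION string such as '5.1.16(1)-release' into a tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognised Bash version: {text!r}")
    return tuple(int(part) for part in match.groups(default="0"))


def bash_version(bash="bash"):
    """Ask the given Bash executable for its version."""
    proc = subprocess.run(
        [bash, "-c", 'echo "$BASH_VERSION"'],
        capture_output=True, text=True, check=True,
    )
    return parse_bash_version(proc.stdout)


def is_supported(version, minimum=MIN_BASH, excluded=()):
    """Check a version tuple against a minimum and a set of bad releases."""
    return version >= minimum and version not in excluded
```

With the dependency declared in the package recipe, the same range would be expressed there instead; a runtime check like this would only be a fallback for installations that still rely on the system Bash.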
Path Forward
- Where possible we should continue to eliminate these dependencies as they can mostly be replaced by simpler solutions.
- Where the dependencies are required we should alter the code so that the dependency can be installed into the Cylc environment rather than mandating its presence in the system environment.
- We should add any remaining dependencies (hopefully just Bash?) to the Conda environment (but provide a separate output allowing installation without these dependencies if desired).
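One way the second point could work is sketched below: look for an executable in the Cylc environment's `bin` directory first and only then fall back to the system `PATH`, then invoke the job script with that interpreter explicitly rather than through a `#!/bin/bash -l` shebang. The names `find_executable` and `job_command` are hypothetical, and using `sys.prefix` as the environment root is an assumption that holds for Conda and virtual environments on the local host only; job hosts would need their own lookup.

```python
import os
import shutil
import sys


def find_executable(name, env_prefix=sys.prefix):
    """Locate `name`, preferring <env_prefix>/bin over the system PATH."""
    env_bin = os.path.join(env_prefix, "bin")
    search_path = os.pathsep.join(
        [env_bin, os.environ.get("PATH", os.defpath)]
    )
    return shutil.which(name, path=search_path)


def job_command(job_file, env_prefix=sys.prefix):
    """Build an argv that runs the job script with the resolved Bash.

    Passing -l explicitly avoids relying on the shebang line to
    request a login shell.
    """
    bash = find_executable("bash", env_prefix)
    if bash is None:
        raise FileNotFoundError("no bash found in the environment or on PATH")
    return [bash, "-l", job_file]
```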
Adopting a Python job script (#3613) would help remove the Bash dependency and provide support for arbitrary configurable shells (Bash, Ash, Fish, Python, NodeJS, whatever).