Conversation
|
@narategithub any idea why the build test is failing here? query->err_msg is defined in the sosapi_client.c afaict. |
|
@nick-enoent How do i specify "banner mode=1" in a yaml configuration file so that it goes to the head of the generated conf as suggested by some of the code being removed here? |
|
@nichamon, @nick-enoent, @baallan some of these options should likely be removed altogether, and others may need changes to the YAML parser. My reading of this is as follows:
Please consider the above and provide your opinions. |
You would include it as a dictionary in the "daemons" section. `daemons:
The same thing would apply here with the relevant v4 configuration attribute for "-C". |
| "The option `-F` is obsolete. "); | ||
| cleanup(EINVAL, "Received an obsolete command-line option"); | ||
| default: | ||
| ret = ldmsd_process_cmd_line_arg(op, optarg); |
There was a problem hiding this comment.
If we are going to remove all of these options in main(), we should also remove them from ldmsd_process_cmd_line(). Otherwise the code is just confusing since ldmsd_process_cmd_line_arg() is called here in "default".
But be aware that ldmsd_process_cmd_line_arg() is called a few places in ldmsd_request.c. So for instance, case 'B' in ldmsd_process_cmd_line_arg() will probably need to be broken out into a little function of its own so that it can be called by banner_mode_handler().
There was a problem hiding this comment.
Your point about ldmsd_process_cmd_line_arg is absolutely correct. Whatever we do, it has to be reflected in both places.
I think the -B is an artifact of old code that daemonized by default. It simply controlled what message would be logged so the SA could look at the log and see that ldmsd had been started/stopped.
We changed that because really almost everyone is using systemd. The -F flag is what would be used run it in the foreground and is what was used when using systemd.
My point is that -F and -B are related. I don't think we need either going forward.
My larger goal is to open up a discussion about dead code we don't use or need: these options, samplers for decommissioned platforms, e.g. msr_interlagos, cray_system_sampler? Maybe it should be an issue instead of a draft pull request?
There was a problem hiding this comment.
Your point about ldmsd_process_cmd_line_arg is absolutely correct. Whatever we do, it has to be reflected in both places.
I think the -B is an artifact of old code that daemonized by default. It simply controlled what message would be logged so the SA could look at the log and see that ldmsd had been started/stopped.
We changed that because really almost everyone is using systemd. The -F flag is what would be used run it in the foreground and is what was used when using systemd.
That is not a good assumption. systemd also supports "Type=forking", and since that is how ldmsd worked by default, that is how things are running here under systemd. And actually, this points at a problem moving forward from 4.4 to 4.5. I think in 4.5.1 we should accept the "-F" option, and just print a warning that it is deprecated and now the default behavior. This gives folks a cleaner migration path.
Our systemd .service files for ldms are installed by ansible. Supporting -F across the transition from 4.4. to 4.5 would allow us to move to:
Type=simple
ExecStart=/usr/sbin/ldmsd -F <etc>
And have it work both pre and post upgrade. Then after 4.5 is installed everywhere, we can retire the use of the -F option.
| "The option `-B` is obsolete. " | ||
| "Please specify `banner mode=<0|1|2>` in a configuration file."); | ||
| cleanup(EINVAL, "Received an obsolete command-line option"); | ||
| case 'F': |
There was a problem hiding this comment.
Note that this will be the first release that doesn't support "-F". Perhaps we should stop setting "opterr = 0" so we do get the error message from getopt_long() when it hits an unrecognized argument. Or alternatively we could print an explicit "X argument not recognized" message of our own. But it would be nice to tell the user which argument triggered the error and exit of ldmsd, not only print usage().
There was a problem hiding this comment.
I like the opterr approach. journalctl ... will show usage errors in the way that people expect instead of having to search syslog.
There was a problem hiding this comment.
I like the opterr approach. journalctl ... will show usage errors in the way that people expect instead of having to search syslog.
Keep in mind that for many people ovis_log() is logging to journald too. It goes to stdout by default, which under systemd is likely handled by journald. So actually, you are making me think it should be an ovis_log() error. Rather than letting getopt print the error.
nichamon
left a comment
There was a problem hiding this comment.
I understand the intention to clean up these unused options — that makes sense and should help reduce future confusion.
For the release we're trying to finalize, the one change to this table that seems critical is renaming quota to default_quota; otherwise, users won’t be able to specify the default quota via YAML.
Regarding worker_threads and kernel_file, I've added a note in the code. Since the YAML parser sends the options command and relies on this table for mapping, removing those entries now would drop their support in YAML.
| { "publish_kernel", optional_argument, 0, 'k' }, | ||
| { "log_file", required_argument, 0, 'l' }, | ||
| { "set_memory", required_argument, 0, 'm' }, | ||
| { "daemon_name", required_argument, 0, 'n' }, |
There was a problem hiding this comment.
We still need daemon_name because it is required when YAML is provided.
| { "log_file", required_argument, 0, 'l' }, | ||
| { "set_memory", required_argument, 0, 'm' }, | ||
| { "daemon_name", required_argument, 0, 'n' }, | ||
| { "worker_threads", required_argument, 0, 'P' }, |
There was a problem hiding this comment.
I’d like to keep worker_threads and kernel_file in the array for now.
Currently, there are two ways to specify ldmsd options in configuration files:
- Using the
optionscommand - Using individual commands for each option (e.g.,
log_level,default_quota)
When ldmsd receives the options command, the handler (ldmsd_request.c:cmd_line_arg_set_handler()) uses this table to map long options to short options, and then calls ldmsd_process_cmd_line_arg().
Since ldmsd_yaml_parser sends an options command to ldmsd, removing worker_threads and kernel_file from the table would effectively remove support for these options in YAML config.
There was a problem hiding this comment.
Don't we scale worker threads automatically? What purpose does it serve?
| { "version", required_argument, 0, 'V' }, | ||
| { "log_level", required_argument, 0, 'v' }, | ||
| { "log_config", required_argument, 0, 'L' }, | ||
| { "quota", required_argument, 0, 'C' }, |
There was a problem hiding this comment.
quota should be changed to default_quota to make it consistent with the config command to set the default quota. In YAML config, users can provide default_quota in the daemons section and it will just work.
This change removes commands marked "obsolete" in ldmsd:main()