recent strchriscntrl change #1598

Hello,

It seems to me that the recent commit 19d725da3c61d4785aec08766408ce32a81a0a0d "reject C1 control bytes" is misguided. When terminals operate in UTF-8 mode, they don't interpret C1 control *bytes*; they interpret C1 control *codepoints*, and even the example in the commit message uses a UTF-8 encoded codepoint (U+009B encoded as 0xC2 0x9B) rather than a single byte. Yet the function filters raw bytes, even when they are merely part of a multi-byte UTF-8 sequence.

While 0xC2 0x97 is indeed the UTF-8 encoding of the U+0097 control character, **0xC4** 0x97 is the UTF-8 encoding of the letter `ė` (U+0117). That is not a control character, and no UTF-8 capable terminal interprets it as one, but strchriscntrl() blocks it regardless. As a result, `chfn -f` now rejects my last name as invalid.

The function needs to decode the input according to the locale's character encoding and check whether the resulting _wchar_ is in the C1 control range (U+0080 to U+009F).

(If the user uses a UTF-8 locale but an administrator does not, I don't think that is solvable – it sounds like a self-inflicted problem – and the best I can suggest is "use `cat -v` when reading untrusted files".)
