recent strchriscntrl change #1598

@grawity

Hello,

It seems to me that the recent commit 19d725d "reject C1 control bytes" is misguided. When terminals work in UTF-8 mode, they don't interpret C1 control bytes; they interpret C1 control codepoints, and even the example in the commit message uses a UTF-8 encoded codepoint (U+009B encoded as 0xC2 0x9B) rather than a single byte. Yet the function filters raw bytes, even when they appear as part of a UTF-8 sequence.

However, while 0xC2 0x97 is indeed the UTF-8 encoding of the U+0097 control character, 0xC4 0x97 is the UTF-8 encoding of the letter ė (U+0117). It is not a control character and no UTF-8 capable terminal interprets it as one, yet strchriscntrl() blocks it regardless. As a result, chfn -f now rejects my last name as invalid.
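To make the byte arithmetic concrete, here is a minimal sketch of decoding a two-byte UTF-8 sequence by hand. The helper names `decode_utf8_2` and `is_c1_codepoint` are made up for illustration and are not part of shadow; the sketch only handles the two-byte case discussed here.

```c
#include <stdint.h>

/* Hypothetical helper: decode a two-byte UTF-8 sequence into a codepoint.
 * Valid lead bytes are 0xC2..0xDF, continuation bytes are 0x80..0xBF.
 * Returns 0 if the pair is not a valid two-byte sequence. */
static uint32_t decode_utf8_2(unsigned char b0, unsigned char b1)
{
    if (b0 < 0xC2 || b0 > 0xDF || (b1 & 0xC0) != 0x80)
        return 0;
    /* Low 5 bits of the lead byte, then low 6 bits of the continuation. */
    return ((uint32_t)(b0 & 0x1F) << 6) | (uint32_t)(b1 & 0x3F);
}

/* Hypothetical helper: true if the codepoint is a C1 control (U+0080..U+009F). */
static int is_c1_codepoint(uint32_t cp)
{
    return cp >= 0x80 && cp <= 0x9F;
}
```

With these, `decode_utf8_2(0xC2, 0x97)` gives 0x97 (a C1 control), while `decode_utf8_2(0xC4, 0x97)` gives 0x117 (ė), even though both sequences contain the raw byte 0x97.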

The function needs to decode the input according to the locale's character encoding and check whether each resulting wide character is in the C1 control range.
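Something along these lines is what I have in mind. This is only a rough sketch, not a patch: the function name `str_has_cntrl_mb` is hypothetical, it assumes the program has already called `setlocale(LC_ALL, "")`, and it assumes `wchar_t` holds Unicode codepoints (true on glibc, not guaranteed by the C standard). It also treats undecodable input as rejected, which is my own choice.

```c
#include <locale.h>   /* for the caller's setlocale() */
#include <stdbool.h>
#include <string.h>
#include <wchar.h>

/* Assumption: wchar_t values are Unicode codepoints. */
static bool is_cntrl_codepoint(wchar_t wc)
{
    unsigned long cp = (unsigned long)wc;

    /* C0 controls, DEL, and C1 controls (U+0080..U+009F). */
    return cp < 0x20 || cp == 0x7F || (cp >= 0x80 && cp <= 0x9F);
}

/* Hypothetical replacement check: decode per the current locale and
 * test each decoded character instead of each raw byte. */
static bool str_has_cntrl_mb(const char *s)
{
    mbstate_t st;
    size_t left = strlen(s);

    memset(&st, 0, sizeof(st));
    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, left, &st);

        if (n == (size_t)-1 || n == (size_t)-2)
            return true;    /* invalid or truncated sequence: reject */
        if (n == 0)
            break;          /* decoded a NUL; cannot happen within strlen() */
        if (is_cntrl_codepoint(wc))
            return true;
        s += n;
        left -= n;
    }
    return false;
}
```

In a UTF-8 locale this should accept "\xC4\x97" (ė) and still reject "\xC2\x9B" (U+009B), since the check is made on the decoded character rather than on the individual bytes.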

(If the user uses a UTF-8 locale but an administrator does not, I don't think that is solvable. It sounds like a self-inflicted problem, and the best I can suggest is "use cat -v when reading untrusted files".)
