We don't want characters such as smart quotes in the data. Write a test that fails if any are added back.
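
A minimal sketch of such a check, assuming the data can be read as text and that the four curly quote characters are the ones to reject. The `SMART_QUOTES` table and the `find_smart_quotes` helper are hypothetical names for illustration, and the list of characters would likely need extending to cover whatever else counts as unwanted:

```python
# Hypothetical table of characters to reject; extend as needed.
SMART_QUOTES = {
    "\u2018": "LEFT SINGLE QUOTATION MARK",
    "\u2019": "RIGHT SINGLE QUOTATION MARK",
    "\u201c": "LEFT DOUBLE QUOTATION MARK",
    "\u201d": "RIGHT DOUBLE QUOTATION MARK",
}


def find_smart_quotes(text):
    """Return a (line, column, name) tuple for each smart quote in text.

    Line and column numbers are 1-based so they can be reported directly
    in a test failure message.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SMART_QUOTES:
                hits.append((line_no, col, SMART_QUOTES[ch]))
    return hits
```

A test could read each data file, call `find_smart_quotes` on its contents, and assert that the returned list is empty, printing the hits on failure so the offending characters are easy to locate.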
Alternatively, declare the valid charsets as a list in each YAML file, and validate that each file's contents match its declared charsets. Some files contain both Greek and English, so a file may need to list more than one charset.
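
One way the charset approach could look, as a sketch only. It assumes each charset name maps to a set of Unicode code point ranges and that the declared list has already been read out of the YAML file. The names `CHARSET_RANGES` and `disallowed_characters`, and the charset labels `ascii` and `greek`, are hypothetical:

```python
# Hypothetical mapping from charset name to inclusive code point ranges.
CHARSET_RANGES = {
    "ascii": [(0x0000, 0x007F)],
    # Greek and Coptic block, plus Greek Extended (polytonic forms).
    "greek": [(0x0370, 0x03FF), (0x1F00, 0x1FFF)],
}


def disallowed_characters(text, charsets):
    """Return the sorted, de-duplicated characters in text that fall
    outside every range of the declared charsets."""
    allowed = [rng for name in charsets for rng in CHARSET_RANGES[name]]
    return sorted(
        {
            ch
            for ch in text
            if not any(lo <= ord(ch) <= hi for lo, hi in allowed)
        }
    )
```

With this shape, a file that declares `["ascii", "greek"]` passes for mixed Greek and English text, while a smart quote is reported because it lies in neither range. Note that Greek text stored in decomposed form uses combining accents (U+0300 to U+036F), which this sketch would reject unless the text is normalized to NFC first or that range is added.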