What is the state of the art: Unicode file names on Windows

The only reliable way to handle mixed content is to base64-encode the binary data. The whole document, now plain text, can then be encoded as UTF-8.
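To illustrate, here is a minimal sketch in Python of what I mean. The JSON container and the names `pack_document`, `unpack_document`, `text` and `blob_b64` are just my assumptions for the example, not any particular file format; the point is only that the binary part becomes ASCII via base64 before the whole thing is encoded as UTF-8.

```python
import base64
import json


def pack_document(text, blob):
    """Return a UTF-8 encoded document holding both text and binary data.

    The binary part is base64 encoded first, so the document is pure text
    before it is encoded as UTF-8. (Hypothetical layout for illustration.)
    """
    payload = {
        "text": text,
        # base64 output is ASCII, which is always valid UTF-8
        "blob_b64": base64.b64encode(blob).decode("ascii"),
    }
    # ensure_ascii=False keeps non-ASCII text as real characters
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


def unpack_document(data):
    """Reverse pack_document: return the (text, blob) pair."""
    payload = json.loads(data.decode("utf-8"))
    return payload["text"], base64.b64decode(payload["blob_b64"])
```

A reader that only understands UTF-8 can decode the packed document without tripping over arbitrary bytes, and `unpack_document(pack_document(text, blob))` should give back the original pair.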

In general, yes, but in my comment I was referring to the file readers listed by @efahl.