spack/lib
Massimiliano Culpo 5cd28847e8 filter_file uses "surrogateescape" error handling (#12765)
From Python docs:
--
'surrogateescape' will represent any incorrect bytes as code points in
the Unicode Private Use Area ranging from U+DC80 to U+DCFF. These
private code points will then be turned back into the same bytes when
the surrogateescape error handler is used when writing data. This is
useful for processing files in an unknown encoding.
--

This will allow us to process files with unknown encodings.

To accommodate the case of self-extracting bash scripts, filter_file
can now stop filtering text input if a certain marker is found. The
marker must be passed at call time via the "stop_at" function argument.
At that point the file will be reopened in binary mode and copied
verbatim.

* use "surrogateescape" error handling to ignore unknown chars
* permit to stop filtering if a marker is found
* add unit tests for non-ASCII and mixed text/binary files
2019-10-14 20:35:14 -07:00
..
spack filter_file uses "surrogateescape" error handling (#12765) 2019-10-14 20:35:14 -07:00