Changelog¶
1.5.0 - 2020-05-09¶
1.4.0 - 2019-11-17¶
- Python 3.3 support removed
- Added type annotations
1.3.4 - 2018-01-23¶
- fsn2bytes() and bytes2fsn() now default to “wtf-8” for the Windows path encoding instead of having no default.
1.3.3 - 2018-01-03¶
- Restore WinXP support
- Fix some warnings with Python 3.6
1.3.2 - 2017-11-05¶
- Tests: Fix some errors with newer pytest and make the test suite work on native Windows.
1.3.1 - 2017-07-29¶
- Fixed missing normalization with path2fsn() on Linux + Python 3
1.3.0 - 2017-07-28¶
- New supports_ansi_escape_codes()
- New fsn2norm() (+ all API now returns normalized fsnative)
- fsn2uri(): various fixes
1.2.1 - 2016-12-07¶
- isinstance(path, fsnative) now checks the value as well. If True, passing the instance to path2fsn() will never fail.
1.2.0 - 2016-12-06¶
1.1.0 - 2016-12-05¶
- print_(): Don’t ignore flush in Windows redirect mode
- argv: Forwards changes to sys.argv #2
- environ: Forwards changes to os.environ #2
- environ: Handle case insensitive env vars on Windows
- fsn2text(): Add a strict mode
- fsn2uri(): Always return text
- fsn2bytes(): Merge surrogate pairs under Python 3 + Windows
- fsn2bytes(): Support utf-16-be under Python 2.7/3.3
1.0.1 - 2016-10-25¶
1.0.0 - 2016-09-09¶
- First stable release
0.4.0 - 2016-09-07¶
- Support paths with surrogates under Windows
0.3.0 - 2016-09-03¶
- Support __fspath__ in path2fsn(). See PEP 519 for details.
- Rename fsn2uri_ascii to fsn2uri(), remove the latter.
- Fix fsn2uri() output on Windows for certain unicode ranges.
- Add expandvars()
0.1.0 - 2016-08-22¶
- Initial release
Tutorial¶
There are various ways to create fsnative instances:
# create from unicode text
>>> senf.fsnative(u"foo")
'foo'
# create from some serialized format
>>> senf.bytes2fsn(b"foo", "utf-8")
'foo'
# create from an URI
>>> senf.uri2fsn("file:///foo")
'/foo'
# create from some Python path-like
>>> senf.path2fsn(b"foo")
'foo'
You can mix and match the fsnative type with ASCII str on all Python versions and platforms:
>>> senf.fsnative(u"foo") + "bar"
'foobar'
>>> senf.fsnative(u"foo").endswith("foo")
True
>>> "File: %s" % senf.fsnative(u"foo")
'File: foo'
Now that we have a fsnative, what can we do with it?
>>> path = senf.fsnative(u"/foo")
# We can print it
>>> senf.print_(path)
/foo
# We can convert it to text for our favorite GUI toolkit
>>> senf.fsn2text(path)
'/foo'
# We can convert it to an ASCII only URI
>>> senf.fsn2uri(path)
'file:///foo'
# We can serialize the path so we can save it somewhere
>>> senf.fsn2bytes(path, "utf-8")
b'/foo'
The functions in the stdlib usually return the same type as was passed in. If we pass in a fsnative to os.listdir, we get one back as well.
>>> files = os.listdir(senf.fsnative(u"."))
>>> isinstance(files[0], senf.fsnative)
True
In some cases the stdlib functions don’t take arguments and always return the same type. For those cases Senf provides alternative implementations.
>>> isinstance(senf.getcwd(), senf.fsnative)
True
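For comparison, the two stdlib behaviours described above can be observed without Senf. The following is a minimal sketch assuming Python 3; the helper name `listing_types` is made up for illustration and is not part of Senf.

```python
import os
import tempfile


def listing_types(directory):
    """Return the element types os.listdir() yields for str and bytes input."""
    as_text = os.listdir(directory)                # str in, str out
    as_bytes = os.listdir(os.fsencode(directory))  # bytes in, bytes out
    return {type(n) for n in as_text}, {type(n) for n in as_bytes}


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the listing is not empty.
    open(os.path.join(tmp, "example.txt"), "w").close()
    text_types, bytes_types = listing_types(tmp)

# os.getcwd() takes no argument, so it cannot mirror the input type;
# the stdlib offers a second function, os.getcwdb(), for the bytes variant.
cwd_types = (type(os.getcwd()), type(os.getcwdb()))
```

With Senf the caller does not have to pick between the two variants, since senf.getcwd() always returns a fsnative.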
A similar problem arises with stdlib collections. Senf provides alternatives for sys.argv and os.environ.
>>> isinstance(senf.argv[0], senf.fsnative)
True
>>> isinstance(senf.environ["PATH"], senf.fsnative)
True
The same applies to os.environ related functions.
>>> isinstance(senf.getenv("HOME"), senf.fsnative)
True
>>> isinstance(senf.expanduser("~"), senf.fsnative)
True
If you work with files a lot your unit tests will probably need temporary files. Senf provides wrappers for tempfile functions which always return a fsnative.
>>> senf.mkdtemp()
'/tmp/tmp26Daqo'
>>> isinstance(_, senf.fsnative)
True
API Documentation¶
Stdlib Replacements¶
Alternative implementations or wrappers of stdlib functions and constants. In some cases their default is changed to return a fsnative path (for example mkdtemp() with default arguments) or Unicode support for Windows is added (for example sys.argv).
environ | os.environ replacement
argv | sys.argv replacement
sep | os.sep replacement
pathsep | os.pathsep replacement
curdir | os.curdir replacement
pardir | os.pardir replacement
altsep | os.altsep replacement
extsep | os.extsep replacement
devnull | os.devnull replacement
defpath | os.defpath replacement
getcwd() | os.getcwd replacement
getenv() | os.getenv replacement
putenv() | os.putenv replacement
unsetenv() | os.unsetenv replacement
print_() | print() replacement
input_() | input() replacement
expanduser() | os.path.expanduser() replacement
expandvars() | os.path.expandvars() replacement
gettempdir() | tempfile.gettempdir() replacement
gettempprefix() | tempfile.gettempprefix() replacement
mkstemp() | tempfile.mkstemp() replacement
mkdtemp() | tempfile.mkdtemp() replacement
Misc Functions¶
supports_ansi_escape_codes() | if the output file supports ANSI codes
Documentation Types¶
These types only exist for documentation purposes and represent different types depending on the Python version and platform used.
- class senf.text¶
  Represents unicode under Python 2 and str under Python 3. Does not include surrogates.
Frequently Asked Questions¶
- Are there any existing users of Senf?
- It is currently used in Quod Libet and mutagen.
- Why not use bytes for paths on Python 3 + Unix?
- Downsides of using str: str cannot be pickled as it depends on the locale encoding. You have to use something like fsn2bytes first, or you have to make sure that the encoding doesn’t change across program invocations.
  Upsides of using str: str has more support in the stdlib (pathlib for example) and it can be used in combination with the string literal "foo". The latter makes some_fsnative + "foo" work for all Python versions and platforms as long as the literal contains ASCII only.
- Why the weird “foo2bar” function naming?
- As the real types depend on the platform anything like “decode”/“encode” is confusing. So you end up with “a_to_b” or “a_from_b”. And imo having things always go one direction, being fast to parse visually and not being too long makes this a good choice. But ymmv.
- How can it be that fsnative() can’t fail, even with an ASCII encoding?
- It falls back to utf-8 if encoding fails. Raising there would make everything complicated and there is no good way to handle that error case anyway.
- Why not replace sys.stdout instead of providing a new print()?
- No monkey patching. Allows us to do our own error handling so print will never fail. Printing some question marks is better than a stack trace if the target is a user.
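The utf-8 fallback mentioned in the fsnative() answer above can be pictured roughly as follows. This is only an illustrative sketch of the idea, not Senf's actual code; `encode_with_fallback` is a hypothetical helper name.

```python
def encode_with_fallback(text, encoding):
    """Encode text with the given encoding, falling back to utf-8.

    The fallback means the call cannot fail for ordinary text, even
    when the preferred encoding is as limited as ASCII.
    """
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        # The preferred encoding can't represent the text; use utf-8 instead.
        return text.encode("utf-8")


plain = encode_with_fallback("foo", "ascii")          # fits into ASCII
fallback = encode_with_fallback("f\u00f6o", "ascii")  # needs the fallback
```

The trade-off is that the resulting bytes may not match the encoding the environment claims to use, which the answer above accepts in exchange for never raising.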
Senf introduces a new platform native string type called fsnative. It adds functions to convert text, bytes and paths to and from that new type and helper functions to integrate it nicely with the Python stdlib.
Senf supports Python 2.7, 3.3+, works with PyPy, works on Linux, Windows, macOS, is MIT licensed, and only depends on the stdlib. It does not monkey patch anything in the stdlib.
pip install senf
https://github.com/quodlibet/senf
Why?¶
OS strings are used in many different places across the Python stdlib. They are used for filesystem paths, for environment variables (os.environ), for program arguments (sys.argv and subprocess), for printing to the console (sys.stdout, sys.stderr) and more.
The problem with them is that they come in many shapes and forms and handling them has changed significantly between Python 2 and Python 3.
A valid platform native string is either bytes, unicode, str + surrogates (either through the surrogatepass or the surrogateescape error handler) or anything implementing the __fspath__ protocol. The values of those types depend on the Python version, the platform and the environment the program was started in. Ideally we don’t want to care about any of those details.
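To make the variety concrete, here is a minimal stdlib-only sketch, assuming Python 3 and a utf-8 filesystem encoding. The helpers `to_text` and `to_bytes` are hypothetical and only illustrate the surrogateescape and __fspath__ mechanisms named above; they are not how Senf is implemented.

```python
import os
import pathlib


def to_text(path):
    """Convert bytes, str or an os.PathLike object to str."""
    path = os.fspath(path)  # resolves the __fspath__ protocol (PEP 519)
    if isinstance(path, bytes):
        # Undecodable bytes become lone surrogates instead of raising.
        return path.decode("utf-8", "surrogateescape")
    return path


def to_bytes(text):
    """Reverse of to_text() for str; lone surrogates turn back into bytes."""
    return text.encode("utf-8", "surrogateescape")


raw = b"caf\xe9.txt"          # Latin-1 encoded name, not valid utf-8
text = to_text(raw)           # str containing the lone surrogate U+DCE9
assert to_bytes(text) == raw  # the round trip is lossless

from_pathlike = to_text(pathlib.PurePosixPath("foo"))
```

Real code would normally rely on os.fsdecode() and os.fsencode(), which pick the encoding and error handler for the current platform.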
For example, assume you want to check the extension of a file name:
import os
from senf import path2fsn

def has_extension(filename, ext):
    # Normalize both arguments to fsnative so they compare by value
    root, filename_ext = os.path.splitext(path2fsn(filename))
    return filename_ext == path2fsn(ext)
This will just work everywhere. path2fsn() will convert anything which is considered a valid path by Python to a fsnative and then we can just compare by value. Note that Python stdlib functions will always return the same type which was passed in, so os.path.splitext() will return two fsnative values.
Or you want to send a filename over some binary interface:
from senf import fsnative, fsn2bytes, bytes2fsn

def send(filename):
    assert isinstance(filename, fsnative)
    data = fsn2bytes(filename, "utf-8")
    return data

def receive(data):
    filename = bytes2fsn(data, "utf-8")
    return filename
fsn2bytes() converts the path to binary (“utf-8” is used on Windows, or “wtf-8” to be exact) and the receiving end re-creates the filename with bytes2fsn().
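As a rough illustration of what a “wtf-8” style encoding means, the sketch below uses only the stdlib surrogatepass error handler on Python 3. The function names `to_wtf8` and `from_wtf8` are hypothetical, and this is an approximation of the idea rather than Senf's implementation.

```python
def to_wtf8(text):
    """Encode str to WTF-8-style bytes, keeping lone surrogates."""
    # A round trip through UTF-16 merges valid surrogate pairs into one
    # code point while leaving lone surrogates untouched.
    merged = text.encode("utf-16-le", "surrogatepass").decode(
        "utf-16-le", "surrogatepass")
    return merged.encode("utf-8", "surrogatepass")


def from_wtf8(data):
    """Decode WTF-8-style bytes back to str, restoring lone surrogates."""
    return data.decode("utf-8", "surrogatepass")


lone = "a\ud800b"  # a lone surrogate, as Windows paths may contain
try:
    lone.encode("utf-8")  # strict utf-8 refuses lone surrogates
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

round_tripped = from_wtf8(to_wtf8(lone))
```

The point of the relaxed encoding is that any Windows path, including one with unpaired surrogates, survives the trip through bytes unchanged.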
Another example is printing filenames and text to a console:
import os
from senf import print_, argv

for filename in os.listdir(argv[1]):
    print_(u"File: ", filename)
Senf provides its own print function which can output platform strings as is and mix them with text. No more encoding/decoding errors.
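The "never fail" behaviour can be approximated with the stdlib alone by replacing characters the target stream cannot encode. This is a simplified sketch assuming Python 3; `safe_write` is a hypothetical helper and much less capable than print_().

```python
import io


def safe_write(text, stream):
    """Write text to stream, replacing anything its encoding can't represent."""
    encoding = getattr(stream, "encoding", None) or "utf-8"
    # "replace" substitutes "?" for unencodable characters instead of raising.
    printable = text.encode(encoding, "replace").decode(encoding)
    stream.write(printable + "\n")


# An ASCII-only stream stands in for a console with a limited encoding.
raw = io.BytesIO()
ascii_stream = io.TextIOWrapper(raw, encoding="ascii", newline="")
safe_write("File: caf\u00e9 \udcff", ascii_stream)
ascii_stream.flush()
output = raw.getvalue()
```

A plain ascii_stream.write() of the same text would raise UnicodeEncodeError; here the non-ASCII character and the lone surrogate each come out as a question mark.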
In addition, Senf emulates ANSI escape sequence handling when using the Windows console and extends Python 2 under Windows with Unicode support for sys.argv and os.environ.
Who?¶
Senf is used by the following software:
- Quod Libet - A multi platform music player
- mutagen - A Python multimedia tagging library