FAT Python

Three-year-old Cambodian boy Oeun Sambat hugs his best friend, a four-metre long female python named Chamreun

Intro

The FAT Python project was started by Victor Stinner in October 2015 to try to solve issues with previous attempts at “static optimizers” for Python. The main feature is efficient guards that use versioned dictionaries to check whether something was modified. Guards are used to decide whether the specialized bytecode of a function can be used or not.
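The idea of a guard backed by a versioned dictionary can be sketched in pure Python. This is only an illustration under stated assumptions: `VersionedDict` and `DictGuard` are hypothetical names invented for this sketch, not the API of the fat module, and the real implementation tracks the version in C inside the dict type (PEP 509).

```python
_MISSING = object()  # sentinel meaning "key is absent"


class VersionedDict(dict):
    """Toy dict that bumps a version counter on item assignment and deletion.

    Hypothetical stand-in for a PEP 509 style dictionary. Other mutating
    methods (update, pop, clear, ...) are deliberately not tracked here.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.version = 0

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.version += 1

    def __delitem__(self, key):
        super().__delitem__(key)
        self.version += 1


class DictGuard:
    """Hypothetical guard: passes while the watched key keeps its value."""

    def __init__(self, namespace, key):
        self.namespace = namespace
        self.key = key
        self.version = namespace.version
        self.value = namespace.get(key, _MISSING)

    def check(self):
        # Fast path: the dictionary was not modified at all.
        if self.namespace.version == self.version:
            return True
        # Slow path: the dictionary changed, but maybe not the watched key.
        if self.namespace.get(self.key, _MISSING) is self.value:
            self.version = self.namespace.version
            return True
        return False


namespace = VersionedDict(len=len)
guard = DictGuard(namespace, "len")
namespace["unrelated"] = 1                # bumps the version, "len" untouched
print(guard.check())
namespace["len"] = lambda obj: "mock"     # replaces the watched key
print(guard.check())
```

The point of the version counter is that the common case (nothing changed) costs a single integer comparison instead of a dictionary lookup.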

FAT Python is expected to be FAT... maybe FAST if we are lucky. FAT because it will use two versions of some functions, where one version is specialized for specific argument types, a specific environment, optimized when builtins are not mocked, etc.

See the documentation of fatoptimizer, which is the main part of FAT Python.

The FAT Python project is made of multiple parts:

Announcements and status reports:

Getting started

Compile Python 3.6 patched with PEP 509, PEP 510 and PEP 511:

git clone https://github.com/haypo/cpython -b fatpython fatpython
cd fatpython
./configure --with-pydebug CFLAGS="-O0" && make

Install fat:

git clone https://github.com/haypo/fat
cd fat
../python setup.py build
cp -v build/lib*/fat.*so ../Lib
cd ..

For OS X users, use ./python.exe instead of ./python.

Install fatoptimizer:

git clone https://github.com/haypo/fatoptimizer
(cd Lib; ln -s ../fatoptimizer/fatoptimizer .)

fatoptimizer is registered by the site module if the -X fat command line option is used. Extract of Lib/site.py:

if 'fat' in sys._xoptions:
    import fatoptimizer
    fatoptimizer._register()

Check that fatoptimizer is registered with:

$ ./python -X fat -c 'import sys; print(sys.implementation.optim_tag)'
fat-opt

You must get fat-opt (and not opt).
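The same check can be done from a script that must also run on an unpatched interpreter. A minimal sketch, assuming that `sys.implementation.optim_tag` only exists on the patched build (it is used that way in the command above) and falling back to `None` elsewhere; `fat_status` is a hypothetical helper name.

```python
import sys


def fat_status():
    """Return (fat_requested, optim_tag) without failing on stock CPython."""
    # sys._xoptions is the CPython-specific mapping of -X command line options.
    xoptions = getattr(sys, "_xoptions", {})
    fat_requested = "fat" in xoptions
    # optim_tag is only expected on the patched interpreter; None otherwise.
    optim_tag = getattr(sys.implementation, "optim_tag", None)
    return fat_requested, optim_tag


requested, optim_tag = fat_status()
print("-X fat passed:", requested)
print("optim_tag:", optim_tag)
```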

How can you contribute?

The fatoptimizer project needs the most love. Currently, the optimizer is not really smart. There is a long TODO list. Pick a simple optimization, try to implement it, and send a pull request on GitHub. At the very least, any kind of feedback is useful ;-)

If you know the C API of Python, you may also review the implementation of the PEPs:

But these PEPs are still work-in-progress, so the implementation can still change.

Play with FAT Python

See Getting started to compile FAT Python.

Disable peephole optimizer

The -o noopt command line option disables the Python peephole optimizer:

$ ./python -o noopt -c 'import dis; dis.dis(compile("1+1", "test", "exec"))'
  1           0 LOAD_CONST               0 (1)
              3 LOAD_CONST               0 (1)
              6 BINARY_ADD
              7 POP_TOP
              8 LOAD_CONST               1 (None)
             11 RETURN_VALUE
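To check programmatically whether constant folding happened in whatever interpreter is running, one can look for a binary-operation instruction in the compiled code. This is a sketch for illustration: `has_binary_op` is a hypothetical helper, and the opcode names differ between Python versions (`BINARY_ADD` in older releases, `BINARY_OP` in newer ones), hence the prefix match.

```python
import dis


def has_binary_op(source):
    """Return True if the compiled source still contains a BINARY_* instruction."""
    code = compile(source, "test", "exec")
    return any(ins.opname.startswith("BINARY_")
               for ins in dis.get_instructions(code))


# With constant folding enabled, the addition of two literals should be gone.
print(has_binary_op("x = 1 + 1"))
# Adding two names cannot be folded at compile time.
print(has_binary_op("x = a + b"))
```

With the optimizer disabled as shown above, the first call would be expected to report that the addition is still present.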

Specialized code calling builtin function

Test fatoptimizer on a builtin function:

$ ./python -X fat
>>> def func(): return len("abc")
...

>>> import dis
>>> dis.dis(func)
  1           0 LOAD_GLOBAL              0 (len)
              3 LOAD_CONST               1 ('abc')
              6 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE

>>> import fat
>>> fat.get_specialized(func)
[(<code object func at 0x7f9d3155b1e0, file "<stdin>", line 1>,
[<fat.GuardBuiltins object at 0x7f9d39191198>])]

>>> dis.dis(fat.get_specialized(func)[0][0])
  1           0 LOAD_CONST               1 (3)
              3 RETURN_VALUE

The specialized code is removed when the function is called if the builtin function is replaced (here by declaring a len() function in the global namespace):

>>> len=lambda obj: "mock"
>>> func()
'mock'
>>> fat.get_specialized(func)
[]
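The behaviour shown in this session can be imitated in pure Python. The sketch below is an assumption-laden model, not how the fat module is implemented: `SpecializedFunction` is a hypothetical wrapper, the "specialized code" is a hand-written function, and the guard is a plain lookup instead of a dictionary version check.

```python
import builtins


class SpecializedFunction:
    """Hypothetical wrapper: run a specialized version while its guard holds."""

    def __init__(self, func, specialized, namespace, name):
        self.func = func                  # original, always-correct function
        self.specialized = specialized    # faster version, valid under the guard
        self.namespace = namespace        # globals of the original function
        self.name = name                  # name of the builtin being relied on
        self.expected = getattr(builtins, name)

    def _guard(self):
        # The builtin must not be shadowed in globals nor replaced in builtins.
        return (self.name not in self.namespace
                and getattr(builtins, self.name, None) is self.expected)

    def __call__(self, *args, **kwargs):
        if self.specialized is not None:
            if self._guard():
                return self.specialized(*args, **kwargs)
            # Guard failed: drop the specialized version for good.
            self.specialized = None
        return self.func(*args, **kwargs)


namespace = {}
exec('def func(): return len("abc")', namespace)
func = SpecializedFunction(namespace["func"], lambda: 3, namespace, "len")

print(func())                             # guard holds: specialized version runs
namespace["len"] = lambda obj: "mock"     # shadow the builtin in globals
print(func())                             # guard fails: original function runs
print(func.specialized)                   # the specialized version was discarded
```

Dropping the specialized version on the first guard failure mirrors the empty list returned by `fat.get_specialized(func)` above.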

Microbenchmark

Run a microbenchmark on specialized code:

$ ./python -m timeit -s 'def f(): return len("abc")' 'f()'
10000000 loops, best of 3: 0.122 usec per loop

$ ./python -X fat -m timeit -s 'def f(): return len("abc")' 'f()'
10000000 loops, best of 3: 0.0932 usec per loop

Python must be optimized to run a benchmark: use ./configure && make clean && make if you previously compiled it in debug mode.

For a fair benchmark, you should also compare specialized code to an unpatched Python 3.6, so that the overhead of the PEP 509, 510 and 511 patches is measured as well.
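Without building the patched interpreter, a rough feel for what this specialization can save is available by timing a hand-written "specialized" function against the generic one. This is only an approximation under stated assumptions: `specialized` below is written by hand and has no guard, so it ignores the cost of guard checks, and `best_usec` is a hypothetical helper around the standard `timeit` module.

```python
import timeit


def generic():
    return len("abc")


def specialized():
    # What the optimizer would emit once len("abc") is folded to a constant.
    return 3


def best_usec(func, repeat=5, number=200_000):
    """Best time per call in microseconds over several runs."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number * 1e6


print("generic:     %.4f usec per call" % best_usec(generic))
print("specialized: %.4f usec per call" % best_usec(specialized))
```

The numbers depend on the machine and on the build flags, so they should be compared only within a single run on a single interpreter.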

Run optimized code without registering fatoptimizer

You have to compile optimized .pyc files:

# the optimizer is slow, so add -v to enable fatoptimizer logs for more fun
./python -X fat -v -m compileall

# why does compileall not compile encodings/*.py?
./python -X fat -m py_compile Lib/encodings/{__init__,aliases,latin_1,utf_8}.py
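The second command has a direct equivalent in the standard library API, which can be convenient in a build script. A minimal sketch, assuming it is run with the interpreter whose .pyc files are wanted; `compile_to_pyc` is a hypothetical helper, and the temporary hello.py file only exists for the demonstration.

```python
import os
import py_compile
import tempfile


def compile_to_pyc(source_path):
    """Byte-compile one file and return the path of the .pyc that was written."""
    # doraise=True turns compilation errors into exceptions instead of messages.
    return py_compile.compile(source_path, doraise=True)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "hello.py")
    with open(path, "w") as f:
        f.write('print("Hello World!")\n')
    pyc_path = compile_to_pyc(path)
    print(os.path.basename(pyc_path))
```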

Finally, enjoy optimized code with no registered optimizer:

$ ./python -o fat-opt -c 'import sys; print(sys.implementation.optim_tag, sys.get_code_transformers())'
fat-opt []

Remember that you cannot import .py files in this case, only .pyc:

$ echo 'print("Hello World!")' > hello.py
$ ENV/bin/python -o fat-opt -c 'import hello'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: missing AST transformers for 'hello.py': optim_tag='fat-opt', transformers tag='noopt'

Origins of FAT Python