Python Notes

GIL 1 2 3 4

The Global Interpreter Lock (GIL) is a mutex that protects access to Python objects, allowing only one thread to execute Python bytecode in the interpreter at any given time. The lock is needed because CPython's memory management is not thread-safe (mostly due to the reference counting used for garbage collection). As a result, CPU-bound multithreaded Python programs cannot fully utilize multiprocessor hardware. To achieve true parallelism with CPython, one must turn to the multiprocessing module, which runs separate processes that each have their own interpreter and GIL.

GC and COW 5 6 7

Copy-on-write (COW) lets forked subprocesses share memory pages with the parent, reducing the overall memory footprint of a multiprocess application. However, because CPython stores each object’s reference count inside the object itself, merely accessing an object updates that count, which is effectively a write that triggers duplication of the page into the child process. CPython 3.7 added the ability to “freeze” objects (gc.freeze()), making POSIX fork() calls more copy-on-write friendly, but it is not without caveats. 8 9 10
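A minimal sketch of the freeze step, assuming CPython 3.7 or later. The helper name freeze_before_fork is made up for illustration, and the actual os.fork() call is left as a comment so the snippet runs on any platform:

```python
import gc


def freeze_before_fork():
    """Hypothetical helper: park all tracked objects in the permanent generation."""
    gc.disable()  # avoid a collection in the parent leaving freed "holes" in pages
    gc.freeze()   # added in CPython 3.7
    return gc.get_freeze_count()


frozen = freeze_before_fork()
# On POSIX, os.fork() would go here; the child would then call gc.enable().

# Undo the freeze so this sketch leaves the interpreter as it found it.
gc.unfreeze()
gc.enable()
```

Freezing only stops the garbage collector from touching those objects; reference count updates still write to the pages that hold them, which is one of the caveats.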

list comprehension

Syntactic sugar for building lists; commonly used to create a list from another iterable, for example by applying a check to each member so that the result is the subset of elements that satisfy a given condition.
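A small illustration of both uses (filtering and transforming); the variable names are arbitrary:

```python
numbers = range(10)

# Keep only the members that satisfy a condition.
evens = [n for n in numbers if n % 2 == 0]

# Transform each member of another iterable.
squares_of_evens = [n * n for n in evens]
```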

generator function

A function that yields computed values and behaves as an iterator. Typical use cases are deferring execution of the generator's code until it is iterated over, and conserving memory by yielding values one at a time rather than storing them all.
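As a sketch, a generator function for the Fibonacci sequence (chosen here only as an example). Nothing in the body runs until a consumer asks for values, and only one value exists at a time:

```python
from itertools import islice


def fibonacci():
    """Yield Fibonacci numbers forever; the body runs lazily, one step per request."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Calling fibonacci() only creates the generator; islice pulls the first 8 values.
first_eight = list(islice(fibonacci(), 8))
```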

generators vs comprehensions

A comprehension creates the entire list in memory up front, while a generator computes individual items on the fly (which also lets it represent infinite sequences). Typically, you’ll want a generator if you iterate over the results only once, if you need to save memory, or if you are writing coroutines. If you iterate over the same data multiple times, use a comprehension, since the result is stored.
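A rough comparison, assuming CPython (the exact sizes reported by sys.getsizeof vary by version, so only the relative order is checked):

```python
import sys

squares_list = [n * n for n in range(1000)]  # all 1000 items stored now
squares_gen = (n * n for n in range(1000))   # items computed on demand

# The list holds every element; the generator object stays small.
list_is_bigger = sys.getsizeof(squares_list) > sys.getsizeof(squares_gen)

# A generator can be consumed only once.
first_pass = sum(squares_gen)
second_pass = sum(squares_gen)  # already exhausted, so nothing is left to sum

# A list can be iterated as often as needed.
list_twice = (sum(squares_list), sum(squares_list))
```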

decorators

A decorator is a callable (normally a function) that takes a function as its argument and returns a replacement function. A decorator that needs additional parameters is usually written as a factory that accepts them and returns the actual decorator. The replacement can wrap the original function and modify the input and/or output of the ‘wrapped’ function.
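A sketch of both forms: a plain decorator and one that takes an extra parameter. The names shout, repeat, greet, and ping are invented for the example:

```python
import functools


def shout(func):
    """Plain decorator: upper-case the wrapped function's return value."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


def repeat(times):
    """Parameterized decorator: call the wrapped function `times` times."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            return [func(*args, **kwargs) for _ in range(times)]
        return wrapper
    return decorator


@shout
def greet(name):
    return f"hello, {name}"


@repeat(times=2)
def ping():
    return "pong"
```

Here greet("ham") goes through wrapper first, so the caller sees the modified output.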

pseudo-private name mangling

Private name mangling: When an identifier that textually occurs in a class definition begins with two or more underscore characters and does not end in two or more underscores, it is considered a private name of that class. Private names are transformed to a longer form before code is generated for them. The transformation inserts the class name in front of the name, with leading underscores removed, and a single underscore inserted in front of the class name. For example, the identifier __spam occurring in a class named Ham will be transformed to _Ham__spam. This transformation is independent of the syntactical context in which the identifier is used. If the transformed name is extremely long (longer than 255 characters), implementation defined truncation may happen. If the class name consists only of underscores, no transformation is done.
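The Ham / __spam example from the text, written out as a short sketch (the stored value and the accessor method are made up):

```python
class Ham:
    def __init__(self):
        self.__spam = "eggs"  # stored on the instance as _Ham__spam

    def spam(self):
        return self.__spam    # also mangled, so it resolves inside the class


ham = Ham()
via_method = ham.spam()
via_mangled_name = ham._Ham__spam         # still reachable, hence "pseudo-private"
has_plain_name = hasattr(ham, "__spam")   # the unmangled name does not exist
```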

metaclasses

single-quotes vs double-quotes

In Python the two are equivalent; the only practical difference is that delimiting a string with one kind of quote lets you include the other kind unescaped.
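For example (the strings themselves are arbitrary):

```python
with_apostrophe = "it's"        # single quote needs no escape inside double quotes
with_quotes = 'say "hi"'        # double quotes need no escape inside single quotes
escaped = 'it\'s'               # same string as with_apostrophe, but escaped

same_string = with_apostrophe == escaped
```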

namedtuple multiprocessing logging queues lambda gevent, twisted, tornado pickle for mp

collections itertools functools

Database getting hammered, what do you do? Cache responses; just make sure you invalidate the cache on updates.
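A toy sketch of read caching with invalidation on update. CachedStore is a hypothetical class, and a plain dict stands in for the database; a real setup would more likely use something like redis or memcached:

```python
class CachedStore:
    """Hypothetical read-through cache in front of a slow backend."""

    def __init__(self, backend):
        self._backend = backend  # assumption: a dict standing in for the database
        self._cache = {}
        self.backend_reads = 0   # counts how often the "database" is actually hit

    def get(self, key):
        if key not in self._cache:
            self.backend_reads += 1
            self._cache[key] = self._backend[key]
        return self._cache[key]

    def update(self, key, value):
        self._backend[key] = value
        self._cache.pop(key, None)  # invalidate so the next read is fresh


store = CachedStore({"user:1": "alice"})
first = store.get("user:1")
second = store.get("user:1")        # served from the cache
store.update("user:1", "bob")
after_update = store.get("user:1")  # cache was invalidated, so the backend is read again
```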

After you cache, what if response times are still slow? Load balance your application across multiple instances, and use something like gevent for asynchronous request processing.

After load balancing and clustering, look for code bottlenecks with performance profiling.

import exception order old/new style class differences class level attribute overrides variables as references to objects Scopes and closures mutability vs immutability (with function parameters)

multiprocessing

- cannot use pickle because it’s so slow
- can use shared memory, but then the process would need to inherit it
- can use a process pool if you can preallocate all the shared memory
- don’t continuously start processes, or the overhead will slow you down
- can use ctypes to efficiently share data between processes
- cannot share memory (without a POSIX library) after the process has started
- can use something like sockets or redis to share data between processes
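A sketch of the ctypes approach from the notes above, assuming a POSIX platform where the "fork" start method is available. The function names square_in_place and run_demo are hypothetical, and n is assumed to be divisible by workers. The buffer is allocated before the workers start so that they inherit it instead of receiving a pickled copy:

```python
import ctypes
import multiprocessing as mp
from multiprocessing.sharedctypes import RawArray


def square_in_place(shared, start, stop):
    """Worker: square a slice of the shared buffer in place."""
    for i in range(start, stop):
        shared[i] = shared[i] * shared[i]


def run_demo(n=8, workers=2):
    # Allocate the shared buffer BEFORE starting workers so they inherit it.
    shared = RawArray(ctypes.c_double, range(n))
    ctx = mp.get_context("fork")  # assumption: POSIX with fork available
    chunk = n // workers          # assumption: n is divisible by workers
    procs = [
        ctx.Process(target=square_in_place,
                    args=(shared, w * chunk, (w + 1) * chunk))
        for w in range(workers)
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return list(shared)  # the parent sees the children's writes
```

On a POSIX system, calling run_demo() starts both workers once and reads their results straight from the shared buffer. RawArray has no lock, which is fine here only because each worker writes to its own slice.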