Synchronized in Python
Created on Oct. 14, 2011, 4:36 p.m.
Some more fun with python decorators. This time about threading.
Java gives us some nice threading primitives built into the language, including the "synchronized" keyword. This keyword ensures that no two threads can be inside blocks synchronized on the same object at the same time. We can build almost identical functionality in Python using decorators and some basic metaprogramming.
First we'll tackle the synchronized block syntax in Java. It looks like this:
...
/* Enter critical section */
synchronized(mLock) {
    /* Do critical work */
    ...
}
/* Exit critical section */
...
And this can appear pretty much anywhere in Java code. For the same in python we don't have to add anything - it actually has this already built in - but instead of the "synchronized" keyword we use the "with" keyword.
...
# Enter critical section
with self.lock:
    # Do critical work
    ...
# Exit critical section
...
Pretty straightforward - and the lovely thing about the pythonic way is that the behaviour of the "with" keyword can be defined for any class using the "__enter__" and "__exit__" methods. You might have already seen it being used for files and a host of other things.
There is also a second advantage to using the "with" keyword over simply wrapping the critical section in "self.lock.acquire()" and "self.lock.release()". Using "with" ensures that the lock is always released, even if an exception occurs within the critical section. "__exit__" is always called, much like the "finally" clause of a "try" block.
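To make that guarantee concrete, here is a minimal sketch. The function name "risky_update" is made up for illustration. It shows that a lock taken with "with" is free again after the critical section raises.

```python
import threading

lock = threading.Lock()

def risky_update():
    # Hypothetical critical section that fails part-way through.
    with lock:
        raise ValueError("something went wrong")

try:
    risky_update()
except ValueError:
    pass

# __exit__ released the lock, so a non-blocking acquire succeeds.
released = lock.acquire(False)
if released:
    lock.release()
print(released)
```

Had the function called "lock.acquire()" and "lock.release()" by hand without a "try"/"finally", the lock would still be held at this point.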
The second way in which "synchronized" appears in Java is in a method declaration. It looks something like this.
public class Counter {
    int mTotal;

    public Counter() {
        mTotal = 0;
    }

    public synchronized void addOne() {
        int val = mTotal;
        val++;
        mTotal = val;
    }
}
This ensures that no two threads can be inside a synchronized method of the same instance at the same time. In this example it ensures that the total always gets updated correctly. We can do something similar in Python using our own hand-made decorators. The decorator looks something like this:
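import threading

def synchronized(func):
    # One lock per decorated function, created when the decorator is applied.
    func.__lock__ = threading.Lock()

    def synced_func(*args, **kws):
        with func.__lock__:
            # Pass the wrapped function's result back to the caller.
            return func(*args, **kws)

    return synced_func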
As you can see, it takes a function, attaches a lock to that function, and wraps the function within that lock. As in Java this ensures that no two threads can be inside the function at the same time. Here is an example of it at work.
import threading
import time

total = 0

@synchronized
def count():
    global total
    curr = total + 1
    time.sleep(0.1)  # widen the window between the read and the write
    total = curr

def counter():
    for i in range(10):
        count()

thread1 = threading.Thread(target=counter)
thread2 = threading.Thread(target=counter)
thread1.start()
thread2.start()
thread1.join()
thread2.join()

print(total)
With the function "count" synchronized, this script should correctly print "20" for the value of "total". If the decorator is removed, the final value is unpredictable: every lost update lowers it below "20". In this script it will tend to be "10", because the sleep almost always makes both threads read the same value before either one writes, so the total is updated incorrectly.
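One optional refinement, which is not part of the original decorator: "functools.wraps" from the standard library copies the wrapped function's name and docstring onto the wrapper. Below is a minimal sketch, assuming that is wanted. The "counter" dictionary and the "increment" and "worker" functions are made up for the example.

```python
import functools
import threading

def synchronized(func):
    lock = threading.Lock()

    @functools.wraps(func)  # keep func.__name__ and func.__doc__ on the wrapper
    def synced_func(*args, **kws):
        with lock:
            return func(*args, **kws)

    synced_func.__lock__ = lock
    return synced_func

counter = {"total": 0}  # hypothetical shared state

@synchronized
def increment():
    """Add one to the shared total and return the new value."""
    current = counter["total"] + 1
    counter["total"] = current
    return current

def worker():
    for _ in range(1000):
        increment()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(increment.__name__)
print(counter["total"])
```

Four threads each perform 1000 increments under the lock, so the total comes out as 4000, and the decorated function still reports its own name.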
But unfortunately the decorator above won't work well for a method of a class. The reason is that the method function exists in only one place, on the class, so applying the decorator to it creates a single lock shared by every instance. If we want a lock on a per-instance basis we have to do something else.
def synchronized_method(method):
    # Created once, when the decorator is applied at class definition time.
    outer_lock = threading.Lock()
    lock_name = "__" + method.__name__ + "_lock__"

    def sync_method(self, *args, **kws):
        # The outer lock only guards creation of the per-instance lock.
        with outer_lock:
            if not hasattr(self, lock_name):
                setattr(self, lock_name, threading.Lock())
            lock = getattr(self, lock_name)
        with lock:
            return method(self, *args, **kws)

    sync_method.__outerlock__ = outer_lock
    return sync_method

class Counter:
    def __init__(self):
        self.total = 0

    @synchronized_method
    def add_one(self):
        val = self.total
        val += 1
        self.total = val
This is a bit messier, but we can solve it by taking advantage of the fact that every method takes "self", the class instance, as its first parameter. The new decorator checks whether the instance already has a lock for the method to use. If a lock exists it uses it; otherwise it creates a new one and attaches it to the instance. Worth noting: because the instance lock is created at run-time, we need an outer lock, created before any instance exists, to avoid a race over who gets to create the instance lock (imagine two threads calling this method for the first time, at the same time). For this we create an outer lock at the time of class definition (when the decorator is applied) and use it to ensure only one instance lock is created.
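To see the per-instance behaviour concretely, here is a small self-contained check. It repeats a working version of the decorator so that it runs on its own. The variable names "first" and "second" are made up for the example.

```python
import threading

def synchronized_method(method):
    outer_lock = threading.Lock()
    lock_name = "__" + method.__name__ + "_lock__"

    def sync_method(self, *args, **kws):
        # The outer lock guards creation of the per-instance lock.
        with outer_lock:
            if not hasattr(self, lock_name):
                setattr(self, lock_name, threading.Lock())
            lock = getattr(self, lock_name)
        with lock:
            return method(self, *args, **kws)

    return sync_method

class Counter(object):
    def __init__(self):
        self.total = 0

    @synchronized_method
    def add_one(self):
        self.total += 1

first, second = Counter(), Counter()
first.add_one()
second.add_one()

# Each instance carries its own lock under the generated attribute name.
same_lock = getattr(first, "__add_one_lock__") is getattr(second, "__add_one_lock__")
print(same_lock)
```

The two instances end up with different lock objects, so a thread working on "first" never blocks a thread working on "second".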
In Java, all synchronized methods of an object share the same implicit lock. Our Python example above is slightly different: each of our methods uses its own lock. There is no real reason for this other than that emulating Java exactly is really quite hard to do with decorators while avoiding race conditions on the creation of the instance lock.
We can create something more similar though. What we can do is to lock whenever we wish to manipulate a certain member of a class.
import threading
import time

def synchronized_with_attr(lock_name):
    def decorator(method):
        def synced_method(self, *args, **kws):
            # Look the lock up on the instance each time the method is called.
            lock = getattr(self, lock_name)
            with lock:
                return method(self, *args, **kws)
        return synced_method
    return decorator

class Counter:
    def __init__(self):
        self.lock = threading.RLock()
        self.total = 0

    @synchronized_with_attr("lock")
    def add_one(self):
        val = self.total
        val += 1
        time.sleep(0.1)
        self.total = val

    @synchronized_with_attr("lock")
    def add_two(self):
        val = self.total
        val += 2
        time.sleep(0.1)
        self.total = val
The most obvious difference is that we are now passing the decorator a string with the name of the lock attribute. The reason is that decorators are applied when the class is defined, not when it is instantiated. Before an instance exists, Python has no idea that "self.lock" exists. Because of this we have to give the name as a string and rely on the attribute being found on the instance when the method is actually called.
The second thing to notice is that I'm now using an RLock instead of a normal lock. This is a re-entrant lock and allows a thread to call acquire on a lock more than once if it already holds the lock. This ensures that if a method calls another method internally it will not cause deadlock as a thread tries to acquire a lock it already owns. In fact we might have wanted to use an RLock for our previous method decorator - otherwise recursion would cause a deadlock.
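The difference between the two lock types can be checked directly. This is a minimal sketch using non-blocking acquires so that nothing actually deadlocks. The variable names are made up for the example.

```python
import threading

plain = threading.Lock()
reentrant = threading.RLock()

plain.acquire()
# A second attempt on a plain Lock fails; a blocking call here would deadlock.
second_plain = plain.acquire(False)
plain.release()

reentrant.acquire()
# The owning thread may acquire an RLock again...
second_reentrant = reentrant.acquire(False)
# ...but it must release it once per successful acquire.
reentrant.release()
reentrant.release()

print(second_plain)
print(second_reentrant)
```

The plain lock refuses the second acquire while the re-entrant lock allows it, which is why synchronized methods that call each other need an RLock.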
Finally, if we take a look at "synchronized_with_attr" we realize it isn't a decorator in itself, but a function which returns a decorator. This is how we pass arguments to decorators, and it begins to show their full power.
Sometimes we wish to lock over all of the methods of a class, creating what is essentially a thread-safe data structure. Java has no class-level "synchronized" modifier, but the effect is easy to get - we just add the "synchronized" keyword to every method.
public class Counter {
    int mTotal;

    public Counter() {
        mTotal = 0;
    }

    public synchronized void addOne() {
        int val = mTotal;
        val += 1;
        mTotal = val;
    }

    public synchronized void addTwo() {
        int val = mTotal;
        val += 2;
        mTotal = val;
    }
}
We can do this in python too, though the solution becomes a bit more complicated. Let us try and create an appropriate decorator for a class.
import threading
import types

def synchronized_with(lock):
    def decorator(func):
        def synced_func(*args, **kws):
            with lock:
                return func(*args, **kws)
        return synced_func
    return decorator

def synchronized_class(sync_class):
    lock = threading.RLock()
    orig_init = sync_class.__init__

    def __init__(self, *args, **kws):
        self.__lock__ = lock
        orig_init(self, *args, **kws)

    sync_class.__init__ = __init__
    # Copy the items first: the class is modified inside the loop.
    for key, val in list(sync_class.__dict__.items()):
        if type(val) is types.FunctionType:
            decorator = synchronized_with(lock)
            # setattr is used because a class __dict__ may be read-only.
            setattr(sync_class, key, decorator(val))
    return sync_class
Let me explain what is happening here.
The first thing that happens is we create a new lock for the class to use. We then override the "__init__" method of the class so that it first assigns this new lock as the class member "__lock__" before calling the old "__init__" function. We then loop over all of the items in the class dictionary. We check for which ones are functions, and if they are we apply our "synchronized_with" decorator to them with the lock we created at the beginning. We then return the modified class. And that's it! We have a synchronized class.
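One consequence of creating the lock when the class is decorated is that every instance shares it. The sketch below repeats a compact, working version of the class decorator so that it runs on its own, then checks this. The helper name "synced" and the variables "first" and "second" are made up for the example.

```python
import threading
import types

def synchronized_class(sync_class):
    lock = threading.RLock()  # created once, when the class is decorated

    def synced(func):
        def synced_func(*args, **kws):
            with lock:
                return func(*args, **kws)
        return synced_func

    orig_init = sync_class.__init__

    def __init__(self, *args, **kws):
        self.__lock__ = lock
        orig_init(self, *args, **kws)

    sync_class.__init__ = __init__
    # Wrap every plain function found in the class dictionary.
    for key, val in list(vars(sync_class).items()):
        if type(val) is types.FunctionType:
            setattr(sync_class, key, synced(val))
    return sync_class

@synchronized_class
class Counter(object):
    def __init__(self):
        self.total = 0

    def add_one(self):
        self.total += 1

first, second = Counter(), Counter()
first.add_one()

# Both instances were handed the same lock object.
shared = first.__lock__ is second.__lock__
print(shared)
```

So a call on one instance also blocks calls on every other instance of the class. That is coarser than Java, where each object has its own monitor.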
There is one more nice tweak we can make, which is to combine all of these new decorators into one function that decides which decorator is appropriate to apply. This is fairly straightforward - we simply look at the argument to the function. If the argument is a lock, we return a new decorator that uses that lock. If the argument is a string, we return the attribute-based decorator. If we get anything else (such as a function or a class), we apply the decorator directly with a fresh lock.
import inspect
import threading
import types

# Lock and RLock may be factory functions, so take the types from instances.
LOCK_TYPES = (type(threading.Lock()), type(threading.RLock()))

def synchronized_with_attr(lock_name):
    def decorator(method):
        def synced_method(self, *args, **kws):
            lock = getattr(self, lock_name)
            with lock:
                return method(self, *args, **kws)
        return synced_method
    return decorator

def synchronized_with(lock):
    def synchronized_obj(obj):
        if type(obj) is types.FunctionType:
            obj.__lock__ = lock
            def func(*args, **kws):
                with lock:
                    return obj(*args, **kws)
            return func
        elif inspect.isclass(obj):
            orig_init = obj.__init__
            def __init__(self, *args, **kws):
                self.__lock__ = lock
                orig_init(self, *args, **kws)
            obj.__init__ = __init__
            # Copy the items first: the class is modified inside the loop.
            for key, val in list(vars(obj).items()):
                if type(val) is types.FunctionType:
                    setattr(obj, key, synchronized_with(lock)(val))
            return obj
        raise TypeError("can only synchronize functions and classes")
    return synchronized_obj

def synchronized(item):
    if isinstance(item, str):
        # @synchronized("lock_name"): hand back the decorator itself.
        return synchronized_with_attr(item)
    if isinstance(item, LOCK_TYPES):
        # @synchronized(some_lock): hand back the decorator itself.
        return synchronized_with(item)
    # Bare @synchronized on a function or class. An RLock is used so that
    # synchronized methods of the same class can call each other.
    new_lock = threading.RLock()
    return synchronized_with(new_lock)(item)
And with about 50 lines of code we've added the synchronization primitives to Python. Beautiful!
import time

@synchronized
class Counter:
    def __init__(self):
        self.counter = 0

    def add_one(self):
        val = self.counter
        val += 1
        time.sleep(0.1)
        self.counter = val

    def add_two(self):
        val = self.counter
        val += 2
        time.sleep(0.1)
        self.counter = val

my_counter = Counter()

def class_counter1():
    for i in range(10):
        my_counter.add_one()

def class_counter2():
    for i in range(10):
        my_counter.add_two()

thread1 = threading.Thread(target=class_counter1)
thread2 = threading.Thread(target=class_counter2)
thread1.start()
thread2.start()
thread1.join()
thread2.join()

print(my_counter.counter)