Expecting the unexpected

Handling errors properly deserves a chapter of its own in any programming book. Python gives us many ways to deal with errors, fatal and otherwise: try, except, assert, if … Using these mechanisms naively may lead to code that is littered with safety if statements and try-except blocks, just because we need to account for errors at every level of a program.

In this tutorial we’ll see how we can use exceptions in a more effective way. As an added bonus, we’ll learn how to use exceptions in a manner that is compatible with the Noodles programming model. Let’s try something dangerous! We’ll compute the reciprocals of a list of numbers. To see what is happening, the function something_dangerous contains a print statement.

[1]:

import sys

def something_dangerous(x):
    print("computing reciprocal of", x)
    return 1 / x

try:
    for x in [2, 1, 0, -1]:
        print("1/{} = {}".format(x, something_dangerous(x)))

except ArithmeticError as error:
    print("Something went terribly wrong:", error)

computing reciprocal of 2
1/2 = 0.5
computing reciprocal of 1
1/1 = 1.0
computing reciprocal of 0
Something went terribly wrong: division by zero
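
The handler names ArithmeticError, yet the error that actually occurs is a ZeroDivisionError. This works because except also catches subclasses of the class it names. A small check, using only built-in exception classes, makes the relationship visible:

```python
# ZeroDivisionError is a subclass of ArithmeticError, so a handler for the
# parent class also catches it.
print(issubclass(ZeroDivisionError, ArithmeticError))

# The full inheritance chain, from most to least specific.
chain = [cls.__name__ for cls in ZeroDivisionError.__mro__]
print(" -> ".join(chain))
```

OverflowError and FloatingPointError derive from ArithmeticError as well, so the same handler would cover those cases too.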

The loop above shows how exceptions are raised and caught, but this approach is somewhat limited. Suppose now that we weren’t expecting this expected unexpected behaviour and wanted to compute everything before displaying our results.

[2]:

input_list = [2, 1, 0, -1]
reciprocals = [something_dangerous(item)
               for item in input_list]

print("The reciprocal of", input_list, "is", reciprocals)

computing reciprocal of 2
computing reciprocal of 1
computing reciprocal of 0

---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-2-5d396078122a> in <module>()
      1 input_list = [2, 1, 0, -1]
      2 reciprocals = [something_dangerous(item)
----> 3                for item in input_list]
      4
      5 print("The reciprocal of", input_list, "is", reciprocals)

<ipython-input-2-5d396078122a> in <listcomp>(.0)
      1 input_list = [2, 1, 0, -1]
      2 reciprocals = [something_dangerous(item)
----> 3                for item in input_list]
      4
      5 print("The reciprocal of", input_list, "is", reciprocals)

<ipython-input-1-990ff89c780e> in something_dangerous(x)
      3 def something_dangerous(x):
      4     print("computing reciprocal of", x)
----> 5     return 1 / x
      6
      7 try:

ZeroDivisionError: division by zero

Ooops! Let’s fix that.

[3]:

try:
    reciprocals = [something_dangerous(item)
                   for item in input_list]

except ArithmeticError as error:
    print("Something went terribly wrong:", error)

else:
    print("The reciprocal of\n\t", input_list,
          "\nis\n\t", reciprocals)

computing reciprocal of 2
computing reciprocal of 1
computing reciprocal of 0
Something went terribly wrong: division by zero

That’s also not what we want. We wasted all this time computing nice reciprocals of numbers, only to find all of our results being thrown away because of one stupid zero in the input list. We can fix this.

[4]:

import math

def something_safe(x):
    try:
        return something_dangerous(x)
    except ArithmeticError as error:
        return math.nan

reciprocals = [something_safe(item)
               for item in input_list]

print("The reciprocal of\n\t", input_list,
      "\nis\n\t", reciprocals)

computing reciprocal of 2
computing reciprocal of 1
computing reciprocal of 0
computing reciprocal of -1
The reciprocal of
         [2, 1, 0, -1]
is
         [0.5, 1.0, nan, -1.0]

That’s better! We skipped right over the error and continued to more interesting results. So how are we going to make this solution more generic? Subsequent functions may not know how to handle that little nan in our list.

[5]:

square_roots = [math.sqrt(item) for item in reciprocals]

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-4d8b46ef9954> in <module>()
----> 1 square_roots = [math.sqrt(item) for item in reciprocals]

<ipython-input-5-4d8b46ef9954> in <listcomp>(.0)
----> 1 square_roots = [math.sqrt(item) for item in reciprocals]

ValueError: math domain error

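Hmmmpf. There we go again. Note that the nan is not the culprit this time: math.sqrt(math.nan) quietly returns nan. The ValueError comes from the -1.0 at the end of the list.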

[6]:

def safe_sqrt(x):
    try:
        return math.sqrt(x)
    except ValueError as error:
        return math.nan

[safe_sqrt(item) for item in reciprocals]

[6]:

[0.7071067811865476, 1.0, nan, nan]

This seems OK, but there are two problems here. For one, it feels like we’re doing too much work! We have a repeating code pattern here. That’s always a good moment to go back and consider making parts of our code more generic. At the same time, this is when we need some more advanced Python concepts to get us out of trouble. We’re going to define a function in a function!

[7]:

def secure_function(dangerous_function):
    def something_safe(x):
        """A safer version of something dangerous."""
        try:
            return dangerous_function(x)
        except (ArithmeticError, ValueError):
            return math.nan

    return something_safe

Consider what happens here. The function secure_function takes a function dangerous_function as an argument and returns a new function something_safe. This new function executes dangerous_function within a try-except block to deal with the possibility of failure. Let’s see how this works.

[8]:

safe_sqrt = secure_function(math.sqrt)
print("⎷2 =", safe_sqrt(2))
print("⎷-1 =", safe_sqrt(-1))
print()
help(safe_sqrt)

⎷2 = 1.4142135623730951
⎷-1 = nan

Help on function something_safe in module __main__:

something_safe(x)
    A safer version of something dangerous.

OK, so that works! However, the documentation of safe_sqrt is not yet very useful. There is a nice library routine that may help us here: functools.wraps. This utility function copies the name and docstring of the original function onto our new function.

[9]:

import functools

def secure_function(dangerous_function):
    """Create a function that returns nan instead of raising."""
    @functools.wraps(dangerous_function)
    def something_safe(x):
        """A safer version of something dangerous."""
        try:
            return dangerous_function(x)
        except (ArithmeticError, ValueError):
            return math.nan

    return something_safe

[10]:

safe_sqrt = secure_function(math.sqrt)
help(safe_sqrt)

Help on function sqrt in module math:

sqrt(...)
    sqrt(x)

    Return the square root of x.
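
Besides reading the help text, we can inspect the attributes that functools.wraps sets. The sketch below repeats the wrapper so that it runs on its own and then looks at `__name__`, `__doc__` and `__wrapped__`. The last of these is a reference to the original function that functools.wraps stores on the wrapper.

```python
import functools
import math


def secure_function(dangerous_function):
    """Create a function that returns nan instead of raising."""
    @functools.wraps(dangerous_function)
    def something_safe(x):
        try:
            return dangerous_function(x)
        except (ArithmeticError, ValueError):
            return math.nan

    return something_safe


safe_sqrt = secure_function(math.sqrt)

# Name and docstring are copied from the wrapped function.
print(safe_sqrt.__name__)
print(safe_sqrt.__doc__ == math.sqrt.__doc__)

# The original function remains reachable through __wrapped__.
print(safe_sqrt.__wrapped__ is math.sqrt)
```

The wrapper still behaves like something_safe: only its metadata changes, not its error handling.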

Now it is very easy to also rewrite our function computing the reciprocals safely:

[11]:

something_safe = secure_function(something_dangerous)
[safe_sqrt(something_safe(item)) for item in input_list]

computing reciprocal of 2
computing reciprocal of 1
computing reciprocal of 0
computing reciprocal of -1

[11]:

[0.7071067811865476, 1.0, nan, nan]
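
The two nan values in this result have different causes: one starts life as a failed division, the other as a failed square root. Once they are in the list, though, nothing distinguishes them. The following self-contained sketch illustrates this. The helper nan_on_error is a hypothetical stand-in for secure_function, introduced only so the example runs on its own.

```python
import math


def nan_on_error(f):
    """Hypothetical stand-in for secure_function: return nan on failure."""
    def wrapper(x):
        try:
            return f(x)
        except (ArithmeticError, ValueError):
            return math.nan
    return wrapper


safe_reciprocal = nan_on_error(lambda x: 1 / x)
safe_sqrt = nan_on_error(math.sqrt)

results = [safe_sqrt(safe_reciprocal(x)) for x in [2, 1, 0, -1]]

# math.isnan tells us *that* something failed ...
is_nan = [math.isnan(r) for r in results]
print(is_nan)

# ... but not *what* failed; nan does not even compare equal to itself.
print(results[2] == results[3])
```

So nan can flag a failure, but it cannot carry the exception or the name of the function that raised it.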

There is a second problem with this approach, which is a bit more subtle. How do we know where the error occurred? We got two values of nan and are desperate to find out what went wrong. We’ll need a little class to capture all aspects of failure.

[12]:

class Fail:
    """Keep track of failures."""
    def __init__(self, exception, trace):
        self.exception = exception
        self.trace = trace

    def extend_trace(self, f):
        """Grow a stack trace."""
        self.trace.append(f)
        return self

    def __str__(self):
        return "Fail in " + " -> ".join(
            f.__name__ for f in reversed(self.trace)) \
            + ":\n\t" + type(self.exception).__name__ \
            + ": " + str(self.exception)
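
To get a feeling for this class, we can build a Fail by hand. The sketch below repeats the class so that it is self-contained, catches a ValueError from math.sqrt, and then extends the trace with the built-in abs. The choice of abs is arbitrary and only serves to show how the trace is printed.

```python
import math


class Fail:
    """Keep track of failures."""
    def __init__(self, exception, trace):
        self.exception = exception
        self.trace = trace

    def extend_trace(self, f):
        """Grow a stack trace."""
        self.trace.append(f)
        return self

    def __str__(self):
        return "Fail in " + " -> ".join(
            f.__name__ for f in reversed(self.trace)) \
            + ":\n\t" + type(self.exception).__name__ \
            + ": " + str(self.exception)


try:
    math.sqrt(-1)
except ValueError as error:
    # Record the exception together with the function that raised it.
    failure = Fail(error, [math.sqrt])

# Pretend the failed value was passed on to another function.
failure.extend_trace(abs)
print(failure)
```

The most recently added function is printed first, so the trace reads from the outermost call down to the function that originally failed.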

We will adapt our earlier design for secure_function. If the given argument is a Fail, we don’t even attempt to run the next function. Instead, we extend the trace of the failure, so that we can see what happened later on.

[13]:

def secure_function(dangerous_function):
    """Create a function that returns a Fail instead of raising."""
    @functools.wraps(dangerous_function)
    def something_safe(x):
        """A safer version of something dangerous."""
        if isinstance(x, Fail):
            return x.extend_trace(dangerous_function)
        try:
            return dangerous_function(x)
        except Exception as error:
            return Fail(error, [dangerous_function])

    return something_safe
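
So far we applied secure_function by calling it explicitly. Python's decorator syntax does the same thing in one step: writing `@decorator` above a definition is equivalent to defining the function and then rebinding its name to `decorator(function)`. A minimal illustration, using a hypothetical decorator named twice that is unrelated to error handling:

```python
import functools


def twice(f):
    """Hypothetical decorator: apply f two times in a row."""
    @functools.wraps(f)
    def wrapper(x):
        return f(f(x))
    return wrapper


# Decorator syntax ...
@twice
def increment(x):
    return x + 1


# ... is shorthand for wrapping the function by hand.
def plain_increment(x):
    return x + 1

explicit = twice(plain_increment)

print(increment(0), explicit(0))
```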

Now we can rewrite our little program entirely from scratch:

[14]:

@secure_function
def reciprocal(x):
    return 1 / x

@secure_function
def square_root(x):
    return math.sqrt(x)

reciprocals = map(reciprocal, input_list)
square_roots = map(square_root, reciprocals)

for x, result in zip(input_list, square_roots):
    print("sqrt( 1 /", x, ") =", result)

sqrt( 1 / 2 ) = 0.7071067811865476
sqrt( 1 / 1 ) = 1.0
sqrt( 1 / 0 ) = Fail in square_root -> reciprocal:
        ZeroDivisionError: division by zero
sqrt( 1 / -1 ) = Fail in square_root:
        ValueError: math domain error

See how we retain a trace of the functions that were involved in creating the failed state, even though the execution that produced those values is entirely decoupled. This is exactly what we need to trace errors in Noodles.
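
One aspect of this decoupling is easy to miss: in Python 3, map is lazy, so neither stage runs until the for loop asks for a result. The sketch below makes the order of evaluation visible. The helper record is hypothetical and only logs each call.

```python
calls = []


def record(label):
    """Hypothetical helper: build a function that logs its calls."""
    def f(x):
        calls.append((label, x))
        return x
    return f


first = map(record("first"), [1, 2])
second = map(record("second"), first)

# Nothing has been computed yet: map only builds iterators.
calls_before = list(calls)

# Consuming the last iterator pulls items through both stages, one by one.
result = list(second)
print(calls)
```

Each item passes through both stages before the next item is touched, so a Fail produced in the first stage reaches the second stage as an ordinary value.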

Handling errors in Noodles

Noodles has the functionality of secure_function built in under the name maybe. The following code implements the above example in terms of noodles.maybe:

[1]:

import noodles
import math
from noodles.tutorial import display_workflows

@noodles.maybe
def reciprocal(x):
    return 1 / x

@noodles.maybe
def square_root(x):
    return math.sqrt(x)

results = [square_root(reciprocal(x)) for x in [2, 1, 0, -1]]
for result in results:
    print(str(result))

0.7071067811865476
1.0
Fail: __main__.square_root (<ipython-input-1-74755080cfd2>:9)
* failed arguments:
    __main__.square_root `0` Fail: __main__.reciprocal (<ipython-input-1-74755080cfd2>:5)
    * ZeroDivisionError: division by zero
Fail: __main__.square_root (<ipython-input-1-74755080cfd2>:9)
* ValueError: math domain error
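
The report lists failed arguments, which hints that a failure in any argument is passed along. The following Noodles-independent sketch shows how the single-argument secure_function from the previous section might be extended to several arguments. It is an illustration under our own assumptions, not how noodles.maybe is implemented: the name secure_function_n is hypothetical, the Fail class is a shortened copy, and only the first failed argument is passed on.

```python
import functools


class Fail:
    """Shortened copy of the Fail class from the previous section."""
    def __init__(self, exception, trace):
        self.exception = exception
        self.trace = trace

    def __str__(self):
        return "Fail in " + " -> ".join(
            f.__name__ for f in reversed(self.trace)) \
            + ": " + type(self.exception).__name__ \
            + ": " + str(self.exception)


def secure_function_n(dangerous_function):
    """Hypothetical variant of secure_function for several arguments."""
    @functools.wraps(dangerous_function)
    def something_safe(*args, **kwargs):
        # If any argument already failed, pass the first failure along.
        for arg in list(args) + list(kwargs.values()):
            if isinstance(arg, Fail):
                arg.trace.append(dangerous_function)
                return arg
        try:
            return dangerous_function(*args, **kwargs)
        except Exception as error:
            return Fail(error, [dangerous_function])

    return something_safe


@secure_function_n
def reciprocal(x):
    return 1 / x


@secure_function_n
def add(a, b):
    return a + b


result = add(1, reciprocal(0))
print(result)
```

Unlike the Noodles report, this sketch forgets any further failed arguments after the first. Collecting all of them would need a richer Fail class.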

The maybe decorator works well together with schedule. The following workflow is full of errors!

[16]:

@noodles.schedule
@noodles.maybe
def add(a, b):
    return a + b

workflow = add(noodles.schedule(reciprocal)(0),
               noodles.schedule(square_root)(-1))
display_workflows(arithmetic=workflow, prefix='errors')

[Figure: workflow graph "arithmetic"]

Both the reciprocal and the square root functions will fail. Noodles is smart enough to report on both errors.

[17]:

result = noodles.run_single(workflow)
print(result)

Fail: __main__.add (<ipython-input-16-ca83c3781f78>:1)
* failed arguments:
    __main__.add `0` Fail: __main__.reciprocal (<ipython-input-15-74755080cfd2>:5)
    * ZeroDivisionError: division by zero
    __main__.add `1` Fail: __main__.square_root (<ipython-input-15-74755080cfd2>:9)
    * ValueError: math domain error

Example: parallel stat

Let’s do an example that works with external processes. The UNIX command stat gives the status of a file.

[18]:

!stat -t -c '%A %10s %n' /dev/null

crw-rw-rw-          0 /dev/null

If a file does not exist, stat returns an exit code of 1.

[19]:

!stat -t -c '%A %10s %n' does-not-exist

stat: cannot stat 'does-not-exist': No such file or directory

We can wrap the execution of the stat command in a helper function.

[20]:

from subprocess import run, PIPE, CalledProcessError

@noodles.schedule
@noodles.maybe
def stat_file(filename):
    p = run(['stat', '-t', '-c', '%A %10s %n', filename],
            check=True, stdout=PIPE, stderr=PIPE)
    return p.stdout.decode().strip()

The run function runs the given command and returns a CompletedProcess object. The check=True argument enables checking of the exit code of the child process: if it is anything other than 0, a CalledProcessError is raised. Because we decorated our function with noodles.maybe, such an error will be caught and a Fail object will be returned.
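
To see what check=True does without involving Noodles or the stat command, we can run a child process that is guaranteed to fail. In this sketch the child is the Python interpreter itself (sys.executable), which writes a message to stderr and exits with code 1. The message text is made up for the example.

```python
import sys
from subprocess import run, PIPE, CalledProcessError

# A child process that writes to stderr and exits with a non-zero code.
command = [sys.executable, "-c",
           "import sys; sys.stderr.write('no such thing'); sys.exit(1)"]

try:
    run(command, check=True, stdout=PIPE, stderr=PIPE)
except CalledProcessError as error:
    # The exception carries the exit code and the captured output streams.
    returncode = error.returncode
    message = error.stderr.decode().strip()
    print("exit code:", returncode)
    print("stderr:", message)
```

Because stderr=PIPE was given, the captured error output is available on the exception as bytes, which is why it is decoded before printing.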

[21]:

files = ['/dev/null', 'does-not-exist', '/home', '/usr/bin/python3']
workflow = noodles.gather_all(stat_file(f) for f in files)
display_workflows(stat=workflow, prefix='errors')

[Figure: workflow graph "stat"]

We can now run this workflow and print the output in a table.

[22]:

result = noodles.run_parallel(workflow, n_threads=4)

for file, stat in zip(files, result):
    print('stat {:18} -> {}'.format(
        file, stat if not noodles.failed(stat)
        else 'failed: ' + stat.exception.stderr.decode().strip()))

stat /dev/null          -> crw-rw-rw-          0 /dev/null
stat does-not-exist     -> failed: stat: cannot stat 'does-not-exist': No such file or directory
stat /home              -> drwxr-xr-x       4096 /home
stat /usr/bin/python3   -> lrwxrwxrwx          9 /usr/bin/python3