harre.dev

Testing Python code that runs during import

I've become pretty comfortable with Python since switching from .NET Core to Python for a new client.

During this time I've gotten to know my way around the ecosystem and its Python-isms, but sometimes I run into something that makes me go "huh? oh...". This post describes one such moment.

Modules

In Python, modules can use each other's functionality by importing one another with the import keyword, followed by the name of the module you want to use, e.g.:

import os

Let's dive in. I was setting up a dependency container to leverage dependency injection in Python, to make the composition of the final app more flexible and to keep all the classes in the modules testable through inversion of control.

Here's the container (simplified for the sake of this post):

import os
from dependency_injector import containers, providers

from dep import Dependency

class Container(containers.DeclarativeContainer):
    config = providers.Configuration()
    config.client_id.from_env("CLIENT_ID", required=True)  
    config.working_directory.from_env("WORKING_DIRECTORY", default=os.getcwd())  

    some_dependency = providers.Singleton(Dependency, config.working_directory, config.client_id)

Instantiation of this container happens as soon as the app is started to get access to all the registered dependencies. It also holds some configuration that is read from the environment (see 12-factor apps configuration).

I generally skip testing this kind of code since it's all glue between the various dependencies, but as I put in the required=True on the client_id I thought: I might want to capture this behavior somewhere so that I'll know when I mess with something in the future. So, in come the tests. Maybe not test-first this time, but I still wanted to put some guardrails around this type of initialization code.

I wrote the tests one by one and all was fine, until I ran the tests in one single session: only the first one passed! Wat?

The tests

Here's the final result with the "Today I Learned" moment explained:

import os  
import sys  
import unittest  
from unittest import mock  
  
  
class ContainerTest(unittest.TestCase):  
    def setUp(self):  
        # The container initialization happens at import time, so we need to wipe any
        # existing import of application.container. The container calls from_env during
        # import, which we work around by importing the container again for each test.
        if "application.container" in sys.modules:  
            del sys.modules["application.container"]  
  
    @mock.patch.dict(os.environ, {"CLIENT_ID": "test-client", "WORKING_DIRECTORY": "/some/root/path"}, clear=True)  
    def test_it_should_initialize_container(self):  
        from application.container import Container  
        container = Container()  
  
        self.assertEqual("test-client", container.config.client_id())
        self.assertEqual("/some/root/path", container.config.working_directory())  
  
    @mock.patch.dict(os.environ, {"WORKING_DIRECTORY": "/some/root/path"}, clear=True)  
    def test_it_should_not_initialize_container_without_client_id_variable(self):
        with self.assertRaises(ValueError) as ve:  
            from application.container import Container  
            _ = Container()  
  
        exception = ve.exception  
        self.assertEqual('Environment variable "CLIENT_ID" is undefined', str(exception))  
  
    @mock.patch.dict(os.environ, {"CLIENT_ID": "test-client"}, clear=True)  
    def test_it_should_initialize_container_with_default_working_directory(self):
        from application.container import Container  
        container = Container()  
        self.assertEqual(os.getcwd(), container.config.working_directory())

The clue is in the comments already, but here are the full details. When I started writing my tests I imported the container at the top of the file:

from application.container import Container

This made each individual test pass in isolation but not all together in a single run... What's the deal? I mistakenly assumed that those declarations inside the container's class would happen at the moment the container was constructed (container = Container()) but this is not the case.

In my assumption I mixed up instance attributes (set in __init__, each time an instance is created) and class attributes (set by the statements in the class body, which run once when the class is defined, so when the container code is imported!). In the case of the container that makes sense: we want those dependencies to be constant.
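To make the difference concrete, here is a minimal sketch that is independent of the container. The names record, Example and events are made up for this illustration; the point is only the order in which things run.

```python
events = []


def record(label):
    """Append a label to the shared log and return it, so call order is visible."""
    events.append(label)
    return label


class Example:
    # Class body: runs once, while the class statement itself is executed.
    # For a module-level class, that is when the module is imported.
    class_level = record("class body")

    def __init__(self):
        # Runs on every instantiation.
        self.instance_level = record("__init__")


events.append("class defined")
Example()
Example()

print(events)
# ['class body', 'class defined', '__init__', '__init__']
```

The class body shows up in the log before a single instance exists, which is the same thing that happens to the from_env calls in the container.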

This also explains why the tests kept failing when running in a single session. The import only took place once! As soon as the container code was imported, its state (those environment variables and other dependencies) was locked in and would not change from one test to the next. The other tests after it would fail due to the container's unexpected state.
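The caching can be reproduced with a small stand-in module. This is only a sketch: fake_settings is a hypothetical module written to a temporary directory, not the real container, and it reads CLIENT_ID at import time just like the container does.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path
from unittest import mock

# Hypothetical stand-in for the container: reads the environment at import time.
MODULE_SOURCE = 'import os\nCLIENT_ID = os.environ.get("CLIENT_ID")\n'

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "fake_settings.py").write_text(MODULE_SOURCE)
    importlib.invalidate_caches()
    sys.path.insert(0, tmp)
    try:
        with mock.patch.dict(os.environ, {"CLIENT_ID": "first"}):
            first = importlib.import_module("fake_settings")
        with mock.patch.dict(os.environ, {"CLIENT_ID": "second"}):
            # Already in sys.modules, so the module body does not run again.
            second = importlib.import_module("fake_settings")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("fake_settings", None)

print(first is second)   # True
print(second.CLIENT_ID)  # first
```

The second import returns the exact same module object, still holding the value from the first environment.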

The solution

Ok, then how do you get around this? Well, I learned that you can remove modules that have already been imported and then... import them again! Precisely what I need. Python keeps track of all the modules that have been loaded in a special dictionary, sys.modules. Turns out you can also remove modules from this dictionary. The documentation does come with a word of caution about its use though:

sys.modules:

This is a dictionary that maps module names to modules which have already been loaded. This can be manipulated to force reloading of modules and other tricks. However, replacing the dictionary will not necessarily work as expected and deleting essential items from the dictionary may cause Python to fail. If you want to iterate over this global dictionary always use sys.modules.copy() or tuple(sys.modules) to avoid exceptions as its size may change during iteration as a side effect of code or activity in other threads.

Not really something you would use in your everyday Python code I think, but these are tests, so we are in a well-known state for each of the test modules and we should be fine.
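If a test ever needs to drop a whole package instead of a single module, the iteration advice from the quote applies. Here is a rough sketch, assuming a hypothetical unload helper and hand-made placeholder modules (fake_app and friends) so that nothing real gets removed:

```python
import sys
import types


def unload(package):
    """Remove a package and all of its submodules from sys.modules."""
    # tuple(...) takes a snapshot of the keys, so deleting entries
    # cannot break the loop.
    for name in tuple(sys.modules):
        if name == package or name.startswith(package + "."):
            del sys.modules[name]


# Hypothetical placeholder modules, registered by hand for the demonstration.
names = ("fake_app", "fake_app.container", "fake_app_other")
for name in names:
    sys.modules[name] = types.ModuleType(name)

unload("fake_app")
remaining = [name for name in names if name in sys.modules]
print(remaining)  # ['fake_app_other']

# Clean up the one placeholder that was deliberately left alone.
del sys.modules["fake_app_other"]
```

Matching on the package name plus a dot keeps unrelated modules that merely share a prefix in place.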

To unload a module you can delete it from the dictionary using the module's name:

del sys.modules["application.container"]

Since we are running more than one test, we need to unload the container module before each one. This can be achieved using Python unittest's setUp helper method. This method runs before each test and deletes the application.container module if it's present. Each test preps its own conditions using mocks and can then decide at what point the container code should get imported, and we can successfully assert the behavior we expect from it.
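The effect of unloading before each import can be tried outside of a test class as well. The sketch below uses a hypothetical fresh_import helper and a stand-in module called fake_container that lives in a temporary directory; neither is part of the real application.

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path
from unittest import mock

# Hypothetical stand-in for the container: reads the environment at import time.
MODULE_SOURCE = 'import os\nCLIENT_ID = os.environ.get("CLIENT_ID")\n'


def fresh_import(name):
    """Drop any cached copy of the module, then import it again from scratch."""
    sys.modules.pop(name, None)
    return importlib.import_module(name)


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "fake_container.py").write_text(MODULE_SOURCE)
    importlib.invalidate_caches()
    sys.path.insert(0, tmp)
    try:
        with mock.patch.dict(os.environ, {"CLIENT_ID": "first"}):
            first = fresh_import("fake_container")
        with mock.patch.dict(os.environ, {"CLIENT_ID": "second"}):
            second = fresh_import("fake_container")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("fake_container", None)

print(first.CLIENT_ID, second.CLIENT_ID)  # first second
print(first is second)                    # False
```

Using pop with a default does the same job as the "if ... in sys.modules" check in setUp: it removes the entry when it is there and does nothing otherwise.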

While sys.modules does come with a little disclaimer about unpredictable behavior, code that runs during import (instead of instantiation) can now be tested without running into unpredictable state issues in your own code.

Happy coding!