Reset Django Migrations

I guess we've all been there at some point: you started out with one (Django) application, new requirements came in, you added more models and then more applications, and somehow everything got out of control because the application hierarchy was never clearly defined.

So you end up with a bunch of migrations in different applications that depend on each other, and there seems to be no way out of this mess: the migrations still contain dependencies or seed data, so you cannot just reset them.

Worry no more - this article might be your rescue.

Prerequisites

Migration types

There are a couple of different types of migrations to consider here:

  • Database Model Migration
  • Scripts that add Permissions, Seed Data, ...
  • One time migrations (elidable)

We only care about the first two here, as the one-time scripts have already run and don't do anything on an empty database.

Database Model Migration

So the idea is to first create migrations for all models without any outgoing relationships, then create all migrations for models that only link to those models from the previous step, then create all migrations for models that only link to those models from the previous two steps and so on. In the code sample below I call them "levels".

This is not a one-size-fits-all solution, but it gives you a good overview of which migrations to create first, which ones second, and so on.

The first step is to generate a dot file that includes all inheritances and relations between models (the graph_models command comes from django-extensions):
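To make the idea of levels concrete, here is a minimal sketch on a hand-written dependency map. The model names are made up for illustration, and a plain dict stands in for the graph so the snippet runs without extra packages; it is not the code used for the real analysis.

```python
# Hypothetical models: each model maps to the models it has relations to.
relations = {
    "Category": {"Category"},            # self-reference (e.g. a parent category)
    "Supplier": set(),
    "Product": {"Category", "Supplier"},
    "Variant": {"Product"},
}


def compute_levels(deps):
    """Group models so that each level only points at models in earlier levels."""
    placed = set()
    levels = []
    while len(placed) < len(deps):
        level = {
            model
            for model, targets in deps.items()
            if model not in placed
            # self-references are ignored, everything else must already be placed
            and {t for t in targets if t != model} <= placed
        }
        if not level:
            # nothing new can be placed (e.g. circular relations): stop
            break
        levels.append(level)
        placed |= level
    return levels


levels = compute_levels(relations)
for i, level in enumerate(levels):
    print(f"LEVEL {i}: {sorted(level)}")
```

With this input, Category and Supplier land in level 0, Product in level 1, and Variant in level 2.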

python manage.py graph_models -a > relations.dot

Then we load that file into a graph representation so we can use networkx to analyze the incoming and outgoing relations.

import networkx as nx
dot = nx.drawing.nx_agraph.read_dot('relations.dot')

The problems with the dot representation are:

  • abstract models don't have migrations but still are connected with edges so we want to filter them out
  • for abstract models that have relations to other models those relations are not inherited so we need to do that manually

Disclaimer: I didn't spend too much time cleaning up this code after it was working. So I guess there are more elegant ways of writing parts of it.

abstract_nodes = []
for edge in dot.edges():
    data = dot.get_edge_data(*edge)
    if ' abstract\\ninheritance' in [i.get('label') for i in data.values()]:
        abstract_nodes.append(edge[1])

# move links from abstract to inheriting models
for node in set(abstract_nodes):
    edges = dot.in_edges(node)
    abstract_outgoing = dot.out_edges(node)
    for edge in edges: 
        data = dot.get_edge_data(*edge)
        if ' abstract\\ninheritance' in [i.get('label') for i in data.values()]:
            for outgoing in abstract_outgoing:
                dot.add_edge(edge[0], outgoing[1])

# remove abstract models
for node in set(abstract_nodes):
    dot.remove_node(node)

from pprint import pprint

def split_node(name):
    group, model = name.split('_models_')
    group = group.replace('_', '.')
    return group, model

def get_next_level(current_set):
    level = []
    for node in dot.nodes():
        outgoing = dot.out_edges(node)

        if {o[1] for o in outgoing if o[1] != str(node)}.issubset(current_set):
            if node not in current_set:
                level.append(node)

    return set(level)

all_entries = set()
for i in range(0, 10):
    print(f'LEVEL {i}')
    new_entries = get_next_level(all_entries)
    all_entries = all_entries.union(new_entries)
    pprint({split_node(n) for n in new_entries})

    # group the apps together as we can only run migrations on app level
    print(' '.join(list({
        split_node(n)[0].replace('apps.', '').replace('marketplace.', '') 
        for n in new_entries
    })))
    print()

This produces an output like:

LEVEL 0
{('apps.products', 'Category'),
 ('apps.products', 'Manufacturer'),
 ('apps.products', 'Supplier'),
 ('apps.products', 'Promotion'),
 ('apps.marketplace.orders', 'ConversionRate'),
 ('apps.marketplace.shipment', 'Courier'),
 ('django.contrib.contenttypes', 'ContentType'),
 ('django.contrib.sessions', 'Session'),
 ('django.contrib.sites', 'Site'),
 ('social.django', 'Association'),
 ('social.django', 'Code'),
 ('social.django', 'Nonce'),
 ('social.django', 'Partial')}
orders django.contrib.sessions django.contrib.contenttypes 
social.django mptt.graph shipment products django.contrib.sites

LEVEL 1
{('apps.products', 'Brand'),
 ('apps.products', 'Discount'),
 ('apps.user.organizations', 'Organization'),
 ('django.contrib.auth', 'Permission'),
 ('django.contrib.redirects', 'Redirect')}
django.contrib.redirects user.organizations products django.contrib.auth

LEVEL 2
{('apps.products', 'ProductImage'),
 ('apps.products', 'Product'),
 ('apps.products', 'Price'),
 ('apps.user.organizations', 'Office'),
 ('django.contrib.auth', 'Group')}
user.organizations products django.contrib.auth

LEVEL 3
{('apps.products', 'Sheet'),
 ('apps.products', 'Variant'),
 ('apps.user', 'User')}
user products

[...]

Some nodes never show up in any level, e.g. because of weird foreign keys. You can list them with:

print(set(dot.nodes()) - all_entries)
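One plausible reason for leftover nodes (an assumption, not something verified for this project) is a circular relation between two or more models: neither can be placed before the other, so neither ever gets a level. The sketch below uses the standard library's graphlib on a made-up dependency map to surface one such cycle.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical relations: each model maps to the models it points at.
relations = {
    "User": {"Organization"},
    "Organization": {"User"},   # circular: User <-> Organization
    "Invoice": {"User"},
    "Category": {"Category"},   # self-reference, ignored below
}


def find_cycle(deps):
    """Return one cycle as a list of models, or None if there is none."""
    # Drop self-references first; the level code ignores them as well.
    cleaned = {m: {t for t in targets if t != m} for m, targets in deps.items()}
    try:
        TopologicalSorter(cleaned).prepare()
    except CycleError as exc:
        # The second argument holds the cycle; first and last entry are the same node.
        return exc.args[1]
    return None


cycle = find_cycle(relations)
print(cycle)
```

For the real graph, a dict of this shape could probably be built with something like {n: {t for _, t in dot.out_edges(n)} for n in dot.nodes()}. Models that merely depend on a cycle member (Invoice here) stay unplaced too, as does anything deeper than the ten levels the loop iterates over.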

What we need to do now is start at the first level (LEVEL 0 in the output above) and pick the groups that make sense (e.g. skip Django's internal models):

python manage.py makemigrations orders shipment products

As those apps might also contain models that belong to other levels, we need to go through the generated migrations and check their dependencies for entries like ('organizations', '__first__').
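Checking every generated file by hand gets tedious, so a small helper can do the search. This is only a sketch: the function names are made up, the regular expression assumes the dependency tuple is written on one line with string literals, and the directory layout (app/migrations/*.py) is an assumption about the project.

```python
import re
from pathlib import Path

# Matches dependency tuples such as ('organizations', '__first__').
FIRST_DEP = re.compile(r"""\(\s*['"](\w+)['"]\s*,\s*['"]__first__['"]\s*\)""")


def find_first_dependencies(source):
    """Return the app labels referenced via '__first__' in migration source code."""
    return sorted(set(FIRST_DEP.findall(source)))


def scan_project(root="."):
    """Map each migration file under root to the '__first__' app labels it contains."""
    hits = {}
    for path in Path(root).glob("**/migrations/*.py"):
        apps = find_first_dependencies(path.read_text(encoding="utf-8"))
        if apps:
            hits[str(path)] = apps
    return hits


# A shortened, made-up migration body to show what gets matched.
sample = """
class Migration(migrations.Migration):
    dependencies = [
        ('organizations', '__first__'),
        ('products', '0001_initial'),
    ]
"""
print(find_first_dependencies(sample))
```

Running scan_project() from the project root after each makemigrations call lists the files that still need manual cleanup.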

If we find a line like that, it means the migration references a model that has not been created yet. So we need to search for organizations in that migration file, remove all foreign keys to it, and then remove ('organizations', '__first__') from the dependencies.

The removed foreign keys will be picked up again by makemigrations in a later level and end up in separate migrations.

After creating the migrations for the first level, we repeat this for each following level.

Scripts that add Permissions, Seed Data, ...

The second part is about keeping the migrations that create seed data.

The easiest way to detect those is to search all files in migrations folders for RunPython or RunSQL. After identifying them, remove the ones that were one-time migrations (i.e. that change existing data in the database), as they have nothing to do on an empty database, e.g. when running tests.

For all the remaining ones you can collect the functions, grouped by app, and create seed migration files for them.
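The search itself can be done with grep, or with a few lines of Python as sketched here. The function name and the assumed layout (any migrations directory below the project root) are illustrative, and a plain substring match will also flag files that only mention RunPython in a comment.

```python
from pathlib import Path

MARKERS = ("RunPython", "RunSQL")


def find_data_migrations(root="."):
    """Return {path: [markers]} for migration files that mention RunPython or RunSQL."""
    found = {}
    for path in sorted(Path(root).glob("**/migrations/*.py")):
        text = path.read_text(encoding="utf-8")
        used = [marker for marker in MARKERS if marker in text]
        if used:
            found[str(path)] = used
    return found
```

Calling find_data_migrations() from the project root gives a list of candidates that still has to be reviewed by hand to separate seed data from one-time fixes.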

The End

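At the end, on a database that already has the current schema, you can mark all the new migrations as applied with python manage.py migrate --fake, which records them without actually running them.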