prisma migrate
is mainly used for schema migrations, and it's great for doing that. But sometimes you also need or want to migrate your data along with your schema. Turns out prisma migrate can do that as well!
I'm using RedwoodJS, and Redwood has its own data migration solution. But I wanted to do it all with prisma for various reasons. You should decide for your particular scenario what works best for you.
Prisma migrations are just standard .sql files, so if you can express your data migration as one or more sql statements you're good to go.
As I said, I'm using RedwoodJS, so first I edit my api/db/schema.prisma
file to do my schema migration. In this case I have a "Products" model with a soft/implicit self-reference. I wanted to make that a proper Prisma one-to-many relation. The self-reference is used to model product variations. Like you have a main "Product" that's a shirt. Then you have variants for the different colors of the shirt. So you could have three variants for "red", "green" and "blue" shirts.
This is the diff for my schema.prisma
-file
diff --git a/api/db/schema.prisma b/api/db/schema.prisma
index d29fc77..40ffc59 100644
--- a/api/db/schema.prisma
+++ b/api/db/schema.prisma
@@ -45,6 +45,9 @@ model Product {
stock Int @default(0)
images String?
parent_v_p_id Int?
+ parentId String? @db.Uuid
+ parentProduct Product? @relation("ParentProduct", fields: [parentId], references: [id])
+ variants Product[] @relation("ParentProduct")
ean String?
attrib1name String?
attrib1val String?
Next up I run yarn rw prisma migrate dev --create-only
. The key here is the --create-only
flag. It creates a .sql file with the migration instruction in it, but doesn't actually apply it. This gives us a chance to modify the SQL statements the migration will run.
This is what the generated file looks like
-- AlterTable
ALTER TABLE "Product" ADD COLUMN "parentId" UUID;
-- AddForeignKey
ALTER TABLE "Product" ADD FOREIGN KEY ("parentId") REFERENCES "Product"("id") ON DELETE SET NULL ON UPDATE CASCADE;
As you can see the Prisma relation is made up of three fields, but only one is actually created in the database. The parentId
field in this case. I'm using the database-native datatype UUID
here to match with the data type of my id
field.
I mentioned I had a soft or implicit self-reference for product variants. Half of it is the parent_v_p_id
field you can see in the diff output above. The other half is a variable_product_id
field. I call products that have different variants for "variable" products. (And products without variants for "simple" products.) Now however I decided to just use the id
that all products have as the relationship reference. No need for a specific field when I already have a perfectly good id to use.
Now that we know what we want to do, let's write the SQL for it.
I'm not great at SQL, but I do know the basics, so after a few minutes on Google I came up with this statement*
UPDATE "Product"
SET "parentId" = p.id
FROM (
SELECT id,
variable_product_id AS v_p_id
FROM "Product"
) AS p
WHERE parent_v_p_id IS NOT NULL
AND p.v_p_id = parent_v_p_id;
(* Not really true. I came up with something similar, but had to do some additional tweaks to arrive at what you see above)
Modifying the DB, and especially editing the data in it, is scary. So I wanted a way to test my SQL before making it part of the migration. This is where I turned to my db admin tool of choice, DBeaver. I'm sure TablePlus or something like that would work great as well. Just need a way to run custom SQL statements. The trick I used to be able to test my code was to wrap it in a transaction that I always roll back at the end. That way I can iterate on the sql without doing any permanent changes to the database.
BEGIN;
-- [migration statements]...
ROLLBACK;
I did have to make a few tweaks to my original SQL. When the entire transaction completed without errors I was ready to try it on my test database. So I just copied the statements between the BEGIN;
and ROLLBACK;
lines and replaced all the content of my migration .sql file with the copied code.
-- 20211128123650_parentproduct_relation/migration.sql
-- AlterTable
ALTER TABLE "Product"
ADD COLUMN "parentId" UUID;
-- AddForeignKey
ALTER TABLE "Product"
ADD FOREIGN KEY ("parentId")
REFERENCES "Product"("id")
ON DELETE SET NULL
ON UPDATE CASCADE;
UPDATE "Product"
SET "parentId" = p.id
FROM (
SELECT id,
variable_product_id AS v_p_id
FROM "Product"
) AS p
WHERE parent_v_p_id IS NOT NULL
AND p.v_p_id = parent_v_p_id;
One final precaution to take (depending on how much you care about your test database) is to dump your database to a file. I usually do that with this command
pg_dump \
--username=username \
--password \
--format=plain \
--inserts \
--no-owner \
--no-privileges \
db_name > db_name_dump_$(date +"%Y%m%dT%H%M%S").sql
It results in a big file that's slow to restore. But it's also a file that's easy read (and understand) and that will work across different versions of PostgreSQL and maybe even between different DB engines. I think it's worth the tradeoffs for when working with my test db. For backing up your prod db you might want to use something else. But now we're off on a tangent. Let's get back to data migrations...
With the migration script updated it's time to apply it. Running yarn rw prisma migrate dev
again, but without any extra flags this time, will apply the migration to your test database.
After running that it's obviously a good idea to verify that everything looks alright in the database. Again DBeaver (or TablePlus etc) comes in handy. Finally you should spin up your app to make sure it runs as expected. Hopefully the db survived the migration and your data (and schema) is properly migrated.
That's it ๐ Thanks for reading!
Cover photo by Andrey Novik on Unsplash