This repository was archived by the owner on Aug 1, 2025. It is now read-only.

Commit 4b874fd (parent: 0ea94e9)

Add documentation on how to migrate the test server db to prod

29 files changed: +262 −0

Lines changed: 186 additions & 0 deletions
# Migrating Test Server Data to Production Server Data

Apart from optimizations for specific use cases, there's one major difference between the
BrAPI Java test server and the production server:

**All id columns use the UUID data type instead of TEXT in the production database.**

This change was made out of a need for:

1. Standardization of good data practices
2. Performance optimization

The performance difference between TEXT and UUID columns might not be felt in use cases
with small amounts of data, but in large batch operations this optimization can roughly double query speed.

This database schema change doesn't only affect the DB of the BrAPI Production Server; the codebase has also been modified to
accommodate this data standardization.

This document will help you prepare for a data migration to using UUID as the standard ID column type.
## Do you really need to migrate test server data?

Here at BrAPI we hope that you used the test server only for non-production data. Since the introduction of the
BrAPI Java Production Server, we hope the test server is used only for testing your application before you go live.

If that's the case, you don't need to do anything to proceed. Simply build the application on an empty DB and the schema
should be generated with UUID as the ID column type.

However, since the production server was only recently introduced, we realize this is likely not the case.

If you have been using the test server with production data, there are several steps you will need to take to swap over
to the production data model.

This document covers these steps.
## Step 1: Undo Dummy Migration Data

In the BrAPI Java Test Server, migration scripts are kicked off the first time
you build the app to insert some data you can look at and query to understand how the data model works.

Unfortunately, a lot of this dummy data uses non-UUID identifiers.

As such, you will need to remove this dummy data from your database in order to proceed with the migration.

To do this, find the `undo_dummy_data` folder and, one by one, copy and paste all the undo migration scripts in order
into the SQL execution tool of your choice to remove all of this data from the database.
## Step 2: Validate id columns

After removing the dummy data, you will next want to check whether non-UUID data still exists in other id columns.

This should only be possible via other migration scripts your application has applied that included invalid UUID data, as
the BrAPI test server does create UUIDs by default.

To do this validation, create the stored procedure we have written in your DB instance, then run it to verify.

This script is found in the `validate_id_columns.sql` file provided in this directory.

There are two notable id columns that this script does not validate:

* `external_reference.external_reference_id`, which is in fact not a UUID column as defined by the production server spec. This is because this ID is supposed to be flexible to whatever id the client sends.
* `table.auth_user_id`. This column is a known issue and will be resolved in a later step: by default, the test server inserts `anonymousUser` as the `auth_user_id` when one isn't sent in the request. More on this in Step 4.

Once you have run the validation script, it will report any tables and their associated id columns that contain invalid UUIDs.

If such columns were found, you can run `retrieve_table_data_with_invalid_id_cols.sql` for any of the columns that have invalid data
to grab the bad data. If the data doesn't fall under any of the steps outlined in this document, you will have to resolve it on your own, or you can reach out to someone on the BrAPI team.
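The same 8-4-4-4-12 hex UUID pattern the SQL checks rely on can also be applied outside the database. As a rough sketch, if you export a suspect id column to a text file (the `ids.txt` file and its contents here are hypothetical stand-ins, one value per line), `grep -Ev` prints any values that don't match:

```shell
# Stand-in data: one valid UUID and one offending value (hypothetical).
printf 'AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA\nanonymousUser\n' > ids.txt
# Print every value that does NOT match the 8-4-4-4-12 hex UUID shape.
grep -Ev '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$' ids.txt
# prints: anonymousUser
```

The pattern is the same one used with `!~` in the provided SQL scripts, so the two checks should agree.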
It's likely that once offending data is found, it is referenced as a foreign key in other tables.

If that is the case, and the data (and its associated references) can't simply be removed, you will have to go through the process of inserting a new row (with a correctly generated UUID) and reassigning the foreign keys pointing to the old id
to the new one. An example has been done for you in `migrate_crops.sql`. There may be a way to iterate through all the IDs, but hopefully
the amount of data you have is small enough to do it one by one.
## Step 3: Dump the database

At this point, after you have validated that the schema is rid of non-UUID id data (save `auth_user_id`, more on that soon),
you are ready to do a pg_dump.

You can accomplish this with the following command:

`pg_dump -U db_username -d db_name --data-only > dump.sql`

where `db_username` is the username you log in to your database with, and `db_name` is the name of the database you want to dump.

This command will grab only the data associated with each table; it will not copy the schema. It will place the results in a `dump.sql`
file in the directory you ran the command from.
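As a quick sanity check (a sketch, not an official step), each table exported with `--data-only` appears as one `COPY` block, so counting those should roughly match your table count. The inline sample below stands in for a real `dump.sql`:

```shell
# Stand-in for a real data-only dump: two COPY blocks for two tables.
printf 'COPY public.crop (id) FROM stdin;\nCOPY public.program (id) FROM stdin;\n' > dump_sample.sql
# Count the COPY statements, one per exported table.
grep -c '^COPY ' dump_sample.sql
# prints: 2
```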
If your database lives in a Docker container, the command will look something like this:

`docker exec db_container_name pg_dump -U db_username -d db_name --data-only > dump.sql`

To play it totally safe, let's also grab a copy of the data and the database schema together in case something goes awry
in the next steps.

To do this, create another dump using:

`pg_dump -U db_username -d db_name > dump_with_schema.sql`

Or with Docker:

`docker exec db_container_name pg_dump -U db_username -d db_name > dump_with_schema.sql`

In the event that you somehow lose the original database, you can restore it by creating the database and simply loading the
`dump_with_schema.sql` file into it. (More on that later.)
## Step 4: Modify the dumpfile

**NOTE: We are talking about the `dump.sql` file without the schema here. The schema file we created in step 3 should remain unchanged.**

As stated previously, we've largely been ignoring the `anonymousUser`-filled `auth_user_id` columns. These do in fact need
to become UUIDs, and for that we need to line them up with the new expected default UUID for that column: `'AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA'`.

Iterating through every table and modifying this column would take too long, so instead run a find and replace on the exported dump file
in your text editor of choice. If you have a large amount of data, this operation might be too much for that editor, in which case you
would likely have to update these columns in the database before exporting.
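A stream editor can do the same replacement without loading the whole file into an editor's memory. This is a sketch (the inline sample stands in for the real `dump.sql`; the two strings are the ones named above):

```shell
# Stand-in for the real dump file, containing the placeholder value twice.
printf 'x\tanonymousUser\ny\tanonymousUser\n' > dump.sql
# Replace every occurrence with the expected default UUID, writing a new
# file so the original dump is left untouched.
sed 's/anonymousUser/AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA/g' dump.sql > dump_fixed.sql
grep -c 'AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA' dump_fixed.sql
# prints: 2
```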
## Step 5: Create and load the new database

Now that the dumpfile has been modified, we are ready to load the new database.

To do this, use `psql` to log in to the PostgreSQL server, wherever it is hosted.

If your database is hosted in a Docker container, this looks like:

`docker exec -it name_of_db_container psql -U db_username db_name`

Once in the `psql` CLI, create your new database. Call it something like:

`CREATE DATABASE db_name_uuid;`

If you want to keep your old database name, you can eventually rename this database after the old one is removed.

Next, find the `application.properties` file of the BrAPI server code and change `spring.datasource.url` so that it points to the new database we just created.

Now, build the server application as normal with `mvn clean install`

Then run the application as normal:

`java '-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=localhost:5006' -jar target/brapi-Java-TestServer*.jar`

This should trigger Flyway to create the database using the initial schema, which was modified for the production server to
generate the database with UUID for all `id` columns and their associated foreign keys.
You can verify this was successful by checking in the `psql` CLI whether any tables were created, with

`\dt`

and you can further check that the schema was created with UUID-typed `id` columns by picking a table and running the table description command, like

`\d program`

which should look something like:
```
                     Table "public.program"
       Column        |  Type   | Collation | Nullable | Default
---------------------+---------+-----------+----------+---------
 id                  | uuid    |           | not null |
 additional_info     | jsonb   |           |          |
 auth_user_id        | uuid    |           |          |
 abbreviation        | text    |           |          |
 documentationurl    | text    |           |          |
 funding_information | text    |           |          |
 name                | text    |           |          |
 objective           | text    |           |          |
 program_type        | integer |           |          |
 crop_id             | uuid    |           |          |
 lead_person_id      | uuid    |           |          |
```
Now that the database has been created, all that's left is to load the dump file into it.

To do this, give `psql` the file as input:

`psql -U db_username db_name_uuid < dump.sql`

For Docker, because the file isn't hosted in the container, you need to pipe it in using `cat`:

`cat dump.sql | docker exec -i name_of_db_container psql -U db_username db_name_uuid`

This should kick off COPY statements for every table. Ensure that there aren't errors.

If an error happens during any of the COPY statements, the entire table being copied will fail to load.

An error on the `flyway_schema_history` table is expected in most cases and is not a worry; you only want the new Flyway schema history
created by the migrations run the first time you ran the app.

If errors happened on other tables, you might have to do some sleuthing to figure out why and run this step again.
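To make that error check systematic, you can capture the load output to a file (e.g. by appending `2>&1 | tee load.log` to the `psql` command) and grep it afterwards. A sketch, with an inline stand-in for the real log:

```shell
# Stand-in for the captured load output (real runs come from psql via tee).
printf 'COPY 42\nERROR: duplicate key value\nCOPY 7\n' > load.log
# List any error lines with their line numbers for sleuthing.
grep -n 'ERROR' load.log
# prints: 2:ERROR: duplicate key value
```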
## Congrats!

With this, you should have successfully migrated the test server DB to the production DB.
Lines changed: 27 additions & 0 deletions
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

-- Create replacement row. Alternatively, pass in your own generated uuid so you don't have to fetch the created one for the next part.
INSERT INTO crop (id, auth_user_id, crop_name) values (uuid_generate_v4(), 'AAAAAAAA-AAAA-AAAA-AAAA-AAAAAAAAAAAA', 'your_crop_name');
-- Now, with the id of the crop you need to replace on hand, iterate through all the foreign key tables, replacing:
update genome_map set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update location set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update program set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update observation_variable set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update germplasm_attribute_definition set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update crop_external_references set crop_entity_id = 'new_uuid' where crop_entity_id = 'old_non_uuid';
update trial set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update germplasm set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update observation set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update study set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update variable_base_entity set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
update observation_unit set crop_id = 'new_uuid' where crop_id = 'old_non_uuid';
Lines changed: 7 additions & 0 deletions
SELECT *
FROM your_table
WHERE id_column !~ '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$';

SELECT *
FROM study
WHERE auth_user_id !~ '^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$';

src/main/resources/db/sql/undo_dummy_data/U10_pedigree.sql renamed to src/main/resources/db/sql/migrate_test_to_prod/undo_dummy_data/U10_pedigree.sql

File renamed without changes.

src/main/resources/db/sql/undo_dummy_data/U11_crosses.sql renamed to src/main/resources/db/sql/migrate_test_to_prod/undo_dummy_data/U11_crosses.sql

File renamed without changes.

src/main/resources/db/sql/undo_dummy_data/U12_observation_units.sql renamed to src/main/resources/db/sql/migrate_test_to_prod/undo_dummy_data/U12_observation_units.sql

File renamed without changes.

src/main/resources/db/sql/undo_dummy_data/U13_seed_lots.sql renamed to src/main/resources/db/sql/migrate_test_to_prod/undo_dummy_data/U13_seed_lots.sql

File renamed without changes.

src/main/resources/db/sql/undo_dummy_data/U14_attribute_values.sql renamed to src/main/resources/db/sql/migrate_test_to_prod/undo_dummy_data/U14_attribute_values.sql

File renamed without changes.

src/main/resources/db/sql/undo_dummy_data/U15_attribute_defs.sql renamed to src/main/resources/db/sql/migrate_test_to_prod/undo_dummy_data/U15_attribute_defs.sql

File renamed without changes.

src/main/resources/db/sql/undo_dummy_data/U16_germplasm.sql renamed to src/main/resources/db/sql/migrate_test_to_prod/undo_dummy_data/U16_germplasm.sql

File renamed without changes.
