Beginning with dotCMS 23.06, MS SQL Server support has been deprecated; 23.01 LTS will be the last LTS major version to receive active MSSQL support. Similarly, 21.06 LTS, already past end of life by the time of writing, was the last major version to support either MySQL or Oracle databases.
Accordingly, PostgreSQL will soon be the only dotCMS-supported database. This page is intended to assist in dotCMS database migration to PostgreSQL.
Disclaimer: The information in this document are recommendations based on previous migration efforts; while this may suffice for many, unique database problems must be analyzed and resolved according to their specific characteristics and needs.
- Linux-like server: We will assume the use of Amazon Linux 2.
- Resources: Since this server will run dotCMS, Postgres, and a second database, the recommendation is 16 GB of RAM and at least 4 CPUs.
- DBeaver 22.2.1+ (DBeaver Community)
- Postgres 12+ (15+ preferred)
- Docker images: Postgres, dotCMS, and ElasticSearch
1. Upgrading to 21.06
If using a version prior to 21.06, we recommend first upgrading your dotCMS instance to dotCMS 21.06.14 LTS, the last patch in the 21.06 LTS series.
This is all the more crucial if you are using MySQL or Oracle; for either of those, 21.06 is the last-supported version. Once upgraded, begin with your dotCMS 21.06 docker instance configured to point to your MySQL or Oracle database.
2. Before Migration, Address Known Issues
a) Remove null characters
It has been found in some MySQL databases that some null characters caused the copy from MySQL to Postgres to crash. So far, the findings have been on the
contentlet table and
text_area1 field. To verify the presence of null characters, the following query can be run; if 0 results are returned, then the field is ok to be migrated:
select count(*) from contentlet where text_area1 like '%\0%';
In the event that null characters have been found, the issue can be fixed by running the following command before transferring the data to the PostgreSQL database:
UPDATE contentlet AS c1 INNER JOIN contentlet AS c2 ON c2.inode = c1.inode SET c2.text_area1 = REPLACE(c1.text_area1,'\0','') WHERE c1.text_area1 like '%\0%';
Note: The latter code snippet has been tested on MySQL; it has been syntax-validated for, but not directly tested on, MS SQL Server and Oracle. On Oracle,
CHR(0)may also be preferable to
\0. This document will expand to reflect successful tests on MSSQL and/or Oracle.
Please keep in mind this fix takes as an assumption based on past findings that the field with the null character is
text_area1 on the
contentlet table. However, it is not impossible that such may be present on another field and another table. For this reason, it is important to remain mindful of any further errors during the copy process via DBeaver.
b) Drop old asset versions
Certain inconsistencies on the
contentlet table can be fixed by deleting old asset versions.
In the Admin Panel, go to System → Maintenance → Tools and locate the Drop Old Assets Versions operation. For the “Remove assets older than” option, use 01/01/2010 as the date, and execute. Then repeat the same step incrementing one year (01/01/2011) and so on, until the present date. Remember to follow the logs for any issues.
As mentioned in the above-linked document, this process will never remove live or working versions of any assets, regardless of their age.
3. Get the Database Schema
First, download 21.06.11-schema-only-no-constraints.sql.
Please be aware that this dump contains NO data and NO constraints. It will serve as a base to facilitate the data import. Constraints will be imported later during this migration.
Note also that no database changes occurred between 21.06.11 and 21.06.14, and so this schema remains fully compatible with the final 21.06 patch.
4. Prepare the Target Database
Next, we'll build a target PostgreSQL database to receive the data.
Create a new database in Postgres with
CREATE DATABASE dotcms and restore the dotCMS schema using the
schema-only.sql file from the previous sections:
pg_restore -U <db_username> -v -d <database_name> <dump_file>
pg_restore -U postgres -v -d dotcms 21.06.11-schema-only-no-constraints.sql
5. Copy Data With DBeaver
Install DBeaver and connect to both databases — the legacy DB and the Postgres DB. Once that is done, we can start migrating data from the tables on the former to the same tables on the latter. The below examples will assume a MySQL legacy DB, but MS SQL Server and Oracle should proceed similarly.
a) Migrating data
Using DBeaver's Database Navigator, search for the tables on the MySQL database. For example, let’s migrate the image table. Once the table is selected, click on it and then Export Data:
Use the Database method to export:
On the Tables mapping section, select the Postgres database if is not select by default:
You can make sure that the columns mappings are right by clicking on the arrow on the left:
If the table does not exist in the target, by default DBeaver will suggest creating it. However, three circled tables are no longer needed, so set the mapping to skip:
The list below includes all tables that can be safely skipped, because they are temporal or related to the live environment:
Now, modify the extraction settings. These are the configurations that will use DBeaver to pull the data from the MySQL database. For more information please see DBeaver's data migration documentation:
If necessary, change the data load settings. These are the configurations used by DBeaver to insert the data into the Postgres database. Again, more detail can be found in DBeaver's data migration documentation:
Finally, confirm the operation to begin the transfer:
These steps should be performed on all tables, minus the exceptions in the table above.
b) Finishing the Migration
After copying all the tables from MySQL to Postgres, the last step is to recreate the constraints on the Postgres database.
Now run the following command with reference to it:
psql -U dotcmsdbsuer dotcms < 21.06.11-constraint-create.sql
c) After Migration
There is a known issue that affects QRTZ jobs after migration. The following operations have been shown to fix it:
DELETE FROM qrtz_locks; INSERT INTO qrtz_locks values('TRIGGER_ACCESS'); INSERT INTO qrtz_locks values('JOB_ACCESS'); INSERT INTO qrtz_locks values('CALENDAR_ACCESS'); INSERT INTO qrtz_locks values('STATE_ACCESS'); INSERT INTO qrtz_locks values('MISFIRE_ACCESS'); DELETE from QRTZ_EXCL_locks; INSERT INTO QRTZ_EXCL_locks values('TRIGGER_ACCESS'); INSERT INTO QRTZ_EXCL_locks values('JOB_ACCESS'); INSERT INTO QRTZ_EXCL_locks values('CALENDAR_ACCESS'); INSERT INTO QRTZ_EXCL_locks values('STATE_ACCESS'); INSERT INTO QRTZ_EXCL_locks values('MISFIRE_ACCESS');