Public service announcement: Check your MySQL backup!

Kier

XenForo developer
Staff member
Having moved servers today, we learned that our MySQL backups were subtly corrupted by an insidious issue that can be difficult to spot until it's too late.

All emoji and high-ascii characters in our database dumps were invalid.

This is due to a behaviour in mysqldump, which defaults the client character set to UTF8 instead of using UTF8MB4 when necessary.

To check if you are affected, review your MySQL dump SQL files and look for lines that look like this: /*!40101 SET character_set_client = utf8 */;

If you find one, your mysqldump command is not being told to use UTF8MB4, and the emoji data in your database dump will be corrupt.

To fix it, either add the argument --default-character-set=utf8mb4 to your mysqldump command, or else update the appropriate MySQL config file with the following:
Code:
[mysqldump]
default-character-set=utf8mb4
This public service update was brought to you by the letter 🦄
 
Thanks for heads up.
To fix it, either add the argument --default-character-set=utf8mb4 to your mysqldump command, or else update the appropriate MySQL config file with the following:
Code:
[mysqldump]
default-character-set=utf8mb4
Setting uft8mb4 as default, would bring interesting and potential issues ? If you have other non-xenforo mysql databases that don't use uft8mb4 but utf8 or non-utf8 too on the same server. You could end up messing up mysqldumps for those non-xenforo mysql databases ?

Setting --default-character-set=utf8mb4 command on mysqldump command line would seem safe in such cases
 
Last edited:
No because the point of this warning is that mysqldump would likely ignore it and export the database as UTF8 which would cause the problems we’re talking about here.

But, even so, if you want to support emoji and utf8mb4 then you should let XF convert it for you using the instructions provided in the manual:

https://xenforo.com/xf2-docs/manual/unicode/
 
Simply, if you use mysqldump to take database backups, and you are using XF 2.0.x with emoji support enabled, then you need to check to make sure that your database backups are being exported as utf8mb4 rather than utf8.

If the database is being exported as utf8 then it's very possible that any content in your database which contains emoiji (that includes, but not limited to; posts, conversations, thread titles, usernames, user profiles) is subtly corrupted; namely emoji characters such as 🤪would appear as ? if you restored the database backup.
 
If you’re backing up multiple databases then you’d be better to supply the required character set for each one.

Our warning here applies specifically to XF2 if it was a new install or upgraded from XF1 and you have converted your tables but you will need to check what the expected character set is for any other software. XF1 is fine with normal UTF8.
 
do you have to do anything special to IMPORT a db exported with the utf8mb4 option?
If you exported using the command line parameter as given earlier (to prevent other DB's from having it forced on them also) then no, the exported data should define it in the text. This is a mysqldump segment frommy IPS site (since I don't run 2.x of XenForo currently)

Code:
CREATE TABLE `calendar_calendars` (
  `cal_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `cal_moderate` tinyint(1) NOT NULL DEFAULT 0,
  `cal_position` int(3) NOT NULL DEFAULT 0,
  `cal_title_seo` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `cal_comment_moderate` tinyint(1) NOT NULL DEFAULT 0,
  `cal_color` varchar(7) COLLATE utf8mb4_unicode_ci DEFAULT NULL COMMENT 'Hex color code to represent this calendar',
  `cal_allow_comments` tinyint(1) unsigned NOT NULL DEFAULT 1,
  `cal_allow_reviews` tinyint(1) unsigned NOT NULL DEFAULT 0,
  `cal_review_moderate` tinyint(1) unsigned NOT NULL DEFAULT 0,
  `cal_club_id` bigint(20) unsigned DEFAULT NULL COMMENT 'The club ID if this calendar belongs to a club, or NULL',
  `cal_bitoptions` int(10) unsigned NOT NULL DEFAULT 0,
  `cal_calendar_bitoptions` int(10) unsigned NOT NULL DEFAULT 0,
  PRIMARY KEY (`cal_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
If you notice, it has the utf8mb4 CHARSET set and the COLLATE as utf8mb4_unicode_ci.
 
If you exported using the command line parameter as given earlier (to prevent other DB's from having it forced on them also) then no, the exported data should define it in the text. This is a mysqldump segment frommy IPS site (since I don't run 2.x of XenForo currently)
Actually to be on safe side you'd want to do mysql import with

Code:
mysql --default-character-set=utf8mb4

as your may have uf8mb4 mysql table and field character set/collations but your mysql client connection my not be utf8mb4 based and mysql server default client might be a different character set and collation from utf8mb4 i.e. utf8
 
Back
Top Bottom