The case of the miracle date
This is a little story about an oddity related to date handling and null values in MySQL that a teammate and I spotted when working with Sequel and a custom timestamp plugin.
This is a little story about an oddity related to date handling and null values in MySQL that a teammate and I spotted when working with Sequel and a custom timestamp plugin. In my opinion it should give some more reasons to use containers (with data volumes) and a version controlled configuration file for database instances on a development environment. It’s also one more reason to have devops people integrated within each team to ensure all environment match as close as possible for this kind of configuration.
The decor
The issue here appeared in a Sequel stack working with a MySQL database (5.6 and 5.7).
For those not familiar with Sequel the important point to know here is that it has a fairly “pick what you want” approach compared to ActiveRecord. So one can define what plugins to include within models. An example directly related to our issue is the timestamp plugin meant to fill created_at
and updated_at
attributes for the model.
Some times you might have models that don’t need this plugin so we have left the loading of that plugin out of the base model and only load it in the models that need it.
The actual issue
We forgot to add the plugin in a couple of models, but no red tests appeared in the CI. Think of it : who test that those timestamp values are set ? We trust the plugin to do it, and don’t really think about wether it’s done or not except if we actually define something custom that would change those attributes.
Well, it did break tests for one of us. Actually the tests caused MySQL to raise an exception as the schema specifically says that no NULL
value can be used for those columns.
We all checked our actual schema in the database and we all have the same one with the same constraint.
So, how come only one of us saw the issue ?
The symptoms
We then picked some of the SQL queries that were inserting and updating values in the table and ran them by hand to check the content of the tables. We noticed that, when not sending any value for datetime
columns MySQL was inserting 0000-00-00 00:00:00
instead except for that one colleague.
This magic date is not a NULL
value so that’s why there is no exception raised. But why is MySQL actually using such a value ?
The reasons
After a few searches we found that MySQL has a sql_mode
parameter that define how it will behave in some cases, including with its own constraints.
My colleague’s MySQL instance had one of the “strict sql mode” turned ON. Among other things, this mode will cause MySQL to not come up with the “magic” 0000-00-00 00:00:00
datetime value if no value is given when inserting a row. Instead it will strictly stick to the rules and tell you that you are messing up things.
After turning that strict mode off my colleague confirmed that it worked just like for my own environments so we concluded that it was indeed the cause of the issue he saw.
Lessons to learn
Yet, that’s not were we should stop when facing such quirks. Questions should come up :
- Why was one developer using a different
sql_mode
? - How are staging and production environment setup ?
- What do we actually want for this
sql_mode
?
If we can’t really know why the developer had a different sql_mode
we can look into how production and staging environment are configured. If one thing is clear it’s that we do want to be as close as possible to those environments when writing code. We want to have the same expectations about all environments for this kind of things. So auditing all MySQL instances is the way to go here.
As for what we want that setting to be, it’s a different story. For small projects it’s usually easy enough to figure out if there is code that would react badly to a change to the strict mode. For bigger projects, auditing all the code to check this would be difficult and changing blindly such a setting could cause some issues.
So one might want to review and define a better value for the sql_mode
in such a case but that change would have to go through some experiments and reviews.
Further on the SQL mode
This setting has some important impact on how MySQL behaves, and one should have this feature in mind when working with MySQL. It’s also good to know that this mode default values have changed from one version to another. This might cause trouble when upgrading instances. MySQL :: MySQL 5.7 Reference Manual :: 5.1.10 Server SQL Modes.
The sql_mode
can include a lot of things. Here is a handful of the possible values and their impact extracted from the above link.
ANSI
: This mode changes syntax and behavior to conform more closely to standard SQL. It is one of the special combination modes listed at the end of this section.STRICT_TRANS_TABLE
: If a value could not be inserted as given into a transactional table, abort the statement. For a nontransactional table, abort the statement if the value occurs in a single-row statement or the first row of a multiple-row statement. More details are given later in this section.NO_AUTO_CREATE_USER
: Prevent the GRANT statement from automatically creating new user accounts if it would otherwise do so, unless authentication information is specified. The statement must specify a nonempty password using IDENTIFIED BY or an authentication plugin using IDENTIFIED WITH.ONLY_FULL_GROUP_BY
: Reject queries for which the select list, HAVING condition, or ORDER BY list refer to nonaggregated columns that are neither named in the GROUP BY clause nor are functionally dependent on (uniquely determined by) GROUP BY columns.REAL_AS_FLOAT
: Treat REAL as a synonym for FLOAT . By default, MySQL treats REAL as a synonym for DOUBLE .
And this is only a few I picked from the list to show that this mode can have an impact in both how it handles values but also how it actually reacts to queries.
What about PostgreSQL ?
I usually prefer to roll with PostgreSQL (PG) but not all projects have the same opinion. PostgreSQL usually is closer to the SQL standard than MySQL so some of the above settings are not present in PG.
There are several settings that can be set in PG either through configuration files or the command line. All this is documented in PostgreSQL: Documentation: 9.5: Server Configuration.
Docker as a way to avoid such differences
There is a way to run MySQL within a docker container locally. One should use a data volume to avoid losing data for the instance when stopping the container and also to ensure configuration can be loaded from a file within the versioned source code of the service.
This would help ensure all the team members are relying on the same version of the database server and the same configuration for it without too much trouble and hand holding.
Devops you said ?
Developers rarely have an understanding of how databases are configured or configurable. Experienced developers would usually know to look into this but not always.
In my opinion this is one of the reasons why devops people or developers with strong devops skills should be included in software engineering teams. They can provide help with understanding this kind of issues, adjust development and test environments when needed. They can also ensure, in bigger organisations, that there is a common culture for this kind of things across teams and departments. The goal of that last point is to avoid, as much as possible, black boxes and silos that would prevent efficient problem solving during production incidents.
Conclusion
It had been a while since I ran in such issues with MySQL, previous projects were mostly involving PostgreSQL and usually MySQL instances handled are using default values everywhere. For me it was a big surprise to see that, while there are constraints in the table schema, MySQL just go around them by setting default values. If code can handle it properly then, it’s probably happening without issue but it’s worth knowing.
I like that facing this issue we went through the following process :
- we looked for the reasons
- we found one and checked the newly made assumption
- we verified our expectations concerning staging and production environments
- we fixed our local environments accordingly to match those environments
- we queued a discussion with the team to assess possible follow ups.
In a more general conclusion : be aware of your database settings regarding defaults and behaviour with NULL values.