History

What, another new configuration format?

It’s not a new format. Back in 2004, we needed a configuration format which had to meet a number of requirements:

  • Support a nested hierarchy of elements
  • Comments should be allowed
  • Trailing commas should be allowed
  • Newlines, as well as commas, should be usable to separate elements
  • Ordinary strings can be single-quoted
  • Full Unicode support
  • Multi-line strings
  • It should be possible to use hexadecimal numbers
  • Syntactically valid JSON should be accepted
  • Allow other configurations to be included
  • Reference one part of a configuration from another
  • Allow access to application state
  • Allow composition of parts of a configuration from other parts
  • Allow a configuration file to specify any of a mapping, a list or a mapping body
  • Order independence apart from lists
  • Copy-and-paste and typo protection

Back in 2004, when we first started looking, we couldn’t find anything which met these requirements. So we developed this configuration format and a Python module to make use of it. The module was first published on the Python Package Index in 2004 (but not especially publicised). It has been used on a few internal projects and has had some users on PyPI, but has not seen much development since 2010, because the module met many, though not all, of the above requirements.

Review of other systems

Since 2010, a number of new configuration formats have been proposed: for example, HOCON (2011), JSON5 (2012), TOML (2013), HJSON (2014), and SANE (2018, though it now appears to be inactive), to name but five.

We investigated these new formats to see which might meet our original requirements. All of them support N-level hierarchical configurations and allow the full range of Unicode characters in strings. Our findings on the remaining requirements are summarised in the table below.

| Requirement | JSON | JSON5 | HJSON | HOCON | SANE | TOML | CFG |
|---|---|---|---|---|---|---|---|
| Comments allowed? | No | Line, block | Line, block | Line | Line | Line | Line |
| Allow trailing commas? | No | Yes | Yes | Yes | Yes | Yes | Yes |
| Newlines (as well as commas) can separate elements? | No | No | Yes | Yes | Yes | Yes | Yes |
| Identifiers usable as mapping keys? | No | Yes | Yes | Yes | Yes | Yes | Yes |
| Ordinary strings can be single-quoted? | No | Yes | Yes | No | No | No | Yes |
| Multi-line strings? | No | Yes | Yes | Yes | Yes | Yes | Yes |
| Hexadecimal numbers? | No | Yes | No | No | Yes | Yes | Yes |
| Syntactically valid JSON accepted? | Yes | Yes | Yes | Yes | No | No | Yes |
| Include other configurations? | No | No | No | Yes | No | No | Yes |
| Reference parts of a configuration? | No | No | No | Yes | No | No | Yes |
| Allow access to environment variables? | No | No | No | Yes | No | No | Yes |
| Allow composing from different parts of a configuration? | No | No | No | Yes | No | No | Yes |
| Configuration file type | Mapping, List | Mapping | Mapping | Mapping, Mapping body | Mapping body | Mapping body | Mapping, Mapping body, List |
| Order independence? | Yes | Yes | Yes | No | Yes | No | Yes |
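
The "No" entries in the JSON column are easy to check for yourself. The following is a minimal sketch using Python's standard `json` module as a stand-in for a strict JSON parser; the `REJECTED` fragments and the `is_valid_json` helper are illustrative names invented for this example, not part of any of the formats reviewed.

```python
import json

# Hypothetical fragments, one per feature from the table that strict JSON lacks.
REJECTED = {
    "comment": '{"a": 1 // note\n}',
    "trailing comma": '{"a": 1,}',
    "newline as separator": '{"a": 1\n"b": 2}',
    "identifier as key": '{a: 1}',
    "single-quoted string": "{\"a\": 'x'}",
    "hexadecimal number": '{"a": 0x1F}',
}


def is_valid_json(text: str) -> bool:
    """Return True if Python's json module accepts the text, else False."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


for feature, fragment in REJECTED.items():
    # Each of these fragments is rejected by the strict parser.
    print(f"{feature}: accepted={is_valid_json(fragment)}")

# A top-level mapping and a top-level list are both valid JSON documents.
print(is_valid_json('{"a": [1, 2]}'), is_valid_json('[1, 2]'))
```

Every fragment in `REJECTED` is refused, while both a top-level mapping and a top-level list are accepted, which matches the "Mapping, List" entry for JSON in the table.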

As can be seen from the table above, none of the other formats we looked at (all of which are commonly proposed for application configuration) meets all the requirements we originally identified. HOCON, which comes closest in terms of the number of requirements satisfied or partially satisfied, has numerous perversities which make it a poor fit for our needs: for example, in HOCON the fragment:

3.14 : 42

is interpreted as the key-value pair "3" : { "14" : 42 }, which seems bizarre. Note: TOML has the same behaviour for the equivalent 3.14 = 42, which it treats as a dotted key.
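
To make the effect concrete, here is a minimal Python sketch of what such dotted-key expansion amounts to. It is not HOCON's (or TOML's) actual parser; `expand_dotted_key` is a hypothetical helper written only to illustrate the interpretation described above.

```python
def expand_dotted_key(key: str, value):
    """Expand a dotted key such as '3.14' into nested single-entry mappings.

    Hypothetical helper: it mimics the interpretation described above and
    is not taken from any real HOCON or TOML implementation.
    """
    result = value
    # Wrap the value from the innermost key segment outwards.
    for part in reversed(key.split(".")):
        result = {part: result}
    return result


print(expand_dotted_key("3.14", 42))  # {'3': {'14': 42}}
```

A key that looks like a floating-point number is therefore silently split into two levels of nesting, rather than being treated as a single key or rejected as an error.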

We didn’t look at configuration systems that are like programming languages, such as Dhall, CUE or Jsonnet – they are a different kind of thing.

Project Status

Following the review, we decided that rather than switching to one of these newer formats, it was worth updating our old implementation to work better with recent Python versions, as well as to provide implementations for other platforms: the JVM (using Kotlin), .NET, Go, Rust, D, JavaScript, Ruby, Elixir, Nim and Dart.

As well as providing useful functionality for application and library configuration in a range of environments, these implementations also serve as an example of comparative development across these languages: the functionality implemented is non-trivial, but not so large as to be difficult to understand.