The CFG API for Rust

The CFG reference implementation for Rust assumes Rust 1.41 or later.

Usage

This library is on crates.io and can be used by adding cfg-lib to your dependencies in your project’s Cargo.toml:

[dependencies]
cfg-lib = "0"

There’s a minimal example of a program that uses CFG here.

Exploration

To explore CFG functionality for Rust, we use the evcxr Read-Eval-Print-Loop (REPL), which is available from here. Once installed, you can invoke a shell using

$ evcxr

You’ll normally need to do the following at the start of a session:

>> :dep cfg-lib
>> use cfg_lib::config::*;

Getting Started with CFG in Rust

A configuration is represented by an instance of the Config struct. An instance can be created using either the new() or from_file() functions. The former creates an instance with no configuration loaded, while the latter initialises it from a configuration in a specified file: the text is read in, parsed and converted to an object that you can then query. A simple example:

test0.cfg
a: 'Hello, '
b: 'world!'
c: {
  d: 'e'
}
'f.g': 'h'
christmas_morning: `2019-12-25 08:39:49`
home: `$HOME`
foo: `$FOO|bar`

Loading a configuration

The configuration above can be loaded as shown below. In the REPL shell:

>> let cfg = Config::from_file("test0.cfg").unwrap();

The from_file() call returns a Config instance which can be used to query the configuration.

Access elements with keys

Accessing elements of the configuration with a simple key is not much harder than using a HashMap:

>> cfg.get("a")
Ok(Base(String("Hello, ")))
>> cfg.get("b")
Ok(Base(String("world!")))

The values returned are of type Value.

Access elements with paths

As well as simple keys, elements can also be accessed using path strings:

>> cfg.get("c.d")
Ok(Base(String("e")))

Here, the desired value is obtained in a single step, by (under the hood) walking the path c.d – first getting the mapping at key c, and then the value at d in the resulting mapping.

Note that you can have simple keys which look like paths:

>> cfg.get("f.g")
Ok(Base(String("h")))

If a key is given that exists in the configuration, it is used as such, and if it is not present in the configuration, an attempt is made to interpret it as a path. Thus, f.g is present and accessed via key, whereas c.d is not an existing key, so is interpreted as a path.
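
The key-then-path rule can be sketched in plain Rust using only the standard library. This illustrates the lookup order described above and is not the cfg-lib implementation: the Node type and the lookup and sample functions are hypothetical names, and the sketch handles only dotted segments, whereas real CFG paths also allow things like list indices.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a configuration value: a string or a nested mapping.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Text(String),
    Map(HashMap<String, Node>),
}

/// Try `key` as a literal key first; only if that fails, treat it as a
/// dotted path and walk it one segment at a time.
fn lookup<'a>(root: &'a HashMap<String, Node>, key: &str) -> Option<&'a Node> {
    if let Some(found) = root.get(key) {
        return Some(found); // an exact key wins, even if it contains dots
    }
    let mut segments = key.split('.');
    let mut current = root.get(segments.next()?)?;
    for segment in segments {
        current = match current {
            Node::Map(inner) => inner.get(segment)?,
            Node::Text(_) => return None, // cannot descend into a scalar
        };
    }
    Some(current)
}

/// Builds the relevant part of test0.cfg by hand.
fn sample() -> HashMap<String, Node> {
    let mut c = HashMap::new();
    c.insert("d".to_string(), Node::Text("e".to_string()));
    let mut root = HashMap::new();
    root.insert("c".to_string(), Node::Map(c));
    root.insert("f.g".to_string(), Node::Text("h".to_string()));
    root
}
```

With this sketch, lookup(&sample(), "f.g") finds the literal key f.g directly, while lookup(&sample(), "c.d") finds no such key and falls back to walking c and then d.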

Access to date/time objects

You can also get native Rust date/time objects from a configuration, by using an ISO date/time pattern in a backtick-string:

>> cfg.get("christmas_morning")
Ok(Base(DateTime(2019-12-25T08:39:49+00:00)))

You get either NaiveDate objects, if you specify the date part only, or else DateTime<FixedOffset> objects, if you specify a time component as well. If no offset is specified, it is assumed to be zero.

Access to environment variables

To access an environment variable, use a backtick-string of the form `$VARNAME`:

>> cfg.get("home")
Ok(Base(String("/home/vinay")))

You can specify a default value to be used if an environment variable isn’t present using the `$VARNAME|default-value` form. Whatever string follows the pipe character (including the empty string) is returned if VARNAME is not a variable in the environment.

>> cfg.get("foo")
Ok(Base(String("bar")))
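
The fallback behaviour can be approximated with the standard library alone. The function below is a sketch, not the library's code: resolve_env is a hypothetical name, its argument is the text that would follow the $ inside the backtick-string, and splitting at the first pipe and returning None when there is neither a variable nor a default are assumptions made for illustration.

```rust
use std::env;

/// Resolve a spec of the form "VARNAME" or "VARNAME|default".
/// Returns the variable's value if it is set, else the default if one was
/// given (possibly the empty string), else None.
fn resolve_env(spec: &str) -> Option<String> {
    // Assumption: split at the first pipe only, so a default may contain pipes.
    let (name, default) = match spec.find('|') {
        Some(pos) => (&spec[..pos], Some(&spec[pos + 1..])),
        None => (spec, None),
    };
    match env::var(name) {
        Ok(value) => Some(value),
        // Assumption: a missing or non-Unicode variable counts as "not present".
        Err(_) => default.map(|text| text.to_string()),
    }
}
```

Calling resolve_env("FOO|bar") mirrors the foo entry in test0.cfg: it yields the value of FOO when that variable is set, and "bar" otherwise.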

Access to computed values

Sometimes, it’s useful to have values computed declaratively in the configuration, rather than imperatively in the code that processes the configuration. For example, an overall time period may be specified and other configuration values are fractions thereof. It may also be desirable to perform other simple calculations declaratively, e.g. concatenation of numerous file names to a base directory to get a final pathname.

test0a.cfg
total_period : 100
header_time: 0.3 * ${total_period}
steady_time: 0.5 * ${total_period}
trailer_time: 0.2 * ${total_period}
base_prefix: '/my/app/'
log_file: ${base_prefix} + 'test.log'

When this file is read in, the computed values can be accessed directly:

>> cfg.get("header_time")
Ok(Base(Float(30.0)))
>> cfg.get("steady_time")
Ok(Base(Float(50.0)))
>> cfg.get("trailer_time")
Ok(Base(Float(20.0)))
>> cfg.get("log_file")
Ok(Base(String("/my/app/test.log")))

Including one configuration inside another

There are times when it’s useful to include one configuration inside another. For example, consider the following configuration files:

logging.cfg
layouts: {
  brief: {
    pattern: '%d [%t] %p %c - %m%n'
  }
},
appenders: {
  file: {
    layout: 'brief',
    charset: 'UTF-8'
    level: 'INFO',
    filename: 'run/server.log',
    append: true,
  },
  error: {
    layout: 'brief',
    append: false,
    charset: 'UTF-8'
    level: 'ERROR',
    filename: 'run/server-errors.log',
  },
  debug: {
    layout: 'brief',
    append: false,
    charset: 'UTF-8'
    level: 'DEBUG',
    filename: 'run/server-debug.log',
  }
}
loggers: {
  mylib: {
    level: 'INFO'
  }
  'mylib.detail': {
    level: 'DEBUG'
  }
},
root: {
  handlers: ['file', 'error', 'debug'],
  level: 'WARNING'
}

redirects.cfg
cookies: {
  url: 'http://www.allaboutcookies.org/',
  permanent: false
},
freeotp: {
  url: 'https://freeotp.github.io/',
  permanent: false
},
'google-auth': {
  url: 'https://play.google.com/store/apps/details?id=com.google.android.apps.authenticator2',
  permanent: false
}

main.cfg
secret: 'some random application secret',
port: 8000,
sitename: 'My Test Site',
default_access: 'public',
ignore_trailing_slashes: true,
site_options: {
  want_ipinfo: false,
  show_form: true,
  cookie_bar: true
},
connection: 'postgres+pool://db_user:db_pwd@localhost:5432/db_name',
debug: true,
captcha_length: 4,
captcha_timeout: 5,
session_timeout: 7 * 24 * 60 * 60,  # 7 days in seconds
redirects: @'redirects.cfg',
email: {
  sender: 'no-reply@my-domain.com',
  host: 'smtp.my-domain.com:587',
  user: 'smtp-user',
  password: 'smtp-pwd'
}
logging: @'logging.cfg'

The main.cfg contents have been kept to the highest-level values, with the logging and redirection configuration relegated to the separate files logging.cfg and redirects.cfg, which are then included in main.cfg. This makes the high-level configuration more readable at a glance, and even allows the separate configuration files to be maintained by different people.

The contents of the sub-configurations are easily accessible from the main configuration just as if they had been defined in the same file:

>> cfg.get("logging.appenders.file.filename")
Ok(Base(String("run/server.log")))
>> cfg.get("redirects.freeotp.url")
Ok(Base(String("https://freeotp.github.io/")))
>> cfg.get("redirects.freeotp.permanent")
Ok(Base(Bool(false)))

Avoiding unnecessary repetition

Don’t Repeat Yourself (DRY) is a useful principle to follow. CFG can help with this. You may have noticed that the logging.cfg file above has some repetitive elements:

logging.cfg (partial)
appenders: {
  file: {
    layout: 'brief',
    charset: 'UTF-8'
    level: 'INFO',
    filename: 'run/server.log',
    append: true,
  },
  error: {
    layout: 'brief',
    append: false,
    charset: 'UTF-8'
    level: 'ERROR',
    filename: 'run/server-errors.log',
  },
  debug: {
    layout: 'brief',
    append: false,
    charset: 'UTF-8'
    level: 'DEBUG',
    filename: 'run/server-debug.log',
  }
}

This portion could be rewritten as:

logging.cfg (partial)
defs: {
  base_appender: {
    layout: 'brief',
    append: false,
    charset: 'UTF-8'
  }
},
appenders: {
  file: ${defs.base_appender} + {
    level: 'INFO',
    filename: 'run/server.log',
    append: true,
  },
  error: ${defs.base_appender} + {
    level: 'ERROR',
    filename: 'run/server-errors.log',
  },
  debug: ${defs.base_appender} + {
    level: 'DEBUG',
    filename: 'run/server-debug.log',
  }
}

where the common elements are separated out and just referenced where they are needed. We find it useful to put everything that will be reused like this in one place in the configuration, so we always know where to go to make changes. The key used is conventionally defs or base, though it can be anything you like.

Access is just as before, and provides the same results:

>> cfg.get("logging.appenders.file.level")
Ok(Base(String("INFO")))
>> cfg.get("logging.appenders.file.layout")
Ok(Base(String("brief")))
>> cfg.get("logging.appenders.file.append")
Ok(Base(Bool(true)))
>> cfg.get("logging.appenders.file.filename")
Ok(Base(String("run/server.log")))
>> cfg.get("logging.appenders.error.append")
Ok(Base(Bool(false)))
>> cfg.get("logging.appenders.error.filename")
Ok(Base(String("run/server-errors.log")))

The definition of logging.appenders.file as ${defs.base_appender} + { level: 'INFO', filename: 'run/server.log', append: true } has resulted in an evaluation which first fetches the defs.base_appender value, which is a mapping, and “adds” to that the literal mapping which defines the level, filename and append keys. The + operation for mappings is implemented as a copy of the left-hand side merged with the right-hand side. Note that the append value for logging.appenders.file is overridden by the right-hand side to true, whereas that for e.g. logging.appenders.error is unchanged as false.
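
The "copy the left-hand side, then merge in the right-hand side" behaviour can be illustrated with ordinary hash maps. This is a sketch under simplifying assumptions rather than the library's code: merge and map_of are hypothetical helpers, values are plain strings, and the merge shown is shallow (how nested mappings are combined is not covered here).

```rust
use std::collections::HashMap;

/// Copy `left`, then overlay every entry of `right` on top of the copy.
/// Neither input is modified; keys present in both take the value from `right`.
fn merge(
    left: &HashMap<String, String>,
    right: &HashMap<String, String>,
) -> HashMap<String, String> {
    let mut merged = left.clone();
    for (key, value) in right {
        merged.insert(key.clone(), value.clone());
    }
    merged
}

/// Helper to build a map from string pairs.
fn map_of(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(key, value)| (key.to_string(), value.to_string()))
        .collect()
}
```

Merging a base of layout, append and charset with an override containing level and append: 'true' gives four keys, with append taken from the override and the base left untouched, which matches the file and error appenders above.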

We could do some further refinement by factoring out the common location for the log files:

logging.cfg (partial)
defs: {
  base_appender: {
    layout: 'brief',
    append: false,
    charset: 'UTF-8'
  }
  log_prefix: 'run/',
},
layouts: {
  brief: {
    pattern: '%d [%t] %p %c - %m%n'
  }
},
appenders: {
  file: ${defs.base_appender} + {
    level: 'INFO',
    filename: ${defs.log_prefix} + 'server.log',
    append: true,
  },
  error: ${defs.base_appender} + {
    level: 'ERROR',
    filename: ${defs.log_prefix} + 'server-errors.log',
  },
  debug: ${defs.base_appender} + {
    level: 'DEBUG',
    filename: ${defs.log_prefix} + 'server-debug.log',
  }
}

with the same result as before. It is slightly more verbose, but the location of all the log files can now be changed in one place rather than three.

Types

These are the common types you’ll need to be aware of when using CFG in Rust:

  • enum ScalarValue – this encapsulates the elements described in the “Scalar values” bullet point in the section on Elements.
  • enum Value – this builds on ScalarValue and adds the “Mappings” and “Lists” bullet points in the section on Elements.
  • enum RecognizerError – this encapsulates the different low-level errors which can be returned when parsing CFG text.
  • enum ConfigError – this encapsulates high-level errors which can be returned when loading or querying a configuration.
  • struct Location – this is the line-and-column location in CFG source which can be returned with errors to help locate the source of problems.
  • struct Config – this represents an individual configuration or sub-configuration.

Enums

enum ScalarValue

This represents scalar values such as integers, strings and others.

Null

Represents the null value.

Bool(bool)

Represents a Boolean value.

String(String)

Represents a string value.

Integer(i64)

Represents an integer value.

Float(f64)

Represents a floating-point value.

Complex(Complex64)

Represents a complex-number value.

Date(NaiveDate)

Represents a date value. This has no timezone information.

DateTime(DateTime<FixedOffset>)

Represents a date/time value including timezone information expressed as a duration (offset) from UTC.

Identifier(String)

Represents an identifier. This is treated differently to a string literal, in that it can be used as a variable name to look up in a context mapping provided to the configuration.

enum Value

This builds upon ScalarValue to encompass list, mapping and nested configuration types.

Base(ScalarValue)

This represents one of the scalar values described above.

List(Vec<Value>)

This represents a list of values.

Mapping(HashMap<String, Value>)

This represents a mapping of strings to values.

Config(Config)

This represents a nested configuration.
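
Values returned from a query are usually unpacked by pattern matching. The sketch below shows the idea on local, cut-down mirrors of the two enums so that it compiles without the crate; the mirrors omit the complex, date, identifier and nested-configuration variants, and as_str and describe are hypothetical helper names. With cfg-lib itself you would import the real types (use cfg_lib::config::*;) instead of declaring them.

```rust
use std::collections::HashMap;

// Local mirrors of a subset of the documented variants, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum ScalarValue {
    Null,
    Bool(bool),
    String(String),
    Integer(i64),
    Float(f64),
}

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Base(ScalarValue),
    List(Vec<Value>),
    Mapping(HashMap<String, Value>),
}

/// Return the string payload if the value is a scalar string, else None.
fn as_str(value: &Value) -> Option<&str> {
    match value {
        Value::Base(ScalarValue::String(text)) => Some(text.as_str()),
        _ => None,
    }
}

/// Describe the shape of a value in a short, human-readable form.
fn describe(value: &Value) -> String {
    match value {
        Value::Base(ScalarValue::Null) => "null".to_string(),
        Value::Base(ScalarValue::Bool(flag)) => format!("bool {}", flag),
        Value::Base(ScalarValue::String(text)) => format!("string {:?}", text),
        Value::Base(ScalarValue::Integer(number)) => format!("integer {}", number),
        Value::Base(ScalarValue::Float(number)) => format!("float {}", number),
        Value::List(items) => format!("list of {} item(s)", items.len()),
        Value::Mapping(map) => format!("mapping with {} key(s)", map.len()),
    }
}
```

The nesting of Base around a ScalarValue is why the REPL output earlier shows results such as Ok(Base(String("Hello, "))): the outer Ok is the Result, Base is the Value variant and String is the ScalarValue variant.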

enum RecognizerError

This represents the various errors which can be returned from the low-level API.

Io(Location)

This represents an I/O error. The location indicates where it happened.

UnexpectedCharacter(char, Location)

An unexpected character was seen in the input, at the start of a lexical token. The character and its location are returned.

InvalidCharacter(char, Location)

An invalid character was seen in the input, while parsing a lexical token. The character and its location are returned.

InvalidNumber(Location)

An error was encountered when parsing a number. The location of the error is returned.

UnterminatedString(Location)

An unterminated string was detected. The location of the string is returned.

InvalidString(Location)

An invalid string was detected. The location of the string is returned.

UnexpectedToken(String, String, Location)

An unexpected token was encountered when parsing the input. The kind of token expected, the kind seen and the location are returned.

ValueExpected(String, Location)

A value was expected, but instead some other kind of token was seen. The unexpected token and its location are returned.

AtomExpected(String, Location)

An “atom” (such as an identifier or number) was expected, but not seen. The unexpected token that was seen and its location are returned.

KeyExpected(String, Location)

A mapping key (either an identifier or a string literal) was expected, but something else was seen. The unexpected token and its location are returned.

KeyValueSeparatorExpected(String, Location)

A mapping key-value separator (either a colon or an equals sign) was expected, but something else was seen. The unexpected token and its location are returned.

ContainerExpected(String, Location)

A container (list or mapping) was expected, but something else was seen. The unexpected token and its location are returned.

InvalidEscapeSequence(usize, Location)

An invalid escape sequence was detected when parsing a string. The position of the error and the location of the string are returned.

UnexpectedListSize(Location)

A list element with an unexpected size was found. This can happen if a list with more than one element is seen when parsing an array index – multidimensional arrays are allowed by the syntax but not currently supported.

TrailingText(Location)

This error is returned if trailing text is found when parsing a path. The location of the trailing text is returned.

enum ConfigError

This represents the various errors which can be returned from the high-level API.

SyntaxError(RecognizerError)

This error will be returned if there is a syntax error detected while parsing the source text of a configuration. The underlying parsing error is returned.

NotLoaded

This error is returned if a configuration is queried without anything having been loaded into it.

MappingExpected

Although a configuration file can contain a list or a mapping, a top-level configuration must be a mapping. If it isn’t, this error is returned.

StringExpected(Location)

This error is returned if a string value is expected, but not found. For example, the inclusion operator @ expects a string operand, which is interpreted as the path name of a file to be included as a sub-configuration.

DuplicateKey(String, Location, Location)

This error is returned if a duplicate key is seen when processing a mapping, unless the no_duplicates setting has been set to false.

BadFilePath(String)

This error is returned if a string (e.g. representing a configuration to be included) is not a valid file path.

FileNotFound(String)

This error is returned if a string (e.g. representing a configuration to be included) does not correspond to an existing file, even if valid as a file path.

FileReadFailed(String)

This error is returned if a file could not be opened for reading (e.g. because of insufficient privileges).

NoContext

This error is returned if an identifier is encountered when processing a configuration, but no context has been provided where a corresponding value can be looked up.

UnknownVariable(String, Location)

This error is returned if an identifier is encountered when processing a configuration, but is not present as a key in the provided context.

NotPresent(String, Option<Location>)

This error is returned if a key is not present in the configuration. The key or path element is returned, along with an error location if the error occurred while traversing a path.

InvalidPath(RecognizerError)

This error is returned if a string does not represent a valid path when it should do so. The underlying parsing error is returned.

InvalidPathOperand(Location)

This error is returned when a path operand is incompatible with the current path contents (e.g. a string operand for a list, or an integer operand for a mapping). The location of the offending path element is returned.

IndexOutOfRange(i64, Location)

This error is returned when an integer operand for a list is not within the permitted range. The offending index and the location where it occurs in the path are returned.

EvaluationFailed(Location)

This is returned if an evaluation failed, e.g. because an operator was applied to operands of incompatible types.

ConversionError(String)

This is returned if a special value string can’t be converted to a value.

CircularReferenceError(Vec<(String, Location)>)

This is returned if a reference leads to an infinite recursion, whether directly or indirectly. A list of the references which lead to the recursion (along with their locations) is returned.
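
To make the idea of a circular reference concrete, here is a small standard-library sketch of following references while remembering which keys have already been visited. It is not how cfg-lib detects cycles internally: Entry, ResolveError and resolve are hypothetical names, references are modelled as plain key names, and source locations are left out.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified entry: a literal or a reference to another key.
#[derive(Debug, Clone, PartialEq)]
enum Entry {
    Literal(String),
    Reference(String),
}

#[derive(Debug, Clone, PartialEq)]
enum ResolveError {
    /// The chain of keys followed; the last key repeats an earlier one.
    Circular(Vec<String>),
    /// A reference pointed at a key that does not exist.
    Missing(String),
}

/// Follow references starting at `key` until a literal is reached.
fn resolve(entries: &HashMap<String, Entry>, key: &str) -> Result<String, ResolveError> {
    let mut seen: Vec<String> = Vec::new();
    let mut current = key.to_string();
    loop {
        if seen.contains(&current) {
            seen.push(current);
            return Err(ResolveError::Circular(seen));
        }
        seen.push(current.clone());
        match entries.get(&current) {
            Some(Entry::Literal(text)) => return Ok(text.clone()),
            Some(Entry::Reference(next)) => current = next.clone(),
            None => return Err(ResolveError::Missing(current)),
        }
    }
}
```

If a refers to b and b refers back to a, resolving a returns the chain a, b, a, which plays the same role as the list of references carried by CircularReferenceError.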

Structs

struct Location

This represents a location in the source text for a configuration.

line: i32

This represents a line number in the source (the first line is 1).

column: i32

This represents a column number in the source (the first column is 1).

struct Config

This represents a single configuration.

no_duplicates: bool

Whether duplicate keys in mappings are disallowed. This defaults to true, in which case a ConfigError (DuplicateKey) is returned if a duplicate key is seen. If false, the value of a duplicate key silently replaces the value associated with the earlier appearance of the same key.
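
The two behaviours of this setting can be sketched with a plain hash map. This is an illustration only: build_mapping is a hypothetical function, values are plain strings, and the error is reduced to the offending key rather than a ConfigError with locations.

```rust
use std::collections::HashMap;

/// Build a mapping from key/value pairs taken in source order.
/// With `no_duplicates` set, a repeated key is reported as an error carrying
/// that key; otherwise the later value silently replaces the earlier one.
fn build_mapping(
    pairs: &[(&str, &str)],
    no_duplicates: bool,
) -> Result<HashMap<String, String>, String> {
    let mut mapping: HashMap<String, String> = HashMap::new();
    for (key, value) in pairs {
        if no_duplicates && mapping.contains_key(*key) {
            return Err(key.to_string());
        }
        mapping.insert(key.to_string(), value.to_string());
    }
    Ok(mapping)
}
```

Given the pairs append: false, level: INFO, append: true, the strict mode reports append as a duplicate, while the relaxed mode yields two keys with append set to true.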

Constructors

new() -> Self

Constructs an instance with no actual configuration loaded. A call to load() or load_from_file() is needed to make the instance usable.

from_file(file_path: &str) -> Result<Self, ConfigError>

Constructs an instance with the configuration read from the file named by the provided file_path.

Methods

add_include(&mut self, dir: &str)

Add a directory dir to the search path used for included files.

load_from_file(&mut self, file_path: &str) -> Result<(), ConfigError>

Load this instance with the configuration read from the file at the provided file_path.

load(&mut self, r: Box<dyn Read>) -> Result<(), ConfigError>

Load this instance with the configuration read from the provided reader r.

set_context(&mut self, context: HashMap<String, Value>)

Set the given hashmap context as the place to look up identifiers encountered in the configuration.

contains_key(&self, key: &str) -> bool

Return true if key is in the configuration, else false.

get(&self, key: &str) -> Result<Value, ConfigError>

Get a value by key or path from the configuration. The value of key can be an identifier or a path.

as_mapping(&self, unwrap_configs: bool) -> Result<HashMap<String, Value>, ConfigError>

Return the configuration as a mapping, evaluating every element of the configuration. If unwrap_configs is true, nested configurations are also converted to hashmaps, otherwise they are returned as is.