sysread/TOML-Tiny

NAME

TOML::Tiny - a minimal, pure perl TOML parser and serializer

VERSION

version 0.18

SYNOPSIS

use feature 'say';
use TOML::Tiny qw(from_toml to_toml);

binmode STDIN,  ':encoding(UTF-8)';
binmode STDOUT, ':encoding(UTF-8)';

# Decoding TOML
my $toml = do{ local $/; <STDIN> };
my ($parsed, $error) = from_toml $toml;

# Encoding TOML
say to_toml({
  stuff => {
    about => ['other', 'stuff'],
  },
});

# Object API
my $parser = TOML::Tiny->new;
my $data = $parser->decode($toml);
say $parser->encode($data);

DESCRIPTION

TOML::Tiny implements a pure-perl parser and generator for the TOML data format. It conforms to TOML v1.0 (with a few caveats; see "strict").

TOML::Tiny strives to maintain an interface compatible with the TOML and TOML::Parser modules, and can even be used to override $TOML::Parser:

use TOML;
use TOML::Tiny;

local $TOML::Parser = TOML::Tiny->new(...);
say to_toml(...);

EXPORTS

TOML::Tiny exports the following two functions for compatibility with the TOML module. See "FUNCTIONS" in TOML.

from_toml

Parses a string of TOML-formatted source and returns the resulting data structure. Any arguments after the first are passed to TOML::Tiny::Parser's constructor.

If there is a syntax error in the TOML source, from_toml will die with an explanation which includes the line number of the error.

my $result = eval{ from_toml($toml_string) };

Alternatively, this routine may be called in list context, in which case a syntax error results in two return values: undef and an error message.

my ($result, $error) = from_toml($toml_string);

Additional arguments may be passed after the toml source string; see "new".
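
The two calling conventions described above can be compared side by side. This is a minimal sketch: the sample input and variable names are illustrative, and the wording of the error message is left unspecified because it is not documented here.

```perl
use strict;
use warnings;
use TOML::Tiny qw(from_toml);

# Illustrative input: the key has no value, so this is not valid TOML.
my $bad_toml = "name = \n";

# Scalar context: a syntax error is thrown, so wrap the call in eval.
my $data = eval { from_toml($bad_toml) };
print "scalar context died: $@\n" unless defined $data;

# List context: nothing is thrown; the message is the second return value.
my ($result, $error) = from_toml($bad_toml);
print "list context error: $error\n" unless defined $result;
```

The list-context form is convenient when a parse failure is an expected outcome (for example, when validating user-supplied configuration) and an exception would be heavier than needed.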

GOTCHAS

Big integers and floats

TOML supports integers and floats larger than what many perls support. When TOML::Tiny encounters a value it may not be able to represent as a number, it will instead return a Math::BigInt or Math::BigFloat. This behavior can be overridden by providing inflation routines:

my $toml = TOML::Tiny->new(
  inflate_float => sub{
    return do_something_else_with_floats( $_[0] );
  },
);

to_toml

Encodes a hash ref as a TOML-formatted string.

my $toml = to_toml({foo => {'bar' => 'bat'}});

# [foo]
# bar="bat"

mapping perl to TOML types

table

HASH ref

array

ARRAY ref

boolean

\0 or \1
JSON::PP::Boolean
Types::Serialiser::Boolean

numeric types

These are tricky in perl. When encountering a Math::Big[Int|Float], that representation is used.

If the value is a defined (non-ref) scalar with the SVf_IOK or SVf_NOK flags set, the value will be emitted unchanged. This is in line with most other packages, so the normal hinting hacks for typed output apply:

number => 0 + $number,
string => "" . $string,

Math::BigInt
Math::BigFloat
numerical scalars

datetime

RFC3339-formatted string

e.g., "1985-04-12T23:20:50.52Z"

DateTime

DateTimes are formatted as RFC3339, as expected by TOML. However, TOML supports the concept of a "local" time zone, which strays from RFC3339 by allowing a datetime without a time zone offset. This is represented in perl by a DateTime with a floating time zone.

string

All other non-ref scalars are treated as strings.
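
The table, array, and boolean mappings listed above can be sketched as follows. The data structure and key names are illustrative only, and the exact layout of the emitted TOML (spacing, ordering) is not asserted here.

```perl
use strict;
use warnings;
use TOML::Tiny qw(to_toml);

my $toml = to_toml({
  server => {                 # HASH ref  -> TOML table
    ports   => [8080, 8081],  # ARRAY ref -> TOML array
    enabled => \1,            # \1        -> TOML boolean true
    debug   => \0,            # \0        -> TOML boolean false
  },
});

print $toml, "\n";
```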

OBJECT API

new

inflate_datetime

By default, TOML::Tiny treats TOML datetimes as strings in the generated data structure. The inflate_datetime parameter allows the caller to provide a routine to intercept those as they are generated:

use DateTime::Format::RFC3339;
use DateTime::Format::ISO8601;

my $parser = TOML::Tiny->new(
  inflate_datetime => sub{
    my ($dt_string) = @_;
    # DateTime::Format::RFC3339 will set the resulting DateTime's formatter
    # to itself. Fallback is the DateTime default, ISO8601, with a possibly
    # floating time zone.
    return eval{ DateTime::Format::RFC3339->parse_datetime($dt_string) }
        || DateTime::Format::ISO8601->parse_datetime($dt_string);
  },
);

inflate_boolean

By default, boolean values in a TOML document result in a 1 or 0. If Types::Serialiser is installed, they will instead be Types::Serialiser::true or Types::Serialiser::false.

If you wish to override this, you can provide your own routine to generate values:

my $parser = TOML::Tiny->new(
  inflate_boolean => sub{
    my $bool = shift;
    if ($bool eq 'true') {
      return 'The Truth';
    } else {
      return 'A Lie';
    }
  },
);

inflate_integer

TOML integers are 64-bit and may not match the size of the compiled perl's internal integer type. By default, TOML::Tiny coerces integers that fit within a native perl number by adding 0. For larger values, a Math::BigInt is returned. This may be overridden by providing an inflation routine:

my $parser = TOML::Tiny->new(
  inflate_integer => sub{
    my $parsed = shift;
    return sprintf 'the number "%d"', $parsed;
  },
);

inflate_float

TOML floats are 64-bit and may not match the size of the compiled perl's internal float type. As with integers, floats are coerced to numbers and large floats are upgraded to Math::BigFloats. The special strings NaN and inf may also be returned. You can override this by specifying an inflation routine:

my $parser = TOML::Tiny->new(
  inflate_float => sub{
    my $parsed = shift;
    return sprintf '"%0.8f" is a float', $parsed;
  },
);

strict

strict imposes some miscellaneous strictures on TOML input, such as disallowing trailing commas in inline tables and failing on invalid UTF-8 input.

Note: strict was previously called strict_arrays. Both are accepted for backward compatibility, although enforcement of homogeneous arrays is no longer supported as it has been dropped from the spec.
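
The effect of strict on a trailing comma in an inline table can be probed with a small script. This is a sketch under the assumption that strict => 0 behaves like the default; the input is illustrative, and the script reports whatever the parser does rather than asserting an outcome.

```perl
use strict;
use warnings;
use TOML::Tiny;

# Illustrative input: an inline table with a trailing comma.
my $src = "point = { x = 1, y = 2, }\n";

my %accepted;
for my $mode (0, 1) {
  my $parser = TOML::Tiny->new(strict => $mode);
  my $data   = eval { $parser->decode($src) };  # decode dies on parse error
  $accepted{$mode} = defined $data ? 1 : 0;
  print "strict => $mode: ", ($accepted{$mode} ? 'accepted' : 'rejected'), "\n";
}
```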

decode

Decodes TOML and returns a hash ref. Dies on parse error.

encode

Encodes a perl hash ref as a TOML-formatted string.

parse

Alias for decode to provide compatibility with TOML::Parser when overriding the parser by setting $TOML::Parser.

DIFFERENCES FROM TOML AND TOML::Parser

TOML::Tiny differs in a few significant ways from the TOML module, particularly in adding support for newer TOML features and strictness.

TOML defaults to lax parsing and provides strict_mode to (slightly) tighten things up. TOML::Tiny defaults to (somewhat) stricter parsing, enabling some extra strictures with "strict".

TOML::Tiny supports a number of options which do not exist in TOML: "inflate_integer", "inflate_float", and "strict".

TOML::Tiny ignores invalid surrogate pairs within basic and multiline strings (TOML may attempt to decode an invalid pair). Additionally, only those character escapes officially supported by TOML are interpreted as such by TOML::Tiny.

TOML::Tiny supports stripping initial whitespace and correctly handles lines terminating with a backslash in multiline strings:

# TOML input
x="""
foo"""

y="""\
   how now \
     brown \
bureaucrat.\
"""

# Perl output
{x => 'foo', y => 'how now brown bureaucrat.'}

TOML::Tiny includes support for integers specified in binary, octal or hex as well as the special float values inf and nan.
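
As a brief illustration of the non-decimal integer forms, the following sketch decodes one value of each kind and prints the results. The key names are illustrative; how inf is rendered when printed depends on the platform, so no output is shown.

```perl
use strict;
use warnings;
use TOML::Tiny qw(from_toml);

my $src = <<'TOML';
hex     = 0xff
oct     = 0o17
bin     = 0b101
pos_inf = inf
TOML

# List context returns the parsed data and an error message (empty on success).
my ($data, $error) = from_toml($src);
die "parse failed: $error" if $error;

printf "%s = %s\n", $_, $data->{$_} for sort keys %$data;
```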

SEE ALSO

TOML::Tiny::Grammar

Regexp scraps used by TOML::Tiny to parse TOML source.

ACKNOWLEDGEMENTS

Thanks to ZipRecruiter for encouraging their employees to contribute back to the open source ecosystem. Without their dedication to quality software development this distribution would not exist.

A big thank you to those who have contributed code or bug reports:

ijackson
noctux
oschwald
jjatria

AUTHOR

Jeff Ober <sysread@fastmail.fm>

COPYRIGHT AND LICENSE

This software is copyright (c) 2024 by Jeff Ober.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
