Text Formatting

2016-08-19

Victor Zverovich, victor.zverovich@gmail.com

Introduction
Design
    Format String Syntax
    Extensibility
    Locale Support
    Positional Arguments
Wording
Implementation
References

Introduction

This paper proposes new text formatting functionality that can be used as a safe and extensible alternative to the printf family of functions. It is intended to complement the existing C++ I/O streams library and to reuse some of its infrastructure, such as overloaded insertion operators for user-defined types.

Example:

std::string message = std::format("The answer is {}.", 42);

Design

Format String Syntax

Variations of the printf format string syntax are arguably the most popular among programming languages, and C++ itself inherits printf from C [1]. The advantage of the printf syntax is that many programmers are familiar with it. However, in its current form it has a number of issues:

Although it is possible to address these issues, doing so would break compatibility and could be more confusing to users than introducing a different syntax.

Therefore we propose a new syntax based on the ones used in Python [3], the .NET family of languages [4], and Rust [5]. This syntax employs '{' and '}' as replacement field delimiters instead of '%', and it is described in detail in TODO:link. Here are some of the advantages:

The syntax is expressive enough to enable translation, possibly automated, of most printf format strings. The correspondence between printf and the new syntax is given in the following table.

printf    new             comment
-         <               left alignment
+         +
space     space
#         #
0         0
hh        unused
h         unused
l         unused
ll        unused
j         unused
z         unused
t         unused
L         unused
c         c (optional)
s         s (optional)
d         d (optional)
i         d (optional)
o         o
x         x
X         X
u         d (optional)
f         f
F         F
e         e
E         E
a         a
A         A
g         g (optional)
G         G
n         unused
p         p (optional)
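
To illustrate how such a translation could be automated, here is a minimal sketch that applies the table mechanically. The function name translate_printf is hypothetical and not part of this proposal. The sketch assumes that literal braces are escaped by doubling them, as in Python, and it emits explicit argument indices so that arguments consumed by '*' keep their printf order. It is purely syntactic and makes no attempt to reconcile semantic differences between the two facilities.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the proposal: rewrites a printf format
// string in the brace-based syntax, following the correspondence table.
std::string translate_printf(const std::string& fmt) {
    std::string out;
    std::size_t next_arg = 0;  // printf consumes its arguments in order
    std::size_t i = 0;
    const std::size_t n = fmt.size();

    // Copies a literal width/precision, or turns '*' into a nested field
    // that refers to the argument printf would have consumed here.
    auto read_number = [&](std::string& dst) {
        if (i < n && fmt[i] == '*') {
            dst += '{' + std::to_string(next_arg++) + '}';
            ++i;
        } else {
            while (i < n && fmt[i] >= '0' && fmt[i] <= '9') dst += fmt[i++];
        }
    };

    while (i < n) {
        const char c = fmt[i++];
        if (c == '{' || c == '}') {  // assumed escape: doubled braces
            out += c;
            out += c;
            continue;
        }
        if (c != '%') { out += c; continue; }
        if (i < n && fmt[i] == '%') { out += '%'; ++i; continue; }  // "%%"

        // Flags can come in any order in printf; record which are present.
        bool left = false, plus = false, space = false, alt = false, zero = false;
        for (bool more = true; more && i < n;) {
            switch (fmt[i]) {
                case '-': left = true; break;
                case '+': plus = true; break;
                case ' ': space = true; break;
                case '#': alt = true; break;
                case '0': zero = true; break;
                default: more = false; break;
            }
            if (more) ++i;
        }

        std::string width, precision;
        read_number(width);
        bool has_precision = false;
        if (i < n && fmt[i] == '.') {
            has_precision = true;
            ++i;
            read_number(precision);
            if (precision.empty()) precision = "0";  // "%.f" means precision 0
        }

        // Length modifiers (hh, h, l, ll, j, z, t, L) are unused.
        while (i < n && std::string("hljztL").find(fmt[i]) != std::string::npos) ++i;

        if (i == n) throw std::invalid_argument("incomplete conversion");
        char type = fmt[i++];
        if (type == 'i' || type == 'u') type = 'd';
        else if (std::string("csdoxXfFeEaAgGp").find(type) == std::string::npos)
            throw std::invalid_argument("unsupported conversion");

        // Emit the pieces in the fixed order the new syntax expects.
        out += '{' + std::to_string(next_arg++) + ':';
        if (left) out += '<';
        if (plus) out += '+';
        else if (space) out += ' ';
        if (alt) out += '#';
        if (zero && !left) out += '0';  // printf ignores '0' when '-' is given
        out += width;
        if (has_precision) out += '.' + precision;
        out += type;
        out += '}';
    }
    return out;
}
```

For example, translate_printf("%-10s|%+05d|%*.*f") yields "{0:<10s}|{1:+05d}|{4:{2}.{3}f}". The n specifier, which has no counterpart in the table, is rejected with an exception.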

Width and precision are represented similarly in printf and the proposed syntax. The only difference is that a runtime value is specified by * in the former and by {} in the latter, possibly with the index of the argument inside the braces.

As can be seen from the table above, most of the specifiers remain the same, which simplifies migration from printf. A notable difference is in the alignment specification. The proposed syntax allows left, center, and right alignment, represented by '<', '^', and '>' respectively, which is more expressive than the corresponding printf syntax. The latter supports only left and right (the default) alignment.

The following example uses center alignment and '*' as a fill character:

std::format("{:*^30}", "centered");

resulting in "***********centered***********". The same formatting cannot be easily achieved with printf.

Extensibility

Both the format string syntax and the API are designed with extensibility in mind. The mini-language can be extended for user-defined types, and users can provide functions that perform parsing and formatting for such types.

The general syntax of a replacement field in a format string is

replacement-field:
    { integer_opt }
    { integer_opt : format-spec }

where format-spec is predefined for built-in types, but can be customized for user-defined types. For example, the syntax can be extended for put_time-like date and time formatting:

std::time_t t = std::time(nullptr);
std::string date = std::format("The date is {0:%Y-%m-%d}.", *std::localtime(&t));
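
To make the role of format-spec concrete, here is a minimal sketch of how a replacement field could be split into its two parts. The names ReplacementField and parse_field are hypothetical and do not describe the proposed API. The point of the sketch is that everything after the colon is kept as an opaque string, which can then be handed to type-specific parsing code.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical result type: the two parts of a replacement field.
struct ReplacementField {
    std::optional<std::size_t> index;  // the optional integer before ':'
    std::string spec;                  // raw format-spec, empty if absent
};

// Hypothetical helper: parses one complete field such as "{0:%Y-%m-%d}".
ReplacementField parse_field(const std::string& field) {
    if (field.size() < 2 || field.front() != '{' || field.back() != '}')
        throw std::invalid_argument("not a replacement field");

    ReplacementField result;
    std::size_t i = 1;
    const std::size_t end = field.size() - 1;  // position of the closing '}'

    // Optional argument index.
    if (i < end && field[i] >= '0' && field[i] <= '9') {
        std::size_t value = 0;
        while (i < end && field[i] >= '0' && field[i] <= '9')
            value = value * 10 + static_cast<std::size_t>(field[i++] - '0');
        result.index = value;
    }

    if (i == end) return result;  // first form: no format-spec
    if (field[i] != ':')
        throw std::invalid_argument("expected ':' or '}'");

    // Second form: the format-spec is not interpreted here, so a
    // user-defined type is free to give it its own meaning.
    result.spec = field.substr(i + 1, end - (i + 1));
    return result;
}
```

Applied to the field from the date example, parse_field("{0:%Y-%m-%d}") gives index 0 and the spec "%Y-%m-%d", which a date formatter could interpret in the manner of put_time.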

TODO: API

Locale Support

TODO

Positional Arguments

TODO

Wording

TODO

Implementation

The ideas proposed in this paper have been implemented in the open-source fmt library. TODO: link

References

[1] The fprintf function. ISO/IEC 9899:2011. 7.21.6.1.
[2] fprintf, printf, snprintf, sprintf - print formatted output. The Open Group Base Specifications Issue 6 IEEE Std 1003.1, 2004 Edition.
[3] 6.1.3. Format String Syntax. Python 3.5.2 documentation.
[4] String.Format Method. .NET Framework Class Library.
[5] Module std::fmt. The Rust Standard Library.
[6] Format Specification Syntax: printf and wprintf Functions. C++ Language and Standard Libraries.