libsgfc++ 2.0.1
A C++ library that uses SGFC to read and write SGF (Smart Game Format) data.
|
This document contains assorted notes about SGFC, how it operates, and the consequences this has for libsgfc++ and/or the library client.
If you simply want to use libsgfc++ you don't need to read this file, instead SGF notes is the document for you.
If you want to develop for libsgfc++ then these notes can be important to understand the implementation of libsgfc++.
On reading SimpleText/Text property values, and values of Point/Move/Stone properties of game types != Go, SGFC removes all escape characters.
On writing these property values, SGFC adds escape characters back where they are needed to protect the SGF skeleton.
Also see the detailed comments in SgfcPropertyDecoder
and SgfcDocumentEncoder
.
Note: SgfcPropertyDecoder
used to perform some escape processing, but this became unnecessary when SGFC V2.00 started to remove all escape characters. The code that performs the processing has been left in but made optional and disabled by default. Some comments may still mention the old behaviour.
SGFC detects any kind of line breaks when it reads/parses SGF content.
However, the kind of line break SGFC uses when writing SGF content is determined at compile time. By default SGFC uses a single LF character. This can be changed by redefining the pre-processor macro EOLCHAR
to something else during compilation. The macro must resolve into a single character, such as a LF character (already the default) or a CR character (used on classic MacOS systems). If you undefine EOLCHAR
then SGFC will write two characters, a LF followed by a CR, which is the standard on Windows/MS-DOS systems.
When reading SimpleText or Text property values, SGFC follows the SGF standard rules for soft and hard line breaks.
When writing SimpleText or Text property values, SGFC preserves unescaped line breaks and generates escaped line breaks as it sees fit (cf. -L
and -t
command line options).
For Go the SGF standard defines that black or white pass moves can have either value "" (an empty string) or "tt". The latter counts as pass move only for board sizes <= 19, for larger boards "tt" is a normal move. The SGF standard also mentions that "tt" is kept only for compatibility with FF3.
The observed behaviour is that SGFC can deal with "tt" both on reading and writing, and it always performs the conversion to an empty string in an attempt to produce FF4 content. Consequently:
ISgfcDocumentReader
reads SGF content with "tt" property values, the resulting ISgfcDocument
will contain pass moves with empty string property values.ISgfcDocument
is programmatically set up with black or white move properties that have value "tt", and the document is then passed to ISgfcDocumentWriter
for writing, the resulting SGF content will contain pass moves with empty string property values.SGFC automatically expands compressed point lists during parsing when the game type is Go (GM[1]). See Check_Pos()
. It compresses them again during saving unless the -e
option is specified. See WriteNode()
.
It's impossible to preserve the original format in all cases, because SGFC normalizes the input during parsing, but it does not record any traces of what it did.
The original idea was that the SgfcDocumentEncoder
class creates data structures using the structs that SGFC defines (e.g. Node
, Property
, PropValue
). In theory it should have been as easy as invoking the appropriate SGFC helper functions, such as NewNode()
or NewProperty()
, to create the data structures. In practice this scheme turned out at first to be problematic, and then doomed.
The first minor problem is accessibility of the involved functions: NewNode()
is declared extern
, so it can be used by SgfcDocumentEncoder
easily. NewProperty()
and other functions, on the other hand, are declared static
inside load.c
, so it would have been necessary to patch SGFC in order to be able to use those functions.
The next problem, which turned out to be the real bummer, is that each Property
structure has a member named buffer
, which is a pointer that SGFC expects to point deeply into the SGFInfo
file buffer. Although NewProperty()
sets buffer
up for us, it expects the buffer start as a parameter, i.e. the start within the file buffer from where property parsing should begin. We can't provide NewProperty()
with a pointer into an std::string
buffer because that goes away when the std::string
object is destroyed. This means we would have to create a copy of the std::string
buffer on the heap. But then the next problem would be, who frees that memory when the SGFC operation is done? FreeSGFInfo()
is not the one, because it assumes that the Property
buffer is part of the file buffer, so it merely frees the file buffer. In addition to the memory management issue, there are doubts whether SGFC's parsing functions can handle an abrupt end of the file buffer - which would occur because our copy of the std::string
buffer would naturally be bounded with a zero byte.
In the end the best (because simplest and safest) idea seemed to be to just let SgfcDocumentEncoder
generate an SGF content stream that simulates an entire file buffer.
This section can be seen as a very high-level approach to an inofficial SGFC API (there is no official API). You may find this interesting if you're new to SGFC and want to learn how you can reuse its code in a software project of your own.
main()
functionAt the highest level the SGFC codebase can be divided into two parts:
main()
functionA software library must not contain a main()
function, so the first thing that needs to be done to enable reusing the "everything else" part is to remove the main()
function part. This can be achieved in one of two ways:
main.c
in the buildmain.c
in the build but define the preprocessor macro VERSION_NO_MAIN
when main.c
is compiled.libsgfc++ uses the second approach because it allows main.c
to be included in IDE projects that are generated by CMake.
The SGFC codebase offers a number of high-level global functions that are of interest to a library. The functions can be placed into four broad categories:
SetupSGFInfo()
creates an SGFInfo
data structure and initializes it with useful default values. An SGFInfo
data structure must be passed as a parameter to all global functions that perform some sort of data processing. The SGFInfo
data structure can be parameterized in a number of ways, but this is beyond the scope of this document.FreeSGFInfo()
frees all memory that is currently occupied by an SGFInfo
structure.ParseArgs()
processes command line arguments in the usual argc
/ argv
format known from the main()
function and stores the result in an SGFCOptions
data structure.LoadSGF()
loads the SGF data from the filesystem into intermediate in-memory data structures.LoadSGFFromFileBuffer()
loads the SGF data from an in-memory buffer into intermediate in-memory data structures.ParseSGF()
processes the intermediate data structures generated by either one of the two load functions and generates the final data structures. These are what a software library such as libsgfc++ wants to work with.SaveSGF()
processes the final data structures generated by ParseSGF()
into SGF data and saves that data, either to the filesystem or to an in-memory buffer.SGFC offers a software library to tap into parts of its data processing by way of hooks/callbacks that can be installed/provided in strategic places.
print_error_output_hook
. The hook/callback function will then receive the message data in structured form for further processing. See the SgfcMessageStream
class for an example how libsgfc++ implements the hook/callback.SaveSGF()
is to write the SGF data to the filesystem. A library can redirect the output to be written to somewhere else so that the library can perform further processing. SGFC provides three hook/callback opportunities to tap into the save procedure: open
and close
for when an SGF file would normally be opened/created and closed, and putc
for when character data is actually output to the save stream. See the SgfcSaveStream
class for an example how libsgfc++ implements redirection to an in-memory buffer.exit()
. SGFC has an internal central memory-allocation function (actually a preprocessor macro). When that function fails to allocate memory it invokes a "panic" function to report the out-of-memory problem. When SGFC is run as command line tool the default "panic" function prints a message to standard output and then invokes exit()
. A software library can install its own versin of a "panic" function by setting the global variable oom_panic_hook
. See the SgfcBackendController
class for an example how libsgfc++ attempts to handle memory allocation failures.According to the SGFC readme document the "KI" property is a private property of the "Smart Game Board" application (SGB). The property name means "integer komi".
SGFC converts "KI" to the Go-specific "KM" property, dividing the original "KI" numeric value by 2 to obtain the new "KM" value. SGFC performs this conversion in all cases, even if the game tree's game type is not Go.