This text represents the draft SGF file standard as of 2016-12-31 - aeb@cwi.nl

The SGF format

Introduction

SGF is a format ('Smart Game Format') used to write down commented game records, including variation trees, especially for the game Go. It was introduced by Anders Kierulf in 1990, and his version is now called FF[1]. Martin Müller (1995) made FF[3]. Arno Hollosi (1999) made FF[4], which is the current version.

The present text updates this standard to current usage. In many places it comes with recommendations where FF[4] had prescriptions. No version number is assigned other than the publication date.

Grammar

An SGF file is a production of the following grammar, possibly with interspersed ignored parts such as whitespace (see Parsing, below).

  Collection     = { GameTree }
  GameTree       = "(" RootNode NodeSequence { Tail } ")"
  Tail           = "(" NodeSequence { Tail } ")"
  NodeSequence   = { Node }
  RootNode       = Node
  Node           = ";" { Property }
  Property       = PropIdent PropValue { PropValue }
  PropIdent      = UcLetter { UcLetter }
  PropValue      = "[" Value "]"
  UcLetter       = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" |
                   "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" |
                   "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z"

Here braces { } denote repetition (zero or more times), the vertical bar | denotes a choice, and the strings "(", ")", ";", "[", "]", "A", ..., "Z" are constants.

That is, an SGF file (Collection) is the concatenation of zero or more GameTrees.
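A GameTree is an open parenthesis, followed by a RootNode, a NodeSequence, and zero or more Tails, followed by a close parenthesis.
A Tail is an open parenthesis, followed by a NodeSequence and zero or more Tails, followed by a close parenthesis.
A NodeSequence is the concatenation of zero or more Nodes.
A Node is a semicolon followed by zero or more Properties.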
A Property is an identifier (PropIdent) followed by one or more PropValues.
A PropIdent is a sequence of one or more upper case letters (UcLetters).
A PropValue consists of arbitrary data enclosed in square brackets.

Parsing

Whitespace is ignored everywhere, except inside a Value. Inside a Value whitespace is significant, but an escaped newline is ignored. Whitespace inside a PropIdent should be avoided.
Lower case letters are ignored inside a PropIdent.

There is no ambiguity in the grammar, as long as it is possible to recognize the "]" that ends a PropValue. If a Value contains a "]", it must be escaped. A symbol is escaped by preceding it by a non-escaped "\".
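
The grammar and parsing rules above can be sketched as a small recursive-descent parser. This is only an illustration, not a reference implementation: the function name parse_sgf and the dictionary layout of the result are choices made here, text between game trees is simply skipped, and an escaped newline is assumed to be a single "\n" character.

```python
def parse_sgf(text):
    """Parse SGF text into a list of game trees.

    A tree is {"nodes": [...], "tails": [...]}; a node maps each property
    identifier to the list of its raw (unescaped) values.
    """
    pos = 0
    end = len(text)

    def skip_whitespace():
        nonlocal pos
        while pos < end and text[pos].isspace():
            pos += 1

    def is_letter(ch):
        return ch.isascii() and ch.isalpha()

    def parse_value():
        nonlocal pos
        pos += 1                      # skip the opening "["
        chars = []
        while pos < end and text[pos] != "]":
            ch = text[pos]
            if ch == "\\" and pos + 1 < end:
                pos += 1              # drop the escape, keep what follows
                ch = text[pos]
                if ch == "\n":        # an escaped newline is ignored
                    pos += 1
                    continue
            chars.append(ch)
            pos += 1
        if pos >= end:
            raise ValueError("unterminated property value")
        pos += 1                      # skip the closing "]"
        return "".join(chars)

    def parse_node():
        nonlocal pos
        pos += 1                      # skip the ";"
        node = {}
        while True:
            skip_whitespace()
            if pos >= end or not is_letter(text[pos]):
                return node
            start = pos
            while pos < end and is_letter(text[pos]):
                pos += 1
            # Lower case letters are ignored inside a PropIdent.
            ident = "".join(c for c in text[start:pos] if c.isupper())
            skip_whitespace()
            values = []
            while pos < end and text[pos] == "[":
                values.append(parse_value())
                skip_whitespace()
            if not values:
                raise ValueError(f"property {ident!r} has no value")
            node.setdefault(ident, []).extend(values)

    def parse_tree():
        nonlocal pos
        pos += 1                      # skip the "("
        tree = {"nodes": [], "tails": []}
        while True:
            skip_whitespace()
            if pos >= end:
                raise ValueError("unterminated game tree")
            ch = text[pos]
            if ch == ";":
                tree["nodes"].append(parse_node())
            elif ch == "(":
                tree["tails"].append(parse_tree())
            elif ch == ")":
                pos += 1
                return tree
            else:
                raise ValueError(f"unexpected character {ch!r} at offset {pos}")

    trees = []
    while True:
        pos = text.find("(", pos)     # assumption: skip anything between trees
        if pos < 0:
            return trees
        trees.append(parse_tree())
```

The sketch is deliberately lenient: it does not insist that all Nodes precede the Tails, and it merges repeated occurrences of a property within a node into one value list.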

Character set

An SGF file is a text file with text in a character set that includes ASCII as a subset. The constant strings with grammatical function are in ASCII. By default, each Value is given in UTF-8. Then the possibility that a character code might contain an embedded ']' byte does not arise. If a different character set is used, then before the first use of non-ASCII data there should be a property indicating the character set, using the PropIdent "CA". No escape is used for multibyte characters where one of the constituent bytes has the value ']'.
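
A reader might pick the character set along the following lines. This is a sketch under assumptions: the function name decode_sgf is invented here, the CA property is located with a simple byte-level search (which could be fooled by the text "CA[" inside a comment), and unknown character set names fall back to the UTF-8 default.

```python
import codecs
import re

def decode_sgf(data):
    """Decode the bytes of an SGF file, honouring a CA property if present."""
    name = "UTF-8"                                   # the default character set
    match = re.search(rb"CA\[\s*([A-Za-z0-9_.:-]+)\s*\]", data)
    if match:
        candidate = match.group(1).decode("ascii")
        try:
            codecs.lookup(candidate)                 # is this a codec Python knows?
            name = candidate
        except LookupError:
            pass                                     # unknown name: keep the default
    return data.decode(name, errors="replace")
```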

Semantics

A Collection represents a collection of game records.

A GameTree represents a single game record. The RootNode gives metadata about the game. Each further Node represents a move in the game. A sequence of Tails represents a number of variations, possible sequels, to the game given thus far. In particular, a Tail that is just "()" represents the possibility of ending the game here.

In a record, the status of variations can differ. If it represents an actually played game, together with variations, then the moves actually played form a distinguished tail. It will be the first tail given. If the record represents a problem, there may be several correct and incorrect solutions, and the first tail has no special status.

Game types

The meaning of the property identifiers depends on the type of game being played. This type is indicated using the PropIdent "GM" in the root node. In the text below we describe the use for the game of Go (Baduk, Weiqi). The property "GM[1]" denotes a game record for this game. If no such property is given, we shall assume that the record is a Go record. For the meaning of values other than 1, see GM.

Below we give a very short description of the various properties, ordered according to type. Afterwards we give a more precise grammar, with the properties ordered alphabetically.

Root node properties

The ordering of the properties in the root node is not significant.

Players

PB and PW give the Black and White players, and BR and WR their rank. BC and WC are used for their countries, and BT and WT for their teams.

Event

EV and RO give event and round. DT and PC give the date and place.

Game details

TM gives the time available to each player. OT ("Overtime") gives the type of byo-yomi, and the number and length of periods. One also uses LC and LT for the number and the length of the periods.

KM gives the komi paid by Black to White.

HA gives the handicap.

SZ gives the board size.

AB and AW ("Add Black", "Add White") give the black and white setup stones. For search patterns one uses AB, AW, AE ("Add Empty") to denote positions that must be Black, White and Empty.

PL ("Player") tells which player moves first.

RU gives the rule set used.

RE gives the result.

TB and TW give the final black and white territory.

MN gives the number of moves of the game.

GC gives arbitrary comments.

Meta properties

CA gives the character set of the SGF file.

FF ("File Format") gives the number (1-4) of the SGF standard used.

AP gives the application program that produced this SGF file.

SO gives the source of information.

US gives the user who typed it in.

AN ("Annotator") gives the author of the comments.

CP gives copyright information.

GN ("Game Name") gives a name to the SGF file, such as its filename.

Node properties

Move properties

B and W give moves by Black and White.

C gives a comment.

Various time properties indicate time used so far, time used for this move, time still available for the rest of the game. In particular, BL and WL are used for the time left for Black (after a black move) or White (after a white move). In Canadian byo-yomi, OB and OW are used for the number of Black (White) moves that still have to be played in this period. In Japanese byo-yomi, OB and OW are used for the number of byo-yomi periods remaining.

Markup properties

In order to be able to refer to specific moves or positions, one uses markup. How markup is realized upon game display depends on the capabilities and properties of the displaying program. The SGF standard is about describing possibly commented games, not about displaying them, so the descriptions below can only be hints.

SL selects a number of unnamed points.

CR, SQ, TR, MA mark the indicated points with a circle, square, triangle, 'X'.

LB labels the indicated point with the given label.

N names its node.

MN (outside the root node) assigns a number to its node, probably for display purposes.

Display properties

Commented games are printed in a journal, or output on a screen. Various properties can be added to the SGF file to direct this printing or display. Various viewers or sgf-to-image converters recognize properties such as VW ("View"), the part of the board that should be displayed, and possibly further indications about whether captured stones have to be shown, whether move numbers are shown, and if so, what numbers, whether there are coordinates around the board, and which type, whether hoshi should be shown, whether stones are shown in 3D style, etc. etc.

Extended properties

The properties mentioned above were of the type meant to be automatically retrievable from a data base. Event, names, dates, results are likely search fields. For free form information one uses extended properties. The event can be "EV[60th Honinbo Title]", with extended information "EVX[Sponsored by ..]". The date can be "DT[1980]", with extended information "DTX[Broadcast 1980-09-22]" or "DTX[Published 1980-11]". Unusual results can be described in an extended result field: "REX[The players went for a nap, and it was decided that both lost]".

There have been various other proposals to represent such information. One such proposal suggests separating the free text part from the formalized part by a colon, as in DT[1934:Spring], DT[1845-03-03,04:Played nonstop through the night], DT[1844-04-15:Tenpo 15-II-28], DT[:Early Qing era]. Another proposal suggests using further elements of the PropValue sequence. No actual usage of such proposals is known.

Units

Time properties require a unit of time. One uses 's', 'm', 'h' for seconds, minutes and hours. If no unit is given, the time is taken to be in seconds.
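
As an illustration, a time value could be converted to seconds as follows. The function name parse_time is an assumption of this sketch, not part of the standard.

```python
def parse_time(value):
    """Return the number of seconds denoted by a time value such as '90m'."""
    factors = {"s": 1, "m": 60, "h": 3600}
    value = value.strip()
    if value and value[-1] in factors:
        return float(value[:-1]) * factors[value[-1]]
    return float(value)               # no unit given: the value is in seconds
```

For example, parse_time("5h") and parse_time("300m") both denote 18000 seconds.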

List of properties

Here we give details for the above-mentioned properties, in alphabetical order.

According to the grammar, a Property consists of an identifier (PropIdent) followed by one or more values (PropValues).

Value types

The syntax for the PropValues is one of

syntax        meaning
text          Text of arbitrary length and contents
simpletext    Text intended to take a single display line. Newline characters are equivalent to spaces
point         Board position, e.g. cq. A point is a pair of coordinates, the horizontal coordinate first, where columns and rows are numbered 'a', 'b', 'c', ... In particular, the upper lefthand corner is aa, and on a 19x19 board the lower righthand corner is ss.
move          point (denoting a move at this point) or empty (denoting Pass)
pointlist     List of board positions, compressed aa:bc, or not aa,ab,ac,ba,bb,bc. (That is, a colon-separated pair of board positions denotes all positions in the rectangle of which the given positions are top left and bottom right corners.) The list may be empty. With property XY this becomes XY[aa:bc] or XY[aa][ab][ac][ba][bb][bc] or XY[].
pointpair     point ':' point
labeledpoint  point ':' simpletext
labellist     Possibly empty list of labeledpoints
color         'B' | 'W'
number        An optional sign ('+' or '-' or nothing) followed by a nonempty sequence of digits in 0..9
numberpair    number ':' number
real          number | number '.' number
time          real optionally followed by 's', 'm' or 'h'
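
The compressed pointlist notation can be expanded as in the following sketch. The function name expand_pointlist is chosen here for illustration; the input is the list of PropValues of one property.

```python
def expand_pointlist(values):
    """Expand the PropValues of a pointlist into individual points."""
    points = []
    for value in values:
        if not value:
            continue                  # XY[] denotes the empty list
        if ":" in value:
            top_left, bottom_right = value.split(":")
            # All positions in the rectangle spanned by the two corners.
            for col in range(ord(top_left[0]), ord(bottom_right[0]) + 1):
                for row in range(ord(top_left[1]), ord(bottom_right[1]) + 1):
                    points.append(chr(col) + chr(row))
        else:
            points.append(value)
    return points
```

With this, the values of XY[aa:bc] and of XY[aa][ab][ac][ba][bb][bc] expand to the same six points.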

List

Below we list the properties standardized in this document in alphabetical order.

The identifier is given in the column headed prop, the types of the list of values in the column headed arg syntax, and the description in the final column. In some cases further details are given below this table.

prop  meaning           arg syntax           description
AB    Add Black         pointlist            Add black setup or handicap stones, e.g., AB[dp][pd]
AE    Add Empty         pointlist            Make the indicated fields empty
AN    Annotator         simpletext           Author of the game analysis
AP    Application       simpletext           SGF editor used
AW    Add White         pointlist            Add white stones
B     Black move        move                 Make a black move
BC    Black Country     simpletext           Country of the black player, e.g. CN, JP, KR, TW
BL    Black time Left   time                 Time left after this move
BR    Black Rank        simpletext           Rank of black player, e.g. Honorary Meijin, 9p, 6d ama, 15k
BT    Black Team        simpletext           Team of the black player
C     Comment           text                 Any comment
CA    Character set     simpletext           Character set of the text fields (e.g., UTF-8, GB2312, SJIS, ISO-2022-KR, ISO-8859-15)
CP    Copyright         simpletext           Copyright holder
CR    Circle            pointlist            Mark the indicated stones with a circle
DT    Date              simpletext           Date, given in Western style, preferably in ISO standard form
EV    Event             simpletext           Tournament or other event
FF    File Format       number               SGF file format, e.g., FF[1], FF[3], FF[4]
GC    Game Comment      text                 Any comment. E.g. GC[This is Cho Chikun's 1000th win as a professional], GC[Translated by Megumi Kawai], GC[Wang Lei made an illegal ko capture on move 223 in what should have been a void game due to triple ko. He was ruled to have passed on that move and the position was restored with Zhou to move]
GM    Game              number               GM[1]: Go. Higher numbers denote other games, see GM
GN    Game Name         simpletext           Filename, number or identifier associated with the game
HA    Handicap          number               Number of handicap stones
JD    Japanese Date     simpletext           Date, given in Japanese Era, Year, Month, Day format
KM    Komi              real                 Possibly negative number of points paid by Black to White
LB    Label             labellist            E.g. LB[], LB[kg:a], LB[fm:F][fj:E][fk:D][gj:C][gk:B][hi:A]
LC                      number               Number of byo-yomi periods
LT                      time                 Length of byo-yomi periods
MA    Mark              pointlist            Mark the indicated stones with an 'X'
MN    Move Number       number               In the rootnode: the number of moves of the game. Elsewhere: the request to label the move with the given number
N     Node name         simpletext           E.g. N[Diagram 1], N[Sacrifice]
OB    Overtime Black    number               Number of black moves still to make in this byo-yomi period
OH    Old Handicap      simpletext           E.g. OH[(B)BW] for sen-ai-sen, a handicap where the weaker player takes Black two out of every three games, and takes Black in the present game
OT    Overtime          simpletext           Description of the type of byo-yomi used
OW    Overtime White    number               Number of white moves still to make in this byo-yomi period
PB    Player Black      simpletext           Name of black player
PC    Place             simpletext           Location of the game
PL    Player to play    color                Player to go first (B or W)
PW    Player White      simpletext           Name of white player
RE    Result            simpletext           E.g. Jigo, B+3.5, W+T, Void. See RE
RO    Round             simpletext           Tournament stage or round
RU    Rules             simpletext           Rule set in use
SL    Selected points   pointlist            Select a number of points in order to refer to them in a comment
SO    Source            simpletext           Game source. E.g. SO[weiqi.com.tom], SO[Kido yearbook 1969]
SQ    Square            pointlist            Mark the indicated stones with a square
SZ    Size              number | numberpair  E.g. SZ[19], or SZ[2:9] for a 2x9 board (with 2 columns and 9 rows)
TB    Territory Black   pointlist            Black territory
TM    Time              time                 Total time for each player
TR    Triangle          pointlist            Mark the indicated stones with a triangle
TW    Territory White   pointlist            White territory
US    User              simpletext           User who entered the game
VW    View              pointlist            Part of the board to be shown. A subsequent VW[] removes this restriction
W     White move        move                 Make a white move
WC    White Country     simpletext           Country of the white player
WL    White time Left   time                 Time left after this move
WR    White Rank        simpletext           Rank of white player
WT    White Team        simpletext           Team of the white player

Of the above, the six properties BC, JD, LC, LT, OH, WC are not in FF[4].

Details

BR, WR
BR[value] and WR[value] have as value the rank(s) of the black and white players. If a player has several ranks, these can be separated by commas. For example, PB[Mukai Chiaki] BR[5p, Female Honinbo] PW[Fujisawa Rina] WR[2p, Aizu Cup]. When there are several black or white players, their ranks are separated by "&", and given in the same order as the names (see PB, PW).
DT
DT[value] has as value an ISO standard date, when available. For example DT[1946-07-31]. When less information is available, one can omit day, as in DT[1946-07], and month, as in DT[1946]. When more information is available, one can add the starting time, as in DT[2000-01-01 00:53]. For games played over several days, the dates can be given separated by commas, or by .. or ~ to denote an interval. Such a sequence of dates can be condensed, as in DT[1946-07-31,08-15,16,17]. A Japanese era date is given via the JD property.
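
A condensed sequence of dates could be expanded as sketched below. The function name expand_dates and the exact reading of the short forms are assumptions based on the example above: a part starting with four digits begins with a year, a two-field part is month-day, and a single short field is a day if the previous date had a day and a month otherwise. Starting times and the interval notations .. and ~ are not handled.

```python
def expand_dates(value):
    """Expand a comma-separated, possibly condensed, DT value into full dates."""
    dates = []
    year = month = None
    previous_had_day = False
    for part in value.split(","):
        fields = part.strip().split("-")
        if len(fields[0]) == 4:                       # starts with a year
            year = fields[0]
            month = fields[1] if len(fields) > 1 else None
            day = fields[2] if len(fields) > 2 else None
        elif len(fields) == 2:                        # MM-DD, year inherited
            month, day = fields
        elif previous_had_day:                        # DD, year and month inherited
            day = fields[0]
        else:                                         # MM, year inherited
            month, day = fields[0], None
        dates.append("-".join(f for f in (year, month, day) if f))
        previous_had_day = day is not None
    return dates
```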
GM
GM[number] defines the type of game in the present game record. Valid numbers are:

1 Go
2 Othello
3 Chess
4 Gomoku/Renju
5 Nine Men's Morris
6 Backgammon
7 Xiangqi
8 Shogi
9 Lines of Action
10 Ataxx
11 Hex
12 Jungle
13 Neutron
14 Phutball
15 Quadrature
16 Trax
17 Tantrix
18 Amazons
19 Octi
20 Gess
21 Twixt
22 Zèrtz
23 Plateau
24 Yinsh
25 Punct
26 Gobblet
27 Hive
28 Exxit
29 Hnefatafl
30 Kuba
31 Tripples
32 Chase
33 Tumbling Down
34 Sahara
35 Byte
36 Focus
37 Dvonn
38 Tamsk
39 Gipf
40 Kropki


In many cases this is only a reserved number. No actual use is being made of SGF game records in most of these playing communities. However, GNU backgammon uses SGF. There is some use for LOA (Lines of Action).

KM
KM[value] has as value a decimal number, positive or negative. For example, KM[7.5]. One might wish to recognize values postfixed by "子", for example KM[3.75子] or KM[3又3/4子], for games under Chinese rules. This "子" postfix can be regarded as a unit of 2 points. Similarly, "目" represents a unit of 1 point, and can be ignored.
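
A lenient komi reader along these lines might look as follows. The function name parse_komi is invented for this sketch, and only the forms shown above are handled (a decimal number, optionally written as whole part, "又", fraction, and optionally followed by "子" or "目").

```python
from fractions import Fraction

def parse_komi(value):
    """Return the komi in points, accepting the postfixes "子" and "目"."""
    value = value.strip()
    unit = 1
    if value.endswith("子"):          # one "子" counts as 2 points
        unit, value = 2, value[:-1]
    elif value.endswith("目"):        # one "目" is 1 point: ignore the postfix
        value = value[:-1]
    whole, _, fraction = value.partition("又")
    total = Fraction(whole)
    if fraction:
        total += Fraction(fraction)
    return float(total * unit)
```

Under this reading KM[7.5], KM[3.75子] and KM[3又3/4子] all denote a komi of 7.5 points.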
OH
("Old Handicap". Property introduced by GoGoD.) Examples:
name         notation    comments
tagai-sen    OH[B-W]     Equal rank players, alternate in taking black
sen-ai-sen   OH[B-W-B]   Weaker player gets black two out of three games
josen        OH[B]       Weaker player always gets black
sen-ni-sen   OH[B-2-B]   Weaker player plays black, and gets a HA[2] every third game
sen-ni       OH[B-2]     Weaker player plays black, and gets a HA[2] every other game
It is not necessary to indicate the weaker player (since he takes black), except in the case of sen-ai-sen, where OH[(B)-W-B] and OH[B-(W)-B] denote a game with the weaker player taking black and white, respectively. Often, the hyphens are omitted.
PB, PW
In Pair Go two black players and two white players play in turn. In Relay Go, several black players and several white players play in turn. In such a case, the PB and PW values list the players with player names separated by "&". Similarly, the BR and WR values list player ranks separated by "&". For example, PB[Segoe Kensaku & Makino Jozan & Fujisawa Kuranosuke] BR[8d & 3d & 7d] PW[Go Seigen & Shinohara Masami & Hayashi Yutaro] WR[8d & 6d & 7d]. An alternative markup style includes the ranks with PB, PW separated from the player name by a comma, and omits the BR and WR properties. For example, PB[Segoe Kensaku, 8d & Makino Jozan, 3d & Fujisawa Kuranosuke, 7d] PW[Go Seigen, 8d & Shinohara Masami, 6d & Hayashi Yutaro, 7d].
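
Both markup styles can be read with a small helper such as the one below. The name split_players and the returned list of (name, rank) pairs are choices made for this sketch.

```python
def split_players(names, ranks=None):
    """Return a list of (name, rank) pairs from a PB/PW value and an optional BR/WR value."""
    players = [name.strip() for name in names.split("&")]
    if ranks is not None:
        # First style: ranks in BR/WR, in the same order as the names.
        return list(zip(players, (rank.strip() for rank in ranks.split("&"))))
    # Alternative style: "name, rank" inside PB/PW itself.
    result = []
    for player in players:
        name, _, rank = player.partition(",")
        result.append((name.strip(), rank.strip() or None))
    return result
```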
RE
RE[value] has as value "B+" or "W+" when Black or White won, possibly followed by a number if they won by that number of points, or followed by "R", "T", or "F" when the game ended by resign, time exceeded, or forfeit. The value for jigo is "Jigo". If the rules say that jigo counts as a win for Black or for White, the value is "B+0" or "W+0". The value may also be "Void" (with explanation, such as "triple ko", elsewhere). Or "Unfinished". Or "Playing", in case this is a live game, still being played. Or "?", if the result is unknown.
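
The result syntax can be decoded as in the following sketch. The function name parse_result and the shape of the return value are assumptions made here; the long forms "Resign", "Time" and "Forfeit" found in older records are accepted as well.

```python
import re

def parse_result(value):
    """Return (winner, how) for a RE value.

    winner is "B", "W" or None; how is a number of points, one of "R", "T",
    "F", None when no margin is given, or the raw text when nobody won.
    """
    value = value.strip()
    match = re.fullmatch(r"([BW])\+(.*)", value)
    if not match:
        return (None, value)          # Jigo, Void, Unfinished, Playing, ?
    winner, rest = match.groups()
    if rest == "":
        return (winner, None)
    if rest.isalpha() and rest[0].upper() in "RTF":
        return (winner, rest[0].upper())
    return (winner, float(rest))
```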
RU
RU[value] has as value the string "Japanese" or "Chinese" or "Ing" in case these rules are used. A year or other description can be added when important. Other values may occur.

Historical remarks

Property identifiers

Modern property names are upper case only. In ancient game records one meets "Black[jj]" instead of "B[jj]", and "REsult[B+3]" instead of "RE[B+3]", and "SiZe[19]" instead of "SZ[19]", and "CoPyright[...]" instead of "CP[...]". Software handling SGF must be able to accept lower case letters in property names (and ignore them). Similarly, one meets "RE[B+Resign]", "RE[B+Time]", "RE[B+Forfeit]" instead of "RE[B+R]", "RE[B+T]", "RE[B+F]".

Usually, the first symbol of a property identifier is upper case, but one meets "GaMe[1]goWriteVersion[1.4e]".
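
Ignoring lower case letters reduces all these old spellings to the modern identifiers, as this small sketch shows (the function name normalize_ident is chosen here):

```python
def normalize_ident(raw):
    """Drop everything except the upper case letters A..Z from a property name."""
    return "".join(ch for ch in raw if "A" <= ch <= "Z")

for old in ("Black", "REsult", "SiZe", "CoPyright", "GaMe", "goWriteVersion"):
    print(old, "->", normalize_ident(old))
```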

There have been times that a property name had at most two upper case letters. Modern software accepts property names of arbitrary length. Property names like KGSDE, KGSSB, KGSSW and MULTIGOGM are common.

Escapes

The above grammar only gives special status to the "]" symbol, and no other symbols need escaping than this, and a "\" that is not meant to be an escape. Earlier grammars also made the colon ":" special, and one meets escaped colons, and escaped multibyte characters of which the first byte is '\' or ']' or ':'. Such escapes can be ignored.

Individual properties

AP
In the FF[4] standard this has syntax "simpletext ':' simpletext", where the idea is that the colon separates program name and version. In practice one sees e.g. GoGoD95 and Free Climber 0 . 8 . 11 . 61 without colon.
B, W
Since ss is the lower right hand corner of a 19x19 board, tt was used to denote a pass. But larger boards are sometimes used, and a pass should now be written B[] or W[].
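
A reader that wants to accept both conventions might treat tt as a pass only when it cannot be a point on the board. The function name parse_move and the zero-based (column, row) result are assumptions of this sketch.

```python
def parse_move(value, size=19):
    """Return None for a pass, else the zero-based (column, row) of the move."""
    if value == "" or (value == "tt" and size <= 19):
        return None                   # B[] or W[], or the old-style tt pass
    column = ord(value[0]) - ord("a")
    row = ord(value[1]) - ord("a")
    return (column, row)
```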
CA
In FF[4] the default character set was ISO-8859-1 aka Latin-1. Today this is unreasonable, and UTF-8 is the new default. On the one hand because there is no reason to expect Western Europe to be the main producer of SGF files. On the other hand, because ASCII and UTF-8 can be recognized with high confidence, while ISO-8859-1 cannot easily be recognized.
CR
It is common for display programs to indicate the last move in some way, e.g. with a circle. Sometimes a game record is redundantly marked up with lots of CR[]s, since each move was the last one when it was played. Of course such last-move circles need not be indicated in the game record.
PL
The color of the player to move is indicated by B or W, sometimes Black or White. In older files one also sees PL[1] and PL[2] instead of PL[B] and PL[W]. (This is according to the alphabetical list in the FF[3] standard.)
RE
In the FF[4] standard the result Jigo is represented as RE[0]. In practice one finds collections where the result is missing for all games with Jigo result. Writing RE[Jigo] is more robust.
TM
In the FF[4] standard the parameter is a real. In practice this is inconvenient, and one encounters TM[5400] and TM[300] where TM[15h] and TM[5h] were meant. Using units leads to fewer errors.

Obsolete properties

In the FF[4] and earlier standards several properties are described that are omitted here. Nothing is wrong with properties outside the present text, except that they are not standardized here, and their function will depend on the software used.

Examples are AR (Arrow), BM (Bad move), BS (Black Species), CH (Check mark), DD (Dim points), DG (Diagram), DI (Difficulty of a problem), DM (even position), DO (Doubtful move), EL (Eval. comp move), EX (Expected move), FG (Figure), GB (Good for Black), GW (Good for White), HO (Hotspot), KO (illegal move), ID (Game identifier), IL (Illegal move), IT (Interesting move), L (Letter), LN (Line), LT (Loss on Time is enforced), M (Mark), OM (Moves per overtime), ON (Opening Name), OP (Overtime period), OV (Operator overhead), PM (Print move mode), RG (Region), SC (Secure stones), SE (Self test), SI (positions marked with a Sigma), ST (Style), TC (Territory Count), TE (Tesuji), UC (Unclear position), V (Value), WS (White Species).

General remarks

Coordinates

The default SGF coordinates have origin upper-left and rows and columns labeled a..s (on a 19x19 board). The standard coordinates used in the West have origin bottom-left, with rows numbered 1..19, and columns A..T where the I is skipped. It is easy to extend software to understand both systems of coordinates.
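
The conversion between the two systems can be sketched as follows, for square boards of size at most 19. The function names are chosen here for illustration.

```python
WESTERN_COLUMNS = "ABCDEFGHJKLMNOPQRST"   # the letter I is skipped

def sgf_to_western(point, size=19):
    """Convert an SGF point such as 'pd' to a Western coordinate such as 'Q16'."""
    column = ord(point[0]) - ord("a")
    row = ord(point[1]) - ord("a")        # counted from the top
    return f"{WESTERN_COLUMNS[column]}{size - row}"

def western_to_sgf(coordinate, size=19):
    """Convert a Western coordinate such as 'Q16' to an SGF point such as 'pd'."""
    column = WESTERN_COLUMNS.index(coordinate[0].upper())
    row = size - int(coordinate[1:])      # Western rows count from the bottom
    return chr(ord("a") + column) + chr(ord("a") + row)
```

For example, the upper-left corner aa is A19, and the point pd is Q16.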

Time properties

At the end of the game, one wishes to write how much time the game took. Nowadays that often takes the form of a comment: C[Time used: B: 9h59m, W: 4h58m]. Sometimes TM is misused here. One might consider adding a property TU ("Time used").

Overtime has many forms. Canadian byo-yomi gives each player an indefinite number of byo-yomi periods in each of which a fixed number of moves has to be played. For Canadian byo-yomi, the OT description should mention (i) that Canadian byo-yomi is used, (ii) the length of each period, (iii) the number of moves/period. For example: OT[30 moves / 10m]. After each move one can give OB and BL or OW and WL to indicate the number of moves still to make, and the time available for that. After the last move in a period, with t seconds left, it is better to give OB[0]BL[t] than to give say OB[30]BL[600] (in case of 30 moves in a 10-minute period), since the latter does not provide new information, and the former enables one to see how much time was spent on this move.
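
To illustrate why OB[0]BL[t] is the more informative record, here is a sketch that recovers the time spent on a move from two consecutive records of the same player in Canadian byo-yomi. The function name seconds_spent and the representation of a record as a pair (moves still to make, seconds left) are assumptions of this example.

```python
def seconds_spent(previous, current, period_seconds=600):
    """Time spent on a move, given the (OB, BL) pairs before and after it."""
    previous_moves, previous_left = previous
    if previous_moves == 0:
        # The previous move completed its period, so a fresh period started.
        previous_left = period_seconds
    return previous_left - current[1]
```

With the last move of a period recorded as (0, 30), the move before it at (1, 45) shows that 15 seconds were used. Had it been recorded as (30, 600), that information would be lost.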

Japanese byo-yomi gives each player a fixed number of byo-yomi periods of fixed length, for example 10 periods of 1 minute each. Time counting starts anew for each new move. Over the entire game each player can use the given number of full byo-yomi periods, and an arbitrary number of partial periods. For Japanese byo-yomi, the OT description should mention (i) that Japanese byo-yomi is used, (ii) the number of periods, (iii) the length of each period. For example: OT[10x1m]. Equivalently, LC[10] LT[60] (or TC[10] TT[60]). After each move one might write the number of remaining periods, say again using properties OB and OW.

In a match under the Ing rules, players can buy additional time a few times, and pay with points (as if komi was increased or decreased). For example, a player may be able to buy an additional 20m time period at most three times, and lose when the last period is exhausted. The price of a period is 2 points. The default length of a period is 1/6th of the length of the main time, as given by TM. Of interest is the number of time periods each player bought (and when), and their length when not default. The default komi is 8 (with jigo a win for Black). A way to represent the points paid might be to add them to / subtract them from the komi, so that KM[14] indicates that Black bought three additional periods. Details can be given in a Game Comment.
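
The komi arithmetic suggested here is simple enough to state as code. The function name and parameters are invented for this sketch; the defaults are the values mentioned above.

```python
def ing_effective_komi(base_komi=8, black_periods=0, white_periods=0, price=2):
    """Komi after adding the points paid by Black and subtracting those paid by White."""
    return base_komi + price * black_periods - price * white_periods

print(ing_effective_komi(black_periods=3))   # Black bought three periods
```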

Comment properties

One needs at least three different types of comment. Comments on the moves are given with C. Comments on the game are given with GC. Comments on the SGF file are needed, but no standard label is in common use.

Private properties

In order to avoid future conflicts, one can use a prefix for private properties. For example, KGS uses or used a few properties with a KGS prefix describing the counting process.

prop    meaning       arg syntax   description
KGSDE   Dead          pointlist    List of stones regarded dead
KGSSB   Score Black   real         Total number of points for Black
KGSSW   Score White   real         Total number of points for White

This shows the use of a private prefix to avoid name clashes. Should it become desirable to standardize these properties, they should probably be called DE, SB and SW.

Multigo uses MULTIGOGM and MULTIGOBM (bookmark).

GoGoD PBase uses or used the rootnode property KK (label or key, used in problem collections).

The Dragon Go Server uses or used a private property XM to indicate the number of the next move, as in XM[245], used as a rootnode property. Elsewhere XM[] is used to indicate the last move of the actual game, in the NodeSequence of that last move, often in a situation with variations.

SmartGo uses or used a largish number of private properties:

TY (Moyo points), TZ (Unsettled points), TC (Blocks that can be captured tactically), TP (One of several blocks can be captured tactically), TK (Ko points: Control of territory depends on winning a ko, for both sides), KB (Ko points favorable for black: Black can play to control those points, white can make a ko), KW (Ko points favorable for white: White can play to control those points, black can make a ko), E (List of points where stones are captured, or empty list in the root), EG (For Environmental Go: list of coupon values), TU (The time used to solve a problem), NN (The number of nodes examined to solve a problem), NL (The number of evaluations or leaf nodes), MD (The maximal depth reached during a search), DE (Depth: the number of plies searched), PD (Partial Depth: the number of top level moves at deepest search), MO (The move motive), MM (All move motives), PN (Proof number), DN (Disproof number), QE, QB, QW (in search patterns: Not Empty, Not Black, Not White).

SGF2 properties

There is an SGF dialect with files starting out "(" instead of "(;", and with property names GM, EV, KM, DT replaced by GK, TE, KO, RD. Sometimes this dialect is attributed to Nihon Ki-in. One also sees HD for the combination HA plus AB.

Compressed sequences

The typical SGF file has lots of redundancy: a long string B[xx]W[yy]B[zz]... of moves with alternating colors, where the first color is clear from the context. SmartGo uses or has used the compressed version S[xxyyzz...].
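
Expanding such a compressed sequence back into alternating moves could be done as sketched below. This assumes that every move occupies exactly two letters and that colors alternate strictly; how SmartGo represents passes in this form is not known here, so passes are not handled. The function name expand_sequence is chosen for illustration.

```python
def expand_sequence(value, first="B"):
    """Expand the value of S[xxyyzz...] into a list of (color, point) moves."""
    colors = ("B", "W") if first == "B" else ("W", "B")
    moves = []
    for offset in range(0, len(value), 2):
        moves.append((colors[(offset // 2) % 2], value[offset:offset + 2]))
    return moves

def to_nodes(moves):
    """Write the moves as an ordinary node sequence ;B[xx];W[yy]..."""
    return "".join(f";{color}[{point}]" for color, point in moves)
```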

Appendix

What property identifiers are used in practice? For a sample of 1682246 SGF files the tables below give the frequencies with which the various properties occurred.

Occurrences in the rootnode

We list the 83 properties that occurred in the rootnode at least 250 times.

1623694 PW
1623630 PB
1607378 RE
1569449 DT
1358775 EV
1355769 KM
1269764 WR
1263389 BR
974499 SZ
797402 FF
766395 PC
619910 CA
531252 RO
518901 GM
405107 AP
390167 SO
299563 US
290200 RU
267817 TM
247780 C
247621 HA
232721 AB
188867 WC
188854 BC
160079 OT
151411 GN
144047 AW
124418 ST
64752 MN
59405 WT
59345 BT
39891 TI
32230 LC
30942 GC
28474 LT
21894 B
16821 TE
16777 KO
16692 RD
13813 CP
12419 MULTIGOGM
12025 GK
9482 OH
6078 ID
5690 TC
5345 WS
5345 BS
4468 LB
4426 TR
3976 TL
3343 PL
2970 VW
2370 AN
2252 LKM
1766 W
1701 ZZ
1581 JD
1000 N
988 PTM
899 WL
899 BL
895 PM
885 FG
858 CO
854 PDT
854 GE
854 DP
854 DI
846 TEV
814 MA
652 OM
533 CR
513 DTX
490 EVX
437 KI
428 SY
421 BP
419 WP
402 AX
393 PI
323 AE
310 T
258 WV


This starts out with lots of well-known properties, and the first one not documented yet is ST.

(This ST is a backwards-compatibility hack. It is a metaproperty intended to tell the current display program which display program was used by the author of the comments. Some display programs would automatically assign labels a, b, c, ... to siblings of the current node. Some would assign such labels to children of the current node. The current user interface might wish to know what style was used originally and mimic the behavior of the old display program, so that letters used in comments are understandable. This metaproperty also stores another bit of information, namely whether upon display there should be automatic board markup for variations. Of the above occurrences of ST, 96% are ST[2]: no markup.)

TI is "Tournament Index", an integer numbering tournaments. Of course B and W in the root node are mistakes. The properties TE, KO, RD, GK are aliases for EV, KM, DT, GM in SGF2 style. MULTIGOGM (with value 0 or 1) is probably left by the Go editor MultiGo. TC can be Territory Count or Tail Comment.

A non-commented game record often has metainformation only in rootnode and final node. Since comments in the final node are a bit fragile (comments are often stripped away outside the rootnode), some sites use TC[text] in the root node, and C[text] on the final node (with the same text).

TL and LKM and PTM and PDT are aliases for TM and KM and TM and DT. The property ZZ gives an explanation (in Chinese) of the following property label.

For example, "ZZ[白方姓名]PW[河野临]ZZ[白方英文]WE[Kono Rin]ZZ[白方段位]WR[九段]" says "ZZ[White player name]PW[河野临]ZZ[White player English name]WE[Kono Rin]ZZ[White player rank]WR[9p]".
An interesting reflexive metaproperty. This usage violates the rule that each property label may occur at most once in a node. Property names that occur in this context: SZ, RM, EV, GT, GB, GS, SK, DT, DC, PC, PB, BE, BR, BK, PW, WE, WR, WK, KM, SO, US, RE, RC, NT, OA, OB, ON, OC, OT, LC.

In goproblems one sees GE (problem type, e.g. "joseki", "life and death", "tesuji", "endgame") and DI ("Difficulty", with values e.g. 7k, 1d) and CO and DP (two integers with unknown meaning). Of course one also needs to describe what variations are correct and wrong, but usually that is done with a formalized comment.

TEV looks like a mistake or a year. OM gives the total number of moves, or the number of moves per Canadian byo-yomi period. KI ("Integer Komi") gives twice the komi. SY ("System") gives the Go editor version (e.g. "Cgoban 1.9.2", "Written by GoBase-0.0.9", "Converted by sgf2misc-2.7.2").

BP and WP form a solution to what to do in the case of multiple players. One writes BP[player1][player2][player3]. A different solution uses PI ("People Involved") and labels: PI[B1 : Takahara Shuuji : 3p][W1 : Hasegawa Sunao : 9p].

There is an FF[5] proposal. Of the files investigated, 392 claimed FF[5]. (Rough distribution: no FF: 880000, FF[4]: 606000, FF[3]: 187000, FF[1]: 4200, FF[5]: 392, FF[03]:4.) These were roughly the same files that used AX (a small integer with unknown meaning), PI (as above) and OT with a similar labeled style, e.g. OT[TM:A:10800][TG:1][OP:A:60][OM:A:1][NO:-1].

The Aya program uses T for "time used for the present move". (It measures in seconds, which is not optimal, since there are frequent 0's. Giving one decimal when the number of seconds is less than 10, say, is probably better.) However, in most of the files investigated T had values like 8h, with T an alias for TM, or values like Region:Korea.

GoWrite is an SGF editor, and WV gives the goWriteVersion, e.g. WV[1.4p,Unregistered version]. WX gives the goWriteeXtension, e.g. WX[DefPic=001].

Occurrences in non-rootnodes

We list the 32 properties that occurred outside the rootnode at least 600 times. (This cutoff is chosen so as to get rid of typos and properties like PB, PW, SZ, KM, RE that should have been in the rootnode.)

172139927 B
171372522 W
1754547 C
373777 LB
162999 N
93424 CR
86632 BL
86590 WL
69765 TR
58734 MN
35589 AB
33034 AW
18993 OB
18973 RN
18769 PT
18302 OW
15380 AE
10703 MA
10644 ID
9201 L
8439 SQ
8153 TW
8013 TB
5299 SI
5220 TL
3501 BM
1993 M
1066 E
872 FG
695 WS
695 BS
652 V


The top of the table is as expected: lots of moves B, W, a few more Black moves than White. On average 102.3 Black moves per game and 101.9 White moves per game, for an average game length of 204 moves.
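
These averages follow from the counts in the table and the sample size of 1682246 files, as the following check shows (the variable names are chosen here):

```python
files = 1682246            # number of SGF files in the sample
black_moves = 172139927    # occurrences of B outside the rootnode
white_moves = 171372522    # occurrences of W outside the rootnode

average_black = black_moves / files
average_white = white_moves / files
print(round(average_black, 1), round(average_white, 1),
      round(average_black + average_white))
```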

Comments C, markup with LB, N, CR, TR, MN, MA, SQ, M, time left and byo-yomi periods left BL, WL, OB, OW.

The AB and AW were mostly mistakes (these should have been B, W or should have been in the rootnode). Some of the AB, AW, AE come from an old style of handling variations, where first the actual move is given, and then variations are noted as children of that node (instead of siblings), by undoing the actual move with AE, and adding the variation move with AB or AW. (Such a node ;AB[xy]AE[zw] shifts the node numbering by 1, and is undesirable for many other reasons as well. Many viewers do not support it.)

RN and PT belong to a certain dialect of SGF2 and are found at the start of a variation. ID is used by some to number the moves. L gives a labeling letter. TW and TB are the territories given at the end of the game. SI is an FF[3] property that gives a position marked with a sigma. I do not know what the nonroot TL is. One sees some moves marked with TL[0,0]BL[0].

Judgments BM ("Bad Move"), GB ("Good for Black"), GW ("Good for White"), IT ("Interesting move"), EX ("Expected move"), DO ("Doubtful move") are not seen very often (in this sample of SGF files).

Property E gives (in its own node) the stones captured by the preceding move. E.g. ;B[io];E[ip] and ;W[dj];E[dk][ek]. (It can be viewed as a command to empty these positions.) Such a property is useful only for very weak viewers, that cannot compute chains and liberties themselves. But if used they should have been in the same node as the capturing move.

FG ("Figure"), V ("Value"). I do not know what the nonroot BS, WS are, probably mistakes.

