Preamble

SEP: 0011
Title: Txrep: human-readable low-level representation of Stellar transactions
Author: David Mazières
Status: Active
Created: 2018-08-31
Updated: 2021-10-09
Version: 1.1.0

Simple Summary

Txrep is a human-readable representation of Stellar transactions that functions like an assembly language for XDR.

Abstract

This document specifies txrep, a human-readable format for Stellar transactions. Txrep is unambiguous and machine-parsable. Binary XDR transactions can be decompiled into txrep format and recompiled to the exact same binary bytes. Txrep is designed to be parsed directly into data structures generated by XDR compilers, ideally by the very XDR code generated by those compilers.

Motivation

Bug reports, test vectors, semantic specifications in CAP documents, and Stellar documentation all need a way to talk concretely about the contents of transactions. Without a canonical textual format, different documents will each devise their own way of describing transactions, sometimes ambiguously, making it hard to relate information from multiple sources.

Furthermore, advanced Stellar users and developers need a way to craft arbitrary transactions, including potentially invalid ones for test cases. This functionality is currently available from the Stellar laboratory transaction builder, but that is not a good solution for high security, as it requires a browser (and in practice requires trusting an HTTPS certificate). Moreover, the transactions one builds in a web browser cannot easily be audited before signing, or described in documentation, or scripted, or placed under version control.

Finally, "dumb" ink-on-paper contracts may need to specify Stellar transactions unambiguously. For example, a legal contract promising to deliver tokens with a lock-up period in exchange for a wire transfer may want to specify the exact lock-up mechanism used through a human readable description of the Stellar transactions involved. Using txrep, this description can be unambiguous.

Specification

A txrep file consists of a number of lines, each describing the value of a field. We describe the line format using a simple BNF-like notation in which brackets ([...]) indicate optional contents, pipe (|) indicates alternatives, and an asterisk (*) indicates zero or more repetitions of the previous symbol. Literal text (e.g., :, ., [, ], ._present, .len, or _) indicates the occurrence of those specific characters in the txrep source.

Each line of txrep has the following format:

line = field : SP* value [comment] LF | : comment LF | LF

comment = zero or more characters other than LF

SP = space (ASCII 32)

LF = newline (ASCII 10)

Blank lines are allowed, and any line starting with a colon (:) is a full-line comment. The field and value formats are described below.
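As a rough illustration of the line grammar above, the following Python sketch splits one line into its field and the text that follows the colon. The name split_line is hypothetical and not part of any existing txrep library. The returned text still contains any trailing comment, because separating a value from its comment depends on the type of the value.

```python
from typing import Optional, Tuple


def split_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one txrep line into (field, rest).

    Returns None for blank lines and for full-line comments (lines that
    start with a colon). "rest" is the value plus any trailing comment.
    """
    line = line.rstrip("\n")
    if line == "" or line.startswith(":"):
        return None
    # Fields never contain a colon, so the first colon ends the field.
    field, sep, rest = line.partition(":")
    if not sep:
        raise ValueError(f"missing ':' in txrep line: {line!r}")
    # The grammar allows any number of spaces (SP*) before the value.
    return field, rest.lstrip(" ")
```

For instance, split_line("tx.fee: 100") yields the pair ("tx.fee", "100"), while a line such as ": note" is skipped.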

Fields

A field has the following syntax:

field = tag selector* [pseudoselector]

selector = . tag | [ integer ]

tag = letter wordchar*

pseudoselector = . _inner* _present | .len

integer = A decimal integer

letter = Any letter

wordchar = Any letter, any digit, or _

Disregarding pseudoselector, each field names a field in an XDR TransactionEnvelope data structure. The name is generated by joining XDR field names with a period (.), similar to the syntax used for accessing nested fields in C++ or Go representations of XDR data structures. Array elements (for both fixed- and variable-length arrays) are indexed using square brackets. As in C, arrays are 0-based. Pseudoselectors of the form _inner_present (or _inner_inner_present, etc.) are reserved for fields of type pointer to pointer. Such types do not appear in the current version of TransactionEnvelope, but might in the future, or txrep might be applied to other XDR data types that contain such nested pointers.

As an example, the field tx.timeBounds.minTime names the minTime field in the tx field of a TransactionEnvelope, as follows:

typedef uint64 TimePoint;
struct TimeBounds {
    TimePoint minTime;
    TimePoint maxTime; // 0 here means no maxTime
};
struct Transaction {
    /* ... */
    TimeBounds* timeBounds;
    Operation operations<100>;
    /* ... */
};
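The field grammar can be tokenized with two regular expressions. The sketch below is only an illustration: parse_field is a hypothetical name, "letter" is assumed to mean an ASCII letter, and pseudoselectors (_present, _inner_present, len) are returned as ordinary strings without checking that they appear last.

```python
import re
from typing import List, Union

# The leading tag: a letter followed by word characters.
_FIRST_TAG = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# A selector is ".tag" (optionally starting with "_" for pseudoselectors)
# or "[integer]".
_SELECTOR = re.compile(r"\.(_?[A-Za-z][A-Za-z0-9_]*)|\[([0-9]+)\]")


def parse_field(field: str) -> List[Union[str, int]]:
    """Break a txrep field into names (str) and array indices (int)."""
    first = _FIRST_TAG.match(field)
    if first is None:
        raise ValueError(f"field must start with a tag: {field!r}")
    path: List[Union[str, int]] = [first.group(0)]
    pos = first.end()
    while pos < len(field):
        m = _SELECTOR.match(field, pos)
        if m is None:
            raise ValueError(f"bad selector at offset {pos} in {field!r}")
        if m.group(1) is not None:
            path.append(m.group(1))
        else:
            path.append(int(m.group(2)))
        pos = m.end()
    return path
```

With this sketch, "tx.operations[0].body.type" becomes a list of the names tx, operations, body, and type, with the integer 0 in place of the bracketed index.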

Pointers and variable-length arrays use pseudoselectors to describe their state.

Pointers use the ._present pseudoselector with value true or false to indicate that the field is present or NULL, respectively. For example, a transaction with timebounds might be specified like this (using comments to annotate the times):

tx.timeBounds._present: true
tx.timeBounds.minTime: 1535756672 (Fri Aug 31 16:04:32 PDT 2018)
tx.timeBounds.maxTime: 1567292672 (Sat Aug 31 16:04:32 PDT 2019)

A transaction without timebounds would contain this line:

tx.timeBounds._present: false

For fields of type pointer-to-pointer, the pseudoselector ._present describes whether the outermost pointer is NULL or not. If ._present is true, then ._inner_present describes whether the nested pointer is NULL or not. _inner can be repeated as many times as needed for arbitrarily nested pointers.

Variable-length arrays use the pseudo-selector .len with an integer value to indicate the number of elements of the variable-length array. Only indices from 0 to len-1 are meaningful. For example, a transaction with one operation might look like this:

tx.operations.len: 1
tx.operations[0].sourceAccount._present: false
tx.operations[0].body.type: PAYMENT
...

As with any XDR code, implementations must not pre-allocate arrays of size len, since doing so invites memory exhaustion attacks. They should instead repeatedly double arrays while deserializing txrep (thereby achieving amortized linear running time without allocating memory that is more than a constant factor larger than the size of the txrep).

Versioned structures

For backwards compatibility, a few versioned structures are inlined as if their fields belonged to the parent datatype. An example of such a structure is TransactionV1Envelope, defined in the following XDR:

union TransactionEnvelope switch (EnvelopeType type) {
case ENVELOPE_TYPE_TX_V0:
    TransactionV0Envelope v0;
case ENVELOPE_TYPE_TX:
    TransactionV1Envelope v1;
case ENVELOPE_TYPE_TX_FEE_BUMP:
    FeeBumpTransactionEnvelope feeBump;
};

struct TransactionV1Envelope {
    Transaction tx;
    DecoratedSignature signatures<20>;
};

When such structures are declared in fields whose names consist of v followed by decimal digits, the selector (e.g., v1) is omitted from txrep, yielding the following:

type: ENVELOPE_TYPE_TX
tx.sourceAccount: GAVRMS4QIOCC4QMOSKILOOOHCSO4FEKOXZPNLKFFN6W7SD2KUB7NBPLN
tx.fee: 100
...

Without this rule, the fields would be called tx.v1.sourceAccount, etc.

The complete list of versioned structures is:

  • All types that match the regular expression TransactionV[0-9]+Envelope.
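A generator could apply the inlining rule while it walks the XDR structure. The sketch below assumes the walker can supply (field name, XDR type name) pairs. That interface and the name render_path are hypothetical.

```python
import re
from typing import List, Tuple

# Versioned structure types, per the list above.
_VERSIONED_TYPE = re.compile(r"TransactionV[0-9]+Envelope")
# Field names of the form "v" followed by decimal digits.
_VERSION_FIELD = re.compile(r"v[0-9]+")


def render_path(steps: List[Tuple[str, str]]) -> str:
    """Join field names with '.', omitting selectors of inlined structures."""
    names = []
    for name, type_name in steps:
        if _VERSION_FIELD.fullmatch(name) and _VERSIONED_TYPE.fullmatch(type_name):
            continue  # e.g. drop "v1" for a TransactionV1Envelope
        names.append(name)
    return ".".join(names)
```

Under these assumptions the path v1, tx, fee is rendered as tx.fee, whereas feeBump is kept because FeeBumpTransactionEnvelope does not match the regular expression.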

Values

Most XDR types are rendered using C syntax. Specifically:

Integers

All integers (signed and unsigned, 32- and 64-bit) are represented as C integers (decimal by default, but prefix 0x can be used for hex and 0 for octal).
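A minimal sketch of reading such an integer follows. The name parse_c_int and its bits/signed parameters are hypothetical. The sketch assumes that the first whitespace-delimited token is the value and that anything after it is a comment.

```python
import re

# Optional minus sign, then hex (0x...), octal (leading 0), or decimal.
_C_INT = re.compile(r"(-?)(?:0[xX]([0-9a-fA-F]+)|(0[0-7]*)|([1-9][0-9]*))")


def parse_c_int(text: str, bits: int = 64, signed: bool = False) -> int:
    """Parse a C-style integer literal and range-check it."""
    token = text.split(None, 1)[0] if text.strip() else ""
    m = _C_INT.fullmatch(token)
    if m is None:
        raise ValueError(f"not a C integer: {token!r}")
    sign, hex_digits, oct_digits, dec_digits = m.groups()
    if hex_digits is not None:
        value = int(hex_digits, 16)
    elif oct_digits is not None:
        value = int(oct_digits, 8)
    else:
        value = int(dec_digits, 10)
    if sign:
        value = -value
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    if not lo <= value <= hi:
        raise ValueError(f"{token!r} is out of range for this integer type")
    return value
```

Python's built-in int(text, 0) is not used here because it rejects C-style octal literals such as 0144.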

Booleans

bool values are true or false.

Enums

Enums are represented by the bare keyword of the value. They can also be specified numerically as "Type#Number" (e.g., MemoType#3 is equivalent to MEMO_HASH).
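The two enum spellings can be handled with a small lookup. In the sketch below the function names are hypothetical, the MEMO_TYPE table is written out by hand from the Stellar MemoType enum (a real implementation would take it from generated XDR code), and the value is assumed to have had any comment stripped already.

```python
from typing import Dict

# Hand-written copy of the Stellar MemoType enum, for illustration only.
MEMO_TYPE: Dict[str, int] = {
    "MEMO_NONE": 0,
    "MEMO_TEXT": 1,
    "MEMO_ID": 2,
    "MEMO_HASH": 3,
    "MEMO_RETURN": 4,
}


def parse_enum(text: str, type_name: str, names: Dict[str, int]) -> int:
    """Accept either a bare keyword or the numeric form Type#Number."""
    if "#" in text:
        prefix, _, number = text.partition("#")
        if prefix != type_name or not number.isdigit():
            raise ValueError(f"bad numeric enum value: {text!r}")
        return int(number)
    if text not in names:
        raise ValueError(f"unknown {type_name} value: {text!r}")
    return names[text]


def render_enum(value: int, type_name: str, names: Dict[str, int]) -> str:
    """Prefer the symbolic name; fall back to Type#Number if there is none."""
    for name, number in names.items():
        if number == value:
            return name
    return f"{type_name}#{value}"
```

The numeric fallback in render_enum is what lets a tool display a transaction whose enum value has no symbolic name, such as a deliberately invalid test case.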

Strings

string values are represented as double-quoted interpreted string literals, in which non-ASCII bytes are represented with hex escapes ("\xff"), the " and \ characters can be escaped with another \ (e.g., "\\"), and \n designates a newline.
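The escaping rules can be sketched as a pair of functions. The names are hypothetical, and two choices are assumptions rather than requirements of this document: control characters other than newline are written as hex escapes, and everything after the closing quote is treated as a comment.

```python
import string


def render_string(data: bytes) -> str:
    """Render raw bytes as a double-quoted txrep string literal."""
    out = ['"']
    for b in data:
        ch = chr(b)
        if ch in '"\\':
            out.append("\\" + ch)
        elif ch == "\n":
            out.append("\\n")
        elif 0x20 <= b <= 0x7E:
            out.append(ch)
        else:
            out.append(f"\\x{b:02x}")  # non-ASCII and control bytes
    out.append('"')
    return "".join(out)


def parse_string(text: str) -> bytes:
    """Parse a literal; text after the closing quote is ignored."""
    if not text.startswith('"'):
        raise ValueError("string value must start with a double quote")
    out = bytearray()
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == '"':
            return bytes(out)
        if ch != "\\":
            out += ch.encode("utf-8")
            i += 1
            continue
        esc = text[i + 1:i + 2]
        if esc == "x":
            digits = text[i + 2:i + 4]
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError("\\x must be followed by two hex digits")
            out.append(int(digits, 16))
            i += 4
        elif esc == "n":
            out.append(0x0A)
            i += 2
        elif esc:
            out += esc.encode("utf-8")  # \" and \\ stand for themselves
            i += 2
        else:
            raise ValueError("dangling backslash")
    raise ValueError("unterminated string literal")
```

Rendering and then parsing is intended to return the original bytes, which is the property needed for recompiling txrep to identical binary XDR.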

Opaque

opaque values are represented as an unquoted hexadecimal string (using lower-case a...f) with an even number of digits. An exception is that the 0-length opaque vector is represented as "0" (a single digit). Implementations are encouraged to add the comment "bytes" so that it reads "0 bytes" to further distinguish the 0-length vector from the vector with a single byte 0x00 (rendered "00").
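A short sketch of the opaque rules, including the zero-length special case, follows. The function names are hypothetical, and the parser assumes that the first whitespace-delimited token is the value and the remainder is a comment.

```python
import re

_LOWER_HEX = re.compile(r"[0-9a-f]*")


def render_opaque(data: bytes) -> str:
    """Render opaque bytes as lower-case hex, or '0 bytes' when empty."""
    return data.hex() if data else "0 bytes"


def parse_opaque(text: str) -> bytes:
    """Parse an opaque value, ignoring any trailing comment."""
    token = text.split(None, 1)[0] if text.strip() else ""
    if token == "0":
        return b""  # the zero-length vector
    if not token or len(token) % 2 or not _LOWER_HEX.fullmatch(token):
        raise ValueError(f"bad opaque value: {token!r}")
    return bytes.fromhex(token)
```

The odd-length check is what keeps "0" (no bytes) and "00" (one zero byte) distinct.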

Aggregate Values

A few aggregate values are special-cased:

AlphaNum4, AlphaNum12

The AlphaNum4 and AlphaNum12 struct types are both rendered as a string Code:IssuerAccountID.

"Code" must consist of printable ASCII characters (octets 0x21 through 0x7e).

The sequence \x introduces a hex escape sequence, e.g., \x00 to introduce a 0-valued byte. Otherwise, \ escapes the next character, so \\ is required to introduce a backslash.

Stellar disallows assets of type ASSET_TYPE_CREDIT_ALPHANUM12 that have fewer than 5 bytes, but such assets can be represented in binary XDR, and so in txrep an AlphaNum12 value with fewer than 5 bytes in its assetCode is rendered with trailing \x00 (escaped NUL bytes) so as to make the length 5; e.g., the AlphaNum12 asset code ABC is rendered ABC\x00\x00. Note that stellar-core disallows non-ASCII bytes in AssetCode fields, so the primary use of this feature is to construct or examine invalid transaction test cases.
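The rendering of the Code part can be sketched as follows. The name render_code is hypothetical. The choice of which bytes to escape (backslash, colon, and anything outside 0x21 through 0x7e) follows the normalized-form rule given later in this document.

```python
def render_code(raw: bytes, alphanum12: bool) -> str:
    """Render the fixed-size assetCode bytes of an AlphaNum4 or AlphaNum12."""
    # Trailing NUL padding of the fixed-length field is not shown...
    code = raw.rstrip(b"\x00")
    # ...except that an AlphaNum12 code is padded back up to 5 bytes so it
    # cannot be confused with an AlphaNum4 code.
    if alphanum12 and len(code) < 5:
        code = code + b"\x00" * (5 - len(code))
    out = []
    for b in code:
        if b in (0x5C, 0x3A):  # backslash and colon
            out.append("\\" + chr(b))
        elif 0x21 <= b <= 0x7E:
            out.append(chr(b))
        else:
            out.append(f"\\x{b:02x}")
    return "".join(out)
```

A caller would append a colon and the issuer in strkey form to obtain the full Code:IssuerAccountID string.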

Asset, TrustLineAsset

The Asset and TrustLineAsset union types are rendered as a string in one of three forms depending on their discriminant:

  • native (or any string up to 12 characters not containing an unescaped colon) for the native asset, i.e. ASSET_TYPE_NATIVE.

  • Code:IssuerAccountID as defined for AlphaNum4 or AlphaNum12 for issued assets, i.e. ASSET_TYPE_CREDIT_ALPHANUM4 and ASSET_TYPE_CREDIT_ALPHANUM12.

  • LiquidityPoolID:lp for liquidity pool shares, i.e. assets of type ASSET_TYPE_POOL_SHARE.

    "LiquidityPoolID" is the hex encoded PoolID using a lowercase alphabet.

    "lp" is the string "lp".

AllowTrustOp

The asset field of AllowTrustOp is rendered the same as the Code in Asset, only without the trailing ":IssuerAccountID" (since the issuer is the sourceAccount of the operation).
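To make the asset string forms above concrete, the sketch below classifies an Asset or TrustLineAsset value by looking for the first unescaped colon (an AllowTrustOp code is simply the part that would precede that colon). The function names and the returned tuples are hypothetical, the value is assumed to have had any comment stripped already, and the issuer is not validated as a strkey.

```python
import re
from typing import Optional, Tuple

_LOWER_HEX = re.compile(r"[0-9a-f]+")


def split_unescaped_colon(text: str) -> Tuple[str, Optional[str]]:
    """Split at the first colon that is not preceded by a backslash."""
    i = 0
    while i < len(text):
        if text[i] == "\\":
            i += 2  # skip the escaped character
            continue
        if text[i] == ":":
            return text[:i], text[i + 1:]
        i += 1
    return text, None


def classify_asset(text: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Return (kind, code or pool ID, issuer) for an asset string."""
    head, tail = split_unescaped_colon(text)
    if tail is None:
        if len(head) > 12:
            raise ValueError("native asset name is limited to 12 characters")
        return ("native", None, None)
    if tail == "lp":
        if not _LOWER_HEX.fullmatch(head):
            raise ValueError("pool ID must be lower-case hex")
        return ("pool_share", head, None)
    return ("issued", head, tail)
```

Telling AlphaNum4 from AlphaNum12 would additionally require unescaping the code and checking whether it is longer than 4 bytes. That step is omitted here.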

PublicKey, SignerKey, MuxedAccount

PublicKey, SignerKey, and MuxedAccount are rendered as unquoted strings in strkey format, as specified in SEP-0023.

Others

Any fields in the XDR TransactionEnvelope structure that are not specified in a txrep description are to be interpreted as false (for bool), zero (for numeric values, enums, and fixed-length opaque), or zero-length (for strings, variable-length arrays, and variable-length opaque). The _present pseudoselector, if unspecified, defaults to true if any field or value of the pointer's data structure is specified, and otherwise defaults to false.
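The default for _present can be computed from a key-value map of the fields that were actually given. The sketch below assumes such a map (field name to value text, comments already removed) and uses the hypothetical name is_present.

```python
from typing import Dict


def is_present(fields: Dict[str, str], pointer: str) -> bool:
    """Decide whether the pointer field named `pointer` is non-NULL."""
    explicit = fields.get(pointer + "._present")
    if explicit is not None:
        return explicit == "true"
    # No explicit pseudoselector: present if and only if something
    # inside the pointed-to value was specified.
    return any(
        key == pointer
        or key.startswith(pointer + ".")
        or key.startswith(pointer + "[")
        for key in fields
    )
```

Note that an explicit "false" wins even when fields of the pointed-to structure are also listed. This document does not spell out that case, so treat it as one possible reading.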

Normalized txrep

Fields in txrep are specified in an order-independent way. If a field appears twice, the second value overrides the first. This allows one to update a txrep-format transaction by appending lines to a file. However, in some cases, such as when disassembling binary transactions, it is useful to transform transactions into normalized form, for instance so two transactions can be more easily compared, or so users inspecting a transaction see a more predictable format.
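The override rule falls out naturally when a file is loaded into a dictionary, as the following sketch suggests. The name load_txrep is hypothetical, and values are kept as raw text (including any trailing comment) for later type-specific parsing.

```python
from typing import Dict


def load_txrep(text: str) -> Dict[str, str]:
    """Read txrep text into a field-to-value map; later lines win."""
    fields: Dict[str, str] = {}
    for line in text.split("\n"):
        if not line or line.startswith(":"):
            continue  # blank line or full-line comment
        field, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in txrep line: {line!r}")
        fields[field] = value.lstrip(" ")  # overwrites any earlier value
    return fields
```

Appending "tx.fee: 200" to a template that already contains "tx.fee: 100" therefore leaves the map with the fee set to 200.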

Normalized txrep format is a txrep format with the following additional restrictions:

  • Every field and pseudofield in the binary XDR transaction must appear exactly once in the description. Extraneous fields or fields that do not appear because of NULL pointers or incompatible union discriminants must not appear.

  • Fields in structs and unions must appear in the exact order they appear in the XDR file, which is also the order in which they are marshaled for XDR binary format. In particular, this also requires array elements to be listed in order from index 0 to len-1.

  • Pseudo-fields must appear immediately before the values they affect. In particular, the ptr._present: true field must appear immediately before the value of ptr, and vector.len must appear immediately before vector[0].

  • Codes in Asset and AllowTrustOp should escape \, :, and any byte outside the range 0x21-0x7e, but no other bytes. Trailing \x00 should not be shown except as needed to show an asset of type ASSET_TYPE_CREDIT_ALPHANUM12 with fewer than 5 non-zero bytes.

  • Enums must be shown as their symbolic value, rather than "Type#Number", unless there is no symbolic value.

  • The native asset should be rendered as XLM for the Stellar public network, TestXLM for the Stellar test network, and, in the absence of a convention for any other network, native.

One possible use of non-normalized txrep is to allow users to discover missing fields. A tool that allows users to construct transactions in txrep format can translate the transactions to normalized format to highlight any missing fields along with their default values. In contexts where users should not leave any fields missing, tools can refuse to accept non-normalized txrep.

Rationale

Txrep is designed to be easy to read into and generate from structures created by XDR compilers using code generated by XDR compilers. Writing txrep simply requires printing each field and value in an XDR structure instead of rendering it in binary format. Txrep can also trivially be read into a key-value map, as almost every language has functions to read files line-by-line and break strings at a specific character (to separate field and value at :). The map can then be consulted in XDR deserialization routines to read txrep.

Tailoring txrep specifically to XDR means it does not introduce a second native representation for transactions. Third party libraries for other popular formats such as JSON and YAML generally cannot be parsed directly into a TransactionEnvelope structure generated by an XDR compiler, in part because of XDR's tagged unions. Moreover, the ability to parse and generate txrep with nothing more than an XDR implementation reduces the attack surface of programs that process txrep.

Because each line of txrep is entirely self-contained, one can excerpt any subset of a transaction with no ambiguity. Because txrep takes the last value of a field, one can overwrite any previously set transaction field. This is convenient for scripts that may wish to tweak values by appending lines to a transaction template.

A few special cases (for signers, accounts, and assets) make the output easier for humans to process by providing compatibility with other tools.

The ._present pseudoselector was selected to avoid conflicting with fields in the same data structure, since the XDR specification disallows fields that start with _. There is no similar ambiguity possible for .len, whose only siblings are bracketed array indices.

Test Cases

The following binary transaction:

AAAAAgAAAAArFkuQQ4QuQY6SkLc5xxSdwpFOvl7VqKVvrfkPSqB+0AAAAGQApSmNAAAAAQAAAAEAAAAAW4nJgAAAAABdav0AAAAAAQAAABZFbmpveSB0aGlzIHRyYW5zYWN0aW9uAAAAAAABAAAAAAAAAAEAAAAAQF827djPIu+/gHK5hbakwBVRw03TjBN6yNQNQCzR97QAAAABVVNEAAAAAAAyUlQyIZKfbs+tUWuvK7N0nGSCII0/Go1/CpHXNW3tCwAAAAAX15OgAAAAAAAAAAFKoH7QAAAAQN77Tx+tHCeTJ7Va8YT9zd9z9Peoy0Dn5TSnHXOgUSS6Np23ptMbR8r9EYWSJGqFdebCSauU7Ddo3ttikiIc5Qw=

can be rendered like this (note that comments are optional and may contain implementation-dependent information):

type: ENVELOPE_TYPE_TX
tx.sourceAccount: GAVRMS4QIOCC4QMOSKILOOOHCSO4FEKOXZPNLKFFN6W7SD2KUB7NBPLN
tx.fee: 100
tx.seqNum: 46489056724385793
tx.timeBounds._present: true
tx.timeBounds.minTime: 1535756672 (Fri Aug 31 16:04:32 PDT 2018)
tx.timeBounds.maxTime: 1567292672 (Sat Aug 31 16:04:32 PDT 2019)
tx.memo.type: MEMO_TEXT
tx.memo.text: "Enjoy this transaction"
tx.operations.len: 1
tx.operations[0].sourceAccount._present: false
tx.operations[0].body.type: PAYMENT
tx.operations[0].body.paymentOp.destination: GBAF6NXN3DHSF357QBZLTBNWUTABKUODJXJYYE32ZDKA2QBM2H33IK6O
tx.operations[0].body.paymentOp.asset: USD:GAZFEVBSEGJJ63WPVVIWXLZLWN2JYZECECGT6GUNP4FJDVZVNXWQWMYI
tx.operations[0].body.paymentOp.amount: 400004000 (40.0004e7)
tx.ext.v: 0
signatures.len: 1
signatures[0].hint: 4aa07ed0 (GAVRMS4QIOCC4QMOSKILOOOHCSO4FEKOXZPNLKFFN6W7SD2KUB7NBPLN)
signatures[0].signature: defb4f1fad1c279327b55af184fdcddf73f4f7a8cb40e7e534a71d73a05124ba369db7a6d31b47cafd118592246a8575e6c249ab94ec3768dedb6292221ce50c
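As a partial cross-check of this test vector, the sketch below decodes only the leading part of the base64 envelope with the Python standard library and reads the first few XDR fields by hand. The byte offsets assume a V1 envelope whose source account is a plain ed25519 key, the strkey encoding (version byte 6 << 3, CRC16-XModem checksum stored little-endian, base32) is written from SEP-0023 as understood here, and all names are hypothetical.

```python
import base64
import binascii
import struct

# First 72 base64 characters (54 bytes) of the envelope shown above.
ENVELOPE_PREFIX = (
    "AAAAAgAAAAArFkuQQ4QuQY6SkLc5xxSdwpFOvl7VqKVvrfkPSqB+0AAAAGQApSmNAAAAAQAA"
)


def encode_account_id(key: bytes) -> str:
    """Encode a 32-byte ed25519 public key as a G... strkey (assumed layout)."""
    payload = bytes([6 << 3]) + key
    # binascii.crc_hqx with initial value 0 is the CRC16-XModem variant.
    checksum = struct.pack("<H", binascii.crc_hqx(payload, 0))
    return base64.b32encode(payload + checksum).decode("ascii")


raw = base64.b64decode(ENVELOPE_PREFIX)

# XDR integers are big-endian; the envelope starts with two discriminants.
envelope_type, key_type = struct.unpack(">II", raw[0:8])
source_key = raw[8:40]                        # 32-byte ed25519 key
(fee,) = struct.unpack(">I", raw[40:44])      # tx.fee (uint32)
(seq_num,) = struct.unpack(">q", raw[44:52])  # tx.seqNum (int64)

print("type:", envelope_type)
print("tx.sourceAccount:", encode_account_id(source_key))
print("tx.fee:", fee)
print("tx.seqNum:", seq_num)
print("hint:", source_key[-4:].hex())
```

If the assumptions hold, the printed values should agree with the type, tx.sourceAccount, tx.fee, tx.seqNum, and signatures[0].hint lines of the rendering above. A full decoder would of course be driven by the XDR definitions rather than fixed offsets.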

Implementations

Golang

Txrep is implemented by the Stellar transaction compiler, stc.

JavaScript

Txrep is implemented as a JavaScript library for browser or NodeJS, @stellarguard/txrep.

Flutter SDK

Txrep is implemented by the Flutter SDK. See Soneso/stellar_flutter_sdk.

iOS SDK

Txrep is implemented by the iOS SDK. See example.

PHP SDK

Txrep is implemented by the PHP SDK. See example.