Output format

Author: Adam Leszczyński <aleszczynski@bersler.com>, version: 1.9.0, date: 2026-02-17

Previous chapter: Architecture and components

Introduction

The output format is fully configurable. Two formats are implemented: JSON and Protocol Buffer. The program architecture makes it straightforward to add additional formats.

JSON format

JSON is the first implemented format and provides fast write performance. The stream is constructed directly from the redo log data. The JSON construction avoids dynamic memory allocation for the main stream object. Instead, the stream is written directly while redo log data is parsed. Internal tests show that the JSON writer is about 2.5× faster than the Protocol Buffer writer, although JSON output may be larger.

Response: scn_val

The field contains the SCN value associated with the payload data.

The value can be stored in:

  • Field "scn" and stored as a decimal number (default);

  • Field "scns" and stored as a string in C-style hexadecimal format (example: "scns":"0x0000008a33ac2263").

See: scn parameter for configuration details.
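A consumer that has to accept either representation could normalize the SCN to an integer. The following is a minimal sketch, not part of the program itself; the helper name `read_scn` and the idea of passing the decoded message as a dictionary are assumptions made for illustration.

```python
import json


def read_scn(message: dict) -> int:
    """Return the SCN as an integer from either the "scn" or the "scns" field."""
    if "scn" in message:
        # Decimal representation (the default).
        return int(message["scn"])
    if "scns" in message:
        # C-style hex string; int() with base 16 accepts the "0x" prefix.
        return int(message["scns"], 16)
    raise KeyError("message has neither 'scn' nor 'scns'")


# The hex example from the text, decoded from a JSON fragment.
hex_message = json.loads('{"scns":"0x0000008a33ac2263"}')
scn_from_hex = read_scn(hex_message)

# The same value carried in the decimal field yields the same integer.
scn_from_dec = read_scn({"scn": scn_from_hex})
```

Normalizing early keeps the rest of the consumer independent of which scn setting the producer was configured with.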

Response: tm_val

The field contains the timestamp associated with the payload data. By default, the timestamp of the commit record is used. It can be configured so that, when a transaction contains multiple DML operations, the timestamps of the individual operations can be distinguished.

The value can be stored in:

  • Field "tm" and stored using a number;

  • Field "tms" and stored as a string.

See: timestamp parameter for configuration details.

Response: xid_val

The field contains the transaction ID associated with the payload data. It is not present in checkpoint messages.

The value can be stored in:

  • Field "xid" and stored as a string in hex (default). An example value would be: "xid":"0x0009.003.0000568e".

  • Field "xid" and stored as a string in decimal, for example "xid":"9.3.22158".

  • Field "xidn" and stored as a decimal number, for example "xidn":22158.

See: xid parameter for configuration details.

Note
Internally, the transaction ID (XID) is stored using a 64-bit number.
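The two string forms of "xid" carry the same three dot-separated components, once in hex and once in decimal. The sketch below converts between them using the example values from the text. The function names are hypothetical, and it only assumes that a hex value starts with "0x" as in the example.

```python
def parse_xid(xid: str) -> tuple:
    """Split an "xid" string into its three integer components.

    Accepts the hex form ("0x0009.003.0000568e") and the decimal
    form ("9.3.22158").
    """
    parts = xid.split(".")
    if len(parts) != 3:
        raise ValueError(f"unexpected xid format: {xid!r}")
    # Assumption: a leading "0x" marks the whole value as hexadecimal.
    base = 16 if parts[0].lower().startswith("0x") else 10
    return tuple(int(part, base) for part in parts)


def format_xid_decimal(xid: str) -> str:
    """Render any accepted "xid" string in the decimal form."""
    return ".".join(str(component) for component in parse_xid(xid))


components = parse_xid("0x0009.003.0000568e")
decimal_form = format_xid_decimal("0x0009.003.0000568e")
```

Comparing parsed components rather than raw strings lets a consumer match transactions regardless of the configured xid format.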

Response: db

The db field contains the database name.

See: db parameter for configuration details.

Response: payload.op

The op field contains a string describing the type of the operation. The following operation types are supported:

  • "begin" — begin transaction record;

  • "commit" — commit transaction record;

  • "c" — create record — field would represent INSERT DML operation;

  • "u" — update record — field would represent UPDATE DML operation;

  • "d" — delete record — field would represent DELETE DML operation;

  • "ddl" — DDL operation;

  • "chkpt" — checkpoint record.
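A consumer typically branches on the op value. The sketch below encodes the list above as a lookup table. The helper names and the grouping into "DML" versus everything else are illustrative choices, not something the program defines.

```python
# Operation codes as listed above, mapped to a short description.
OP_DESCRIPTIONS = {
    "begin": "begin transaction",
    "commit": "commit transaction",
    "c": "INSERT",
    "u": "UPDATE",
    "d": "DELETE",
    "ddl": "DDL operation",
    "chkpt": "checkpoint",
}

# The three codes that represent row-level changes.
DML_OPS = frozenset({"c", "u", "d"})


def describe_op(op: str) -> str:
    """Return a readable description, rejecting unknown codes."""
    try:
        return OP_DESCRIPTIONS[op]
    except KeyError:
        raise ValueError(f"unsupported operation type: {op!r}") from None


def is_dml(op: str) -> bool:
    """True for create, update, and delete records."""
    return op in DML_OPS


dml_codes = [op for op in OP_DESCRIPTIONS if is_dml(op)]
```

Rejecting unknown codes explicitly makes it obvious when a newer producer emits an operation type the consumer does not yet handle.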

Response: payload.schema

The schema field is present only in DML operations and contains an object with information about the schema.

The schema object has the following fields:

  • "owner" — owner of the schema, optional field, may not be present when schemaless mode is used;

  • "table" — name of the table, in case of schemaless mode the value is OBJ_xxx, where xxx is the object identifier;

  • "obj" — object identifier of the table;

  • "columns" — array of columns (described below).

Response: payload.schema.columns

The schema.columns field is an array of objects, each object describing one column.

The following fields are present in the column object:

  • "name" — name of the column;

  • "type" — type of the column;

  • "length" — length of the column, present for varchar2, raw, char, timestamp, timestamp with time zone, interval year to month, interval day to second, urowid, timestamp with local time zone types;

  • "precision" — precision of the column, present for number type;

  • "scale" — scale of the column, present for number type;

  • "nullable" — true if the column is nullable, false otherwise;
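To illustrate how the optional column fields combine, the sketch below renders a column object as a short type description. The sample schema is invented for this example (owner, table, object identifier, and column values are all hypothetical), and the exact type strings emitted by the program may differ.

```python
def describe_column(col: dict) -> str:
    """Render one column object as 'name type(size) [not null]'."""
    text = f'{col["name"]} {col["type"]}'
    if "length" in col:
        # Present for character, raw, timestamp, and interval types.
        text += f'({col["length"]})'
    elif "precision" in col:
        # Present for the number type, optionally with a scale.
        if "scale" in col:
            text += f'({col["precision"]},{col["scale"]})'
        else:
            text += f'({col["precision"]})'
    if not col.get("nullable", True):
        text += " not null"
    return text


# Hypothetical schema object; all values are made up for illustration.
sample_schema = {
    "owner": "APP",
    "table": "CUSTOMERS",
    "obj": 12345,
    "columns": [
        {"name": "ID", "type": "number", "precision": 10, "scale": 0, "nullable": False},
        {"name": "NAME", "type": "varchar2", "length": 30, "nullable": True},
    ],
}

descriptions = [describe_column(col) for col in sample_schema["columns"]]
```

Checking for the presence of "length", "precision", and "scale" rather than switching on the type name keeps the helper tolerant of types it has not seen.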

Response: payload.rid

The field contains the row identifier (row ID, rid) of the row.

See: rid parameter for configuration details.

Response: payload.before

The before field contains the old values of the columns. It is present only in update and delete operations. The field is an array of objects, each object describing one column.

Caution

Only data present in the redo log appears in the output. For update operations, a column may be missing from the list if its value did not change.

See: column parameter for configuration details.

Response: payload.after

The after field contains the new values of the columns. It is present only in insert and update operations. The field is an array of objects, each object describing one column.

Caution

Only data present in the redo log appears in the output. For update operations, a column may be missing from the list if its value did not change.

See: column parameter for configuration details.
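Because unchanged columns can be absent from an update, a consumer that maintains full rows has to overlay the after values on the row it already holds. The sketch below shows one way to do that. It assumes each entry in before or after maps a column name to its value (the exact shape of the entries is not specified here), and it also tolerates a single object in place of the array. All names and sample values are hypothetical.

```python
from typing import Optional


def to_mapping(field) -> dict:
    """Flatten a before/after field into one {column: value} dict.

    Assumption: each array entry is an object mapping a column name
    to its value. A plain object is accepted as well.
    """
    if field is None:
        return {}
    if isinstance(field, dict):
        return dict(field)
    merged = {}
    for entry in field:
        merged.update(entry)
    return merged


def apply_change(previous_row: dict, payload: dict) -> Optional[dict]:
    """Return the row after applying one DML payload, or None if deleted."""
    op = payload["op"]
    if op == "d":
        return None
    # An update starts from the known row, so columns missing from
    # "after" keep their previous values; an insert starts empty.
    row = dict(previous_row) if op == "u" else {}
    row.update(to_mapping(payload.get("after")))
    return row


known_row = {"ID": 1, "NAME": "old", "CITY": "Gdansk"}
update = {"op": "u", "before": [{"NAME": "old"}], "after": [{"NAME": "new"}]}
updated_row = apply_change(known_row, update)
```

Here only NAME appears in the update, so ID and CITY are carried over from the previously known row.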

Response: payload.ddl

The field contains the text of the DDL statement.

The DDL payload elements are not present by default.

See: flags parameter for configuration details.

Response: payload.seq

The field is only present for checkpoint messages. It contains the sequence number of the redo log file.

Response: payload.offset

The field is only present for checkpoint messages. It contains the byte offset in the redo log file associated with the checkpoint record.

Response: payload.redo

The field is only present for checkpoint messages. It contains the value 1 for checkpoint messages related to a redo log file switch.
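The three checkpoint fields (seq, offset, redo) can be collected into a single position record, for example to remember how far a consumer has processed. This is a minimal sketch with hypothetical names and sample values. It treats a missing redo field the same as any value other than 1, which is an assumption.

```python
from typing import NamedTuple, Optional


class Checkpoint(NamedTuple):
    seq: int          # sequence number of the redo log file
    offset: int       # byte offset within that redo log file
    log_switch: bool  # True when "redo" is 1 (redo log file switch)


def read_checkpoint(payload: dict) -> Optional[Checkpoint]:
    """Extract checkpoint details, or None for non-checkpoint payloads."""
    if payload.get("op") != "chkpt":
        return None
    return Checkpoint(
        seq=payload["seq"],
        offset=payload["offset"],
        # Assumption: "redo" may be absent when no log switch occurred.
        log_switch=payload.get("redo") == 1,
    )


# Hypothetical payloads; the numbers are made up for illustration.
plain = read_checkpoint({"op": "chkpt", "seq": 17, "offset": 4096})
switch = read_checkpoint({"op": "chkpt", "seq": 18, "offset": 0, "redo": 1})
not_checkpoint = read_checkpoint({"op": "commit"})
```

Because the record is a tuple ordered by seq and then offset, two checkpoints can be compared directly to find the later position.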

Response: payload.num

The field contains a consecutive number of the payload data.

See: message parameter for configuration details.

Protocol Buffer format

Protocol Buffer is the second implemented format. The field types and names are the same as in the JSON format, so they are not described again here. The writer for this format constructs objects table by table, column by column, and field by field, and then serializes them to the output stream. Because every field is allocated separately, memory consumption is higher than with the JSON writer, and internal tests show that generating the stream takes about 2.5 times longer.

Next chapter: Output target