
Alternative and custom formats (experimental)

This is the sixth chapter of the Kotlin Serialization Guide. It goes beyond JSON, covering alternative and custom formats. Unlike JSON, which is stable, these are currently experimental features of Kotlin Serialization.

CBOR (experimental)

CBOR is one of the standard compact binary encodings for JSON, so it supports a subset of JSON features and is generally very similar to JSON in use, but produces binary data.

CBOR support is (experimentally) available in a separate org.jetbrains.kotlinx:kotlinx-serialization-cbor:<version> module.

The Cbor class has Cbor.encodeToByteArray and Cbor.decodeFromByteArray functions. Let us take the basic example from the JSON encoding, but encode it using CBOR.

@Serializable
data class Project(val name: String, val language: String)

fun main() {
    val data = Project("kotlinx.serialization", "Kotlin") 
    val bytes = Cbor.encodeToByteArray(data)   
    println(bytes.toAsciiHexString())
    val obj = Cbor.decodeFromByteArray<Project>(bytes)
    println(obj)
}

You can get the full code here.

We print a filtered ASCII representation of the output, writing non-ASCII data in hex, so we see how all the original strings are directly represented in CBOR, but the format delimiters themselves are binary.

{BF}dnameukotlinx.serializationhlanguagefKotlin{FF}
Project(name=kotlinx.serialization, language=Kotlin)
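
The toAsciiHexString helper called in the examples is not defined in this chapter. A minimal sketch of what such a helper could look like follows; the implementation is an assumption written to match the filtered ASCII output shown above, not necessarily the guide's own code.

```kotlin
// Hypothetical stand-in for the toAsciiHexString helper used by the examples.
// Printable ASCII bytes are shown as characters; every other byte as {XX} in uppercase hex.
fun ByteArray.toAsciiHexString(): String = joinToString("") { byte ->
    val code = byte.toInt() and 0xFF          // treat the byte as unsigned
    if (code in 32..126) Char(code).toString() else "{%02X}".format(code)
}

fun main() {
    val sample = byteArrayOf(0xBF.toByte(), 0x64, 0x6E, 0x61, 0x6D, 0x65, 0xFF.toByte())
    println(sample.toAsciiHexString()) // {BF}dname{FF}
}
```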

In CBOR hex notation, the output is equivalent to the following:

BF                                      # map(*)
   64                                   # text(4)
      6E616D65                          # "name"
   75                                   # text(21)
      6B6F746C696E782E73657269616C697A6174696F6E # "kotlinx.serialization"
   68                                   # text(8)
      6C616E6775616765                  # "language"
   66                                   # text(6)
      4B6F746C696E                      # "Kotlin"
   FF                                   # primitive(*)
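
The annotations in the listing above can be reproduced by splitting each initial byte into its two fields: the top three bits select the major type and the low five bits carry the additional information (31 marks an indefinite length or the "break" code). The sketch below uses only the Kotlin standard library; the function names are illustrative, and it deliberately does not handle additional information values 24 to 27, where the length follows in later bytes.

```kotlin
// Names of the eight CBOR major types, indexed by the top three bits of the initial byte.
val majorTypeNames = listOf(
    "unsigned", "negative", "bytes", "text", "array", "map", "tag", "primitive"
)

// Splits a CBOR initial byte into (major type, additional information).
fun cborHead(initialByte: Int): Pair<Int, Int> =
    Pair((initialByte and 0xFF) shr 5, initialByte and 0x1F)

// Renders an initial byte in the style of the listing; 31 is shown as "*".
fun describe(initialByte: Int): String {
    val (major, info) = cborHead(initialByte)
    val argument = when {
        info < 24 -> info.toString()   // value or length stored directly in the byte
        info == 31 -> "*"              // indefinite length, or "break" for major type 7
        else -> "?"                    // length follows in later bytes (not handled here)
    }
    return "${majorTypeNames[major]}($argument)"
}

fun main() {
    for (b in listOf(0xBF, 0x64, 0x75, 0x68, 0x66, 0xFF)) {
        println("%02X -> %s".format(b, describe(b)))
    }
}
```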

Note that CBOR, unlike JSON, supports maps with non-trivial keys (see the Allowing structured map keys section for JSON workarounds). Kotlin maps are serialized as CBOR maps, but some parsers (such as jackson-dataformat-cbor) do not support this.

Ignoring unknown keys

The CBOR format is often used to communicate with IoT devices, where new properties could be added as part of a device’s API evolution. By default, unknown keys encountered during deserialization produce an error. This behavior can be configured with the ignoreUnknownKeys property.

val format = Cbor { ignoreUnknownKeys = true }

@Serializable
data class Project(val name: String)

fun main() {
    val data = format.decodeFromHexString<Project>(
        "bf646e616d65756b6f746c696e782e73657269616c697a6174696f6e686c616e6775616765664b6f746c696eff"
    )
    println(data)
}

You can get the full code here.

It decodes the object, despite the fact that Project is missing the language property.

Project(name=kotlinx.serialization)

In CBOR hex notation, the input is equivalent to the following:

BF                                      # map(*)
   64                                   # text(4)
      6E616D65                          # "name"
   75                                   # text(21)
      6B6F746C696E782E73657269616C697A6174696F6E # "kotlinx.serialization"
   68                                   # text(8)
      6C616E6775616765                  # "language"
   66                                   # text(6)
      4B6F746C696E                      # "Kotlin"
   FF                                   # primitive(*)

Byte arrays and CBOR data types

Per the RFC 7049 Major Types section, CBOR supports the following data types:

  • Major type 0: an unsigned integer
  • Major type 1: a negative integer
  • Major type 2: a byte string
  • Major type 3: a text string
  • Major type 4: an array of data items
  • Major type 5: a map of pairs of data items
  • Major type 6: optional semantic tagging of other major types
  • Major type 7: floating-point numbers and simple data types that need no content, as well as the “break” stop code

By default, Kotlin ByteArray instances are encoded as major type 4 (an array of data items). When major type 2 (a byte string) is desired, the @ByteString annotation can be used.

@Serializable
data class Data(
    @ByteString
    val type2: ByteArray, // CBOR Major type 2
    val type4: ByteArray  // CBOR Major type 4
)        

fun main() {
    val data = Data(byteArrayOf(1, 2, 3, 4), byteArrayOf(5, 6, 7, 8)) 
    val bytes = Cbor.encodeToByteArray(data)   
    println(bytes.toAsciiHexString())
    val obj = Cbor.decodeFromByteArray<Data>(bytes)
    println(obj)
}

You can get the full code here.

As we see, the CBOR byte that precedes the data is different for different types of encoding.

{BF}etype2D{01}{02}{03}{04}etype4{9F}{05}{06}{07}{08}{FF}{FF}
Data(type2=[1, 2, 3, 4], type4=[5, 6, 7, 8])

In CBOR hex notation, the output is equivalent to the following:

BF               # map(*)
   65            # text(5)
      7479706532 # "type2"
   44            # bytes(4)
      01020304   # "\x01\x02\x03\x04"
   65            # text(5)
      7479706534 # "type4"
   9F            # array(*)
      05         # unsigned(5)
      06         # unsigned(6)
      07         # unsigned(7)
      08         # unsigned(8)
      FF         # primitive(*)
   FF            # primitive(*)

ProtoBuf (experimental)

Protocol Buffers is a language-neutral binary format that normally relies on a separate “.proto” file that defines the protocol schema. It is more compact than CBOR, because it assigns integer numbers to fields instead of names.

Protocol buffers support is (experimentally) available in a separate org.jetbrains.kotlinx:kotlinx-serialization-protobuf:<version> module.

Kotlin Serialization uses proto2 semantics, where all fields are explicitly required or optional. For a basic example, we change our example to use the ProtoBuf class with its ProtoBuf.encodeToByteArray and ProtoBuf.decodeFromByteArray functions.

@Serializable
data class Project(val name: String, val language: String)

fun main() {
    val data = Project("kotlinx.serialization", "Kotlin") 
    val bytes = ProtoBuf.encodeToByteArray(data)   
    println(bytes.toAsciiHexString())
    val obj = ProtoBuf.decodeFromByteArray<Project>(bytes)
    println(obj)
}

You can get the full code here.

{0A}{15}kotlinx.serialization{12}{06}Kotlin
Project(name=kotlinx.serialization, language=Kotlin)

In ProtoBuf hex notation, the output is equivalent to the following:

Field #1: 0A String Length = 21, Hex = 15, UTF8 = "kotlinx.serialization"
Field #2: 12 String Length = 6, Hex = 06, UTF8 = "Kotlin"

Field numbers

By default, field numbers in the Kotlin Serialization ProtoBuf implementation are automatically assigned, which does not provide the ability to define a stable data schema that evolves over time. That is normally achieved by writing a separate “.proto” file. However, with Kotlin Serialization we can get this ability without a separate schema file by using the ProtoNumber annotation instead.

@Serializable
data class Project(
    @ProtoNumber(1)
    val name: String, 
    @ProtoNumber(3)
    val language: String
)

fun main() {
    val data = Project("kotlinx.serialization", "Kotlin") 
    val bytes = ProtoBuf.encodeToByteArray(data)   
    println(bytes.toAsciiHexString())
    val obj = ProtoBuf.decodeFromByteArray<Project>(bytes)
    println(obj)
}

You can get the full code here.

We see in the output that the number for the first property name did not change (as it is numbered from one by default), but it did change for the language property.

{0A}{15}kotlinx.serialization{1A}{06}Kotlin
Project(name=kotlinx.serialization, language=Kotlin)

In ProtoBuf hex notation, the output is equivalent to the following:

Field #1: 0A String Length = 21, Hex = 15, UTF8 = "kotlinx.serialization" (total 21 chars)
Field #3: 1A String Length = 6, Hex = 06, UTF8 = "Kotlin"
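
The leading byte of each field in these dumps (0A, 12, 1A) is the ProtoBuf tag, which packs the field number and the wire type into one value: the field number shifted left by three bits, combined with the wire type in the low three bits. The sketch below shows that arithmetic with standard-library Kotlin only; the function names are illustrative and it assumes the tag fits in a single byte (field numbers 1 to 15).

```kotlin
// Splits a single-byte ProtoBuf tag into (field number, wire type).
// Assumes the tag fits in one byte, which holds for field numbers 1..15.
fun decodeTag(tag: Int): Pair<Int, Int> = Pair(tag shr 3, tag and 0x07)

// Builds the tag for a field number (1..15) and a wire type (0..7).
fun encodeTag(fieldNumber: Int, wireType: Int): Int = (fieldNumber shl 3) or wireType

fun main() {
    println(decodeTag(0x0A))                 // (1, 2): field 1, length-delimited
    println(decodeTag(0x12))                 // (2, 2): field 2, length-delimited
    println("%02X".format(encodeTag(3, 2)))  // 1A: field 3, length-delimited
}
```

Changing @ProtoNumber from 2 to 3 therefore moves the second tag from 12 to 1A while the payload stays the same.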

Integer types

Protocol buffers support various integer encodings optimized for different ranges of integers. They are specified using the ProtoType annotation and the ProtoIntegerType enum. The following example shows all three supported options.

@Serializable
class Data(
    @ProtoType(ProtoIntegerType.DEFAULT)
    val a: Int,
    @ProtoType(ProtoIntegerType.SIGNED)
    val b: Int,
    @ProtoType(ProtoIntegerType.FIXED)
    val c: Int
)

fun main() {
    val data = Data(1, -2, 3) 
    println(ProtoBuf.encodeToByteArray(data).toAsciiHexString())
}

You can get the full code here.

  • The default is a varint encoding (intXX) that is optimized for small non-negative numbers. The value of 1 is encoded in one byte 01.
  • The signed option is a ZigZag encoding (sintXX) that is optimized for small signed integers. The value of -2 is encoded in one byte 03.
  • The fixed encoding (fixedXX) always uses a fixed number of bytes. The value of 3 is encoded as four bytes 03 00 00 00.

The uintXX and sfixedXX protocol buffer types are not supported.

{08}{01}{10}{03}{1D}{03}{00}{00}{00}

In ProtoBuf hex notation, the output is equivalent to the following:

Field #1: 08 Varint Value = 1, Hex = 01
Field #2: 10 Varint Value = 3, Hex = 03
Field #3: 1D Fixed32 Value = 3, Hex = 03-00-00-00
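
The three payload encodings can be reproduced by hand. The following is a sketch of the standard ProtoBuf arithmetic using only the Kotlin standard library; the function names are illustrative and are not part of the kotlinx.serialization API.

```kotlin
// ZigZag maps signed ints to non-negative ones so that small magnitudes stay small:
// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
fun zigZag32(n: Int): Int = (n shl 1) xor (n shr 31)

// Base-128 varint: seven payload bits per byte, high bit set on every byte except the last.
fun varint(value: Long): List<Int> {
    val out = mutableListOf<Int>()
    var v = value
    while ((v and 0x7FL.inv()) != 0L) {
        out += ((v and 0x7F) or 0x80).toInt()
        v = v ushr 7
    }
    out += v.toInt()
    return out
}

// Fixed32: always four bytes, least significant byte first.
fun fixed32(value: Int): List<Int> = List(4) { i -> (value ushr (8 * i)) and 0xFF }

fun main() {
    println(varint(1L))                     // [1]
    println(varint(zigZag32(-2).toLong()))  // [3]
    println(fixed32(3))                     // [3, 0, 0, 0]
}
```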

Lists as repeated fields

By default, Kotlin lists and other collections are represented as repeated fields. In protocol buffers, when a list is empty, there are no elements in the stream with the corresponding number. For Kotlin Serialization, you must explicitly specify a default of emptyList() for any property of a collection or map type. Otherwise you will not be able to deserialize an empty list, which is indistinguishable in protocol buffers from a missing field.

@Serializable
data class Data(
    val a: List<Int> = emptyList(),
    val b: List<Int> = emptyList()
)

fun main() {
    val data = Data(listOf(1, 2, 3), listOf())
    val bytes = ProtoBuf.encodeToByteArray(data)
    println(bytes.toAsciiHexString())
    println(ProtoBuf.decodeFromByteArray<Data>(bytes))
}

You can get the full code here.

{08}{01}{08}{02}{08}{03}
Data(a=[1, 2, 3], b=[])

In ProtoBuf diagnostic mode the output is equivalent to the following:

Field #1: 08 Varint Value = 1, Hex = 01
Field #1: 08 Varint Value = 2, Hex = 02
Field #1: 08 Varint Value = 3, Hex = 03

Packed fields

Collection types (not maps) can be written as packed fields when annotated with the @ProtoPacked annotation. Per the standard, packed fields can only be used on primitive numeric types. The annotation is ignored on other types.

Per the format description, the parser ignores the annotation when decoding and reads a list in either packed or repeated format.
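
To make the difference concrete, the sketch below builds the bytes for the same list in repeated and in packed layout by hand, following the ProtoBuf wire format. It uses only the Kotlin standard library, the function names are illustrative, and it is limited to values and lengths in 0..127 so that every varint fits in one byte.

```kotlin
// Repeated layout: one (tag, value) pair per element; wire type 0 = varint.
fun repeatedField(fieldNumber: Int, values: List<Int>): List<Int> {
    require(values.all { it in 0..127 }) { "sketch handles single-byte varints only" }
    return values.flatMap { listOf((fieldNumber shl 3) or 0, it) }
}

// Packed layout: a single tag with wire type 2 (length-delimited),
// then the payload length in bytes, then the values back to back.
fun packedField(fieldNumber: Int, values: List<Int>): List<Int> {
    require(values.size <= 127 && values.all { it in 0..127 }) { "sketch handles small inputs only" }
    if (values.isEmpty()) return emptyList()   // an empty list writes nothing
    return listOf((fieldNumber shl 3) or 2, values.size) + values
}

fun List<Int>.toHex(): String = joinToString(" ") { "%02X".format(it) }

fun main() {
    val values = listOf(1, 2, 3)
    println(repeatedField(1, values).toHex()) // 08 01 08 02 08 03
    println(packedField(1, values).toHex())   // 0A 03 01 02 03
}
```

The packed form pays for the tag once, which is why it is preferred for long lists of numbers.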

ProtoBuf schema generator (experimental)

As mentioned above, when working with protocol buffers you usually use a “.proto” file and a code generator for your language. This includes the code to serialize your message to an output stream and deserialize it from an input stream. When using Kotlin Serialization this step is not necessary because your @Serializable Kotlin data types are used as the source for the schema.

This is very convenient for Kotlin-to-Kotlin communication, but makes interoperability between languages complicated. Fortunately, you can use the ProtoBuf schema generator to output the “.proto” representation of your messages. You can keep your Kotlin classes as a source of truth and use traditional protoc compilers for other languages at the same time.

As an example, we can display the following data class’s “.proto” schema as follows.

@Serializable
data class SampleData(
    val amount: Long,
    val description: String?,
    val department: String = "QA"
)
fun main() {
    val descriptors = listOf(SampleData.serializer().descriptor)
    val schemas = ProtoBufSchemaGenerator.generateSchemaText(descriptors)
    println(schemas)
}

You can get the full code here.

This outputs the following.

syntax = "proto2";


// serial name 'example.exampleFormats08.SampleData'
message SampleData {
  required int64 amount = 1;
  optional string description = 2;
  // WARNING: a default value decoded when value is missing
  optional string department = 3;
}

Note that since default values are not represented in “.proto” files, a warning is generated when one appears in the schema.

See the documentation for ProtoBufSchemaGenerator for more information.

Properties (experimental)

Kotlin Serialization can serialize a class into a flat map with String keys via the Properties format implementation.

Properties support is (experimentally) available in a separate org.jetbrains.kotlinx:kotlinx-serialization-properties:<version> module.

@Serializable
class Project(val name: String, val owner: User)

@Serializable
class User(val name: String)

fun main() {
    val data = Project("kotlinx.serialization",  User("kotlin"))
    val map = Properties.encodeToMap(data)
    map.forEach { (k, v) -> println("$k = $v") }
}

You can get the full code here.

The resulting map has dot-separated keys representing keys of the nested objects.

name = kotlinx.serialization
owner.name = kotlin
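
The key naming can be illustrated without the library by flattening a nested map by hand. The sketch below is only an illustration of the dot-separated key scheme using the Kotlin standard library; the flatten function is hypothetical and says nothing about how the Properties format is implemented internally.

```kotlin
// Recursively flattens nested maps into a single map with dot-separated keys.
@Suppress("UNCHECKED_CAST")
fun flatten(source: Map<String, Any?>, prefix: String = ""): Map<String, Any?> {
    val result = linkedMapOf<String, Any?>()
    for ((key, value) in source) {
        val path = if (prefix.isEmpty()) key else "$prefix.$key"
        if (value is Map<*, *>) {
            // Nested object: recurse with the current path as the new prefix.
            result.putAll(flatten(value as Map<String, Any?>, path))
        } else {
            result[path] = value
        }
    }
    return result
}

fun main() {
    val nested = mapOf(
        "name" to "kotlinx.serialization",
        "owner" to mapOf("name" to "kotlin")
    )
    flatten(nested).forEach { (k, v) -> println("$k = $v") }
}
```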

Custom formats (experimental)

A custom format for Kotlin Serialization must provide an implementation for the Encoder and Decoder interfaces that we saw used in the Serializers chapter. These are pretty large interfaces. For convenience, the AbstractEncoder and AbstractDecoder skeleton implementations are provided to simplify the task. In AbstractEncoder, most of the encodeXxx methods have a default implementation that delegates to encodeValue(value: Any), which is the only method that must be implemented to get a basic working format.

Basic encoder

Let us start with a trivial format implementation that encodes the data into a single list of primitive constituent objects in the order they were written in the source code. To start, we implement a simple Encoder by overriding encodeValue in AbstractEncoder.

class ListEncoder : AbstractEncoder() {
    val list = mutableListOf<Any>()

    override val serializersModule: SerializersModule = EmptySerializersModule()

    override fun encodeValue(value: Any) {
        list.add(value)
    }
}

Now we write a convenience top-level function that creates an encoder that encodes an object and returns a list.

fun <T> encodeToList(serializer: SerializationStrategy<T>, value: T): List<Any> {
    val encoder = ListEncoder()
    encoder.encodeSerializableValue(serializer, value)
    return encoder.list
}

For even more convenience, we avoid passing a serializer explicitly by writing an inline overload of the encodeToList function with a reified type parameter. It uses the serializer function to retrieve the appropriate KSerializer instance for the actual type.

inline fun <reified T> encodeToList(value: T) = encodeToList(serializer(), value)

Now we can test it.

@Serializable
data class Project(val name: String, val owner: User, val votes: Int)

@Serializable
data class User(val name: String)

fun main() {
    val data = Project("kotlinx.serialization",  User("kotlin"), 9000)
    println(encodeToList(data))
}

You can get the full code here.

As a result, all the primitive values in our object graph are visited and put into a list in serial order.

[kotlinx.serialization, kotlin, 9000]

By itself, that’s a useful feature if we need to compute some kind of hashcode or digest for all the data that is contained in a serializable object tree.
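
As a sketch of that idea, the flat list produced by the encoder could be fed into a message digest. The digestOf function below is hypothetical and uses only the JDK; each value is length-prefixed so that adjacent values cannot run together, which is a design choice of this sketch rather than anything prescribed by the guide.

```kotlin
import java.security.MessageDigest

// Computes a SHA-256 digest over the string forms of the values, in order.
fun digestOf(values: List<Any>): String {
    val md = MessageDigest.getInstance("SHA-256")
    for (value in values) {
        val bytes = value.toString().toByteArray(Charsets.UTF_8)
        // Length prefix keeps ["ab", "c"] distinct from ["a", "bc"].
        md.update("${bytes.size}:".toByteArray(Charsets.UTF_8))
        md.update(bytes)
    }
    return md.digest().joinToString("") { "%02x".format(it) }
}

fun main() {
    // The list is what encodeToList returned in the example above.
    println(digestOf(listOf("kotlinx.serialization", "kotlin", 9000)))
}
```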

Basic decoder

A decoder needs to implement more functionality:

  • decodeValue: returns the next value from the list.
  • decodeElementIndex: returns the next index of a deserialized value. In this primitive format deserialization always happens in order, so we keep track of the index in the elementIndex variable. See the Hand-written composite serializer section on how it ends up being used.
  • beginStructure: returns a new instance of the ListDecoder, so that each structure that is being recursively decoded keeps track of its own elementIndex state separately.

class ListDecoder(val list: ArrayDeque<Any>) : AbstractDecoder() {
    private var elementIndex = 0

    override val serializersModule: SerializersModule = EmptySerializersModule()

    override fun decodeValue(): Any = list.removeFirst()

    override fun decodeElementIndex(descriptor: SerialDescriptor): Int {
        if (elementIndex == descriptor.elementsCount) return CompositeDecoder.DECODE_DONE
        return elementIndex++
    }

    override fun beginStructure(descriptor: SerialDescriptor): CompositeDecoder =
        ListDecoder(list)
}

A couple of convenience functions for decoding.

fun <T> decodeFromList(list: List<Any>, deserializer: DeserializationStrategy<T>): T {
    val decoder = ListDecoder(ArrayDeque(list))
    return decoder.decodeSerializableValue(deserializer)
}

inline fun <reified T> decodeFromList(list: List<Any>): T = decodeFromList(list, serializer())

That is enough to start encoding and decoding basic serializable classes.

fun main() {
    val data = Project("kotlinx.serialization",  User("kotlin"), 9000)
    val list = encodeToList(data)
    println(list)
    val obj = decodeFromList<Project>(list)
    println(obj)
}

You can get the full code here.

Now we can convert a list of primitives back to an object tree.

[kotlinx.serialization, kotlin, 9000]
Project(name=kotlinx.serialization, owner=User(name=kotlin), votes=9000)

Sequential decoding

The decoder we have implemented keeps track of the elementIndex in its state and implements decodeElementIndex. This means that it is going to work with an arbitrary serializer, even the simple one we wrote in the Hand-written composite serializer section. However, this format always stores elements in order, so this bookkeeping is not needed and undermines decoding performance. All auto-generated serializers on the JVM support the Sequential decoding protocol (experimental), and the decoder can indicate its support by returning true from the CompositeDecoder.decodeSequentially function.

class ListDecoder(val list: ArrayDeque<Any>) : AbstractDecoder() {
    private var elementIndex = 0

    override val serializersModule: SerializersModule = EmptySerializersModule()

    override fun decodeValue(): Any = list.removeFirst()

    override fun decodeElementIndex(descriptor: SerialDescriptor): Int {
        if (elementIndex == descriptor.elementsCount) return CompositeDecoder.DECODE_DONE
        return elementIndex++
    }

    override fun beginStructure(descriptor: SerialDescriptor): CompositeDecoder =
        ListDecoder(list) 

    override fun decodeSequentially(): Boolean = true
}        

You can get the full code here.

Adding collection support

This basic format, so far, cannot properly represent collections. It encodes them, but it does not keep track of how many elements there are in the collection or where it ends, so it cannot properly decode them. First, let us add proper support for collections to the encoder by implementing the Encoder.beginCollection function. The beginCollection function takes a collection size as a parameter, so we encode it to add it to the result. Our encoder implementation does not keep any state, so it just returns this from the beginCollection function.

class ListEncoder : AbstractEncoder() {
    val list = mutableListOf<Any>()

    override val serializersModule: SerializersModule = EmptySerializersModule()

    override fun encodeValue(value: Any) {
        list.add(value)
    }                               

    override fun beginCollection(descriptor: SerialDescriptor, collectionSize: Int): CompositeEncoder {
        encodeInt(collectionSize)
        return this
    }                                                
}

The decoder, in our case, only needs to implement the CompositeDecoder.decodeCollectionSize function in addition to the previous code.

The formats that store collection size in advance have to return true from decodeSequentially.

class ListDecoder(val list: ArrayDeque<Any>, var elementsCount: Int = 0) : AbstractDecoder() {
    private var elementIndex = 0

    override val serializersModule: SerializersModule = EmptySerializersModule()

    override fun decodeValue(): Any = list.removeFirst()

    override fun decodeElementIndex(descriptor: SerialDescriptor): Int {
        if (elementIndex == elementsCount) return CompositeDecoder.DECODE_DONE
        return elementIndex++
    }

    override fun beginStructure(descriptor: SerialDescriptor): CompositeDecoder =
        ListDecoder(list, descriptor.elementsCount)

    override fun decodeSequentially(): Boolean = true

    override fun decodeCollectionSize(descriptor: SerialDescriptor): Int =
        decodeInt().also { elementsCount = it }
}

That is all that is needed to support collections and maps.

@Serializable
data class Project(val name: String, val owners: List<User>, val votes: Int)

@Serializable
data class User(val name: String)

fun main() {
    val data = Project("kotlinx.serialization",  listOf(User("kotlin"), User("jetbrains")), 9000)
    val list = encodeToList(data)
    println(list)
    val obj = decodeFromList<Project>(list)
    println(obj)
}

You can get the full code here.

We see the size of the list added to the result, letting the decoder know where to stop.

[kotlinx.serialization, 2, kotlin, jetbrains, 9000]
Project(name=kotlinx.serialization, owners=[User(name=kotlin), User(name=jetbrains)], votes=9000)
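
To see why the size prefix is enough, the flat list can be read back by hand. The sketch below is a hypothetical, library-free reader for exactly this layout (name, owner count, owner names, votes); it is an illustration of the layout, not the ListDecoder itself.

```kotlin
// Hand-written reader for the flat layout [name, ownerCount, owner..., votes].
fun readProject(stream: ArrayDeque<Any>): Triple<String, List<String>, Int> {
    val name = stream.removeFirst() as String
    val ownerCount = stream.removeFirst() as Int   // the size written by beginCollection
    val owners = List(ownerCount) { stream.removeFirst() as String }
    val votes = stream.removeFirst() as Int        // first value after the collection
    return Triple(name, owners, votes)
}

fun main() {
    val stream = ArrayDeque<Any>(listOf("kotlinx.serialization", 2, "kotlin", "jetbrains", 9000))
    println(readProject(stream)) // (kotlinx.serialization, [kotlin, jetbrains], 9000)
}
```

Without the count, the reader could not tell whether jetbrains is another owner or the start of the next property.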

Adding null support

Our trivial format does not support null values so far. For nullable types we need to add some kind of “null indicator”, telling whether the upcoming value is null or not.

In the encoder implementation we override Encoder.encodeNull and Encoder.encodeNotNullMark.

    override fun encodeNull() = encodeValue("NULL")
    override fun encodeNotNullMark() = encodeValue("!!")

In the decoder implementation we override Decoder.decodeNotNullMark.

    override fun decodeNotNullMark(): Boolean = decodeString() != "NULL"

Let us test nullable properties both with not-null and null values.

@Serializable
data class Project(val name: String, val owner: User?, val votes: Int?)

@Serializable
data class User(val name: String)

fun main() {
    val data = Project("kotlinx.serialization",  User("kotlin") , null)
    val list = encodeToList(data)
    println(list)
    val obj = decodeFromList<Project>(list)
    println(obj)
}

You can get the full code here.

In the output we see how the not-null (!!) and NULL marks are used.

[kotlinx.serialization, !!, kotlin, NULL]
Project(name=kotlinx.serialization, owner=User(name=kotlin), votes=null)

Efficient binary format

Now we are ready for an example of an efficient binary format. We are going to write data to a java.io.DataOutput implementation. Instead of encodeValue, we must override the individual encodeXxx functions for each of the ten primitives in the encoder.

class DataOutputEncoder(val output: DataOutput) : AbstractEncoder() {
    override val serializersModule: SerializersModule = EmptySerializersModule()
    override fun encodeBoolean(value: Boolean) = output.writeByte(if (value) 1 else 0)
    override fun encodeByte(value: Byte) = output.writeByte(value.toInt())
    override fun encodeShort(value: Short) = output.writeShort(value.toInt())
    override fun encodeInt(value: Int) = output.writeInt(value)
    override fun encodeLong(value: Long) = output.writeLong(value)
    override fun encodeFloat(value: Float) = output.writeFloat(value)
    override fun encodeDouble(value: Double) = output.writeDouble(value)
    override fun encodeChar(value: Char) = output.writeChar(value.code)
    override fun encodeString(value: String) = output.writeUTF(value)
    override fun encodeEnum(enumDescriptor: SerialDescriptor, index: Int) = output.writeInt(index)

    override fun beginCollection(descriptor: SerialDescriptor, collectionSize: Int): CompositeEncoder {
        encodeInt(collectionSize)
        return this
    }

    override fun encodeNull() = encodeBoolean(false)
    override fun encodeNotNullMark() = encodeBoolean(true)
}

The decoder implementation mirrors the encoder’s implementation, overriding all the primitive decodeXxx functions.

class DataInputDecoder(val input: DataInput, var elementsCount: Int = 0) : AbstractDecoder() {
    private var elementIndex = 0
    override val serializersModule: SerializersModule = EmptySerializersModule()
    override fun decodeBoolean(): Boolean = input.readByte().toInt() != 0
    override fun decodeByte(): Byte = input.readByte()
    override fun decodeShort(): Short = input.readShort()
    override fun decodeInt(): Int = input.readInt()
    override fun decodeLong(): Long = input.readLong()
    override fun decodeFloat(): Float = input.readFloat()
    override fun decodeDouble(): Double = input.readDouble()
    override fun decodeChar(): Char = input.readChar()
    override fun decodeString(): String = input.readUTF()
    override fun decodeEnum(enumDescriptor: SerialDescriptor): Int = input.readInt()

    override fun decodeElementIndex(descriptor: SerialDescriptor): Int {
        if (elementIndex == elementsCount) return CompositeDecoder.DECODE_DONE
        return elementIndex++
    }

    override fun beginStructure(descriptor: SerialDescriptor): CompositeDecoder =
        DataInputDecoder(input, descriptor.elementsCount)

    override fun decodeSequentially(): Boolean = true

    override fun decodeCollectionSize(descriptor: SerialDescriptor): Int =
        decodeInt().also { elementsCount = it }

    override fun decodeNotNullMark(): Boolean = decodeBoolean()
}

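We can now serialize and deserialize arbitrary data, for example, the same classes that were used in the CBOR (experimental) and ProtoBuf (experimental) sections. The encodeTo and decodeFrom convenience functions called below are not shown in this section; they are assumed to be defined analogously to encodeToList and decodeFromList, wrapping DataOutputEncoder and DataInputDecoder.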

@Serializable
data class Project(val name: String, val language: String)

fun main() {
    val data = Project("kotlinx.serialization", "Kotlin")
    val output = ByteArrayOutputStream()
    encodeTo(DataOutputStream(output), data)
    val bytes = output.toByteArray()
    println(bytes.toAsciiHexString())
    val input = ByteArrayInputStream(bytes)
    val obj = decodeFrom<Project>(DataInputStream(input))
    println(obj)
}

You can get the full code here.

As we can see, the result is a dense binary format that only contains the data that is being serialized. It can be easily tweaked for any kind of domain-specific compact encoding.

{00}{15}kotlinx.serialization{00}{06}Kotlin
Project(name=kotlinx.serialization, language=Kotlin)
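
The {00}{15} and {00}{06} bytes in this output come from DataOutput.writeUTF, which writes a two-byte length before the string bytes. The sketch below reproduces that framing with the JDK alone; writeUtfFrames is a hypothetical helper introduced only for this illustration.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.DataOutputStream

// Writes each string with DataOutput.writeUTF and returns the resulting bytes.
fun writeUtfFrames(vararg strings: String): ByteArray {
    val buffer = ByteArrayOutputStream()
    DataOutputStream(buffer).use { out ->
        strings.forEach { out.writeUTF(it) }   // 2-byte length, then the encoded string
    }
    return buffer.toByteArray()
}

fun main() {
    val bytes = writeUtfFrames("kotlinx.serialization", "Kotlin")
    println(bytes.size)                                            // 31 = (2 + 21) + (2 + 6)
    println(bytes.take(2).joinToString(" ") { "%02X".format(it) }) // 00 15
}
```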

Format-specific types

A format implementation might provide special support for data types that are not among the list of primitive types in Kotlin Serialization, and do not have a corresponding encodeXxx/decodeXxx function. In the encoder this is achieved by overriding the encodeSerializableValue(serializer, value) function.

In our DataOutput format example we might want to provide a specialized efficient data path for serializing an array of bytes since DataOutput has a special method for this purpose.

Detection of the type is performed by looking at the serializer.descriptor, not by checking the type of the value being serialized, so we fetch the built-in KSerializer instance for the ByteArray type.

This is an important difference. This way our format implementation properly supports Custom serializers that a user might specify for a type that just happens to be internally represented as a byte array, but needs a different serial representation.

private val byteArraySerializer = serializer<ByteArray>()

Specifically for byte arrays, we could have also used the built-in ByteArraySerializer function.

We add the corresponding code to the Encoder implementation of our Efficient binary format. To make our ByteArray encoding even more efficient, we add a trivial implementation of encodeCompactSize function that uses only one byte to represent a size of up to 254 bytes.

    override fun <T> encodeSerializableValue(serializer: SerializationStrategy<T>, value: T) {
        if (serializer.descriptor == byteArraySerializer.descriptor)
            encodeByteArray(value as ByteArray)
        else
            super.encodeSerializableValue(serializer, value)
    }

    private fun encodeByteArray(bytes: ByteArray) {
        encodeCompactSize(bytes.size)
        output.write(bytes)
    }

    private fun encodeCompactSize(value: Int) {
        if (value < 0xff) {
            output.writeByte(value)
        } else {
            output.writeByte(0xff)
            output.writeInt(value)
        }
    }            

Similar code is added to the Decoder implementation. Here we override the decodeSerializableValue function.

    @Suppress("UNCHECKED_CAST")
    override fun <T> decodeSerializableValue(deserializer: DeserializationStrategy<T>, previousValue: T?): T =
        if (deserializer.descriptor == byteArraySerializer.descriptor)
            decodeByteArray() as T
        else
            super.decodeSerializableValue(deserializer, previousValue)

    private fun decodeByteArray(): ByteArray {
        val bytes = ByteArray(decodeCompactSize())
        input.readFully(bytes)
        return bytes
    }

    private fun decodeCompactSize(): Int {
        val byte = input.readByte().toInt() and 0xff
        if (byte < 0xff) return byte
        return input.readInt()
    }
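
The size encoding can also be exercised in isolation. The sketch below copies the same logic into standalone functions (writeCompactSize, readCompactSize, and encodedSize are hypothetical names for this illustration) and round-trips a few sizes through JDK streams to show where the one-byte form ends.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.io.DataInput
import java.io.DataInputStream
import java.io.DataOutput
import java.io.DataOutputStream

// Sizes 0..254 take one byte; larger sizes take the 0xFF marker plus a four-byte int.
fun writeCompactSize(output: DataOutput, value: Int) {
    if (value < 0xff) {
        output.writeByte(value)
    } else {
        output.writeByte(0xff)
        output.writeInt(value)
    }
}

fun readCompactSize(input: DataInput): Int {
    val byte = input.readByte().toInt() and 0xff   // read the first byte as unsigned
    return if (byte < 0xff) byte else input.readInt()
}

fun encodedSize(value: Int): ByteArray {
    val buffer = ByteArrayOutputStream()
    writeCompactSize(DataOutputStream(buffer), value)
    return buffer.toByteArray()
}

fun main() {
    for (size in listOf(4, 254, 255, 70_000)) {
        val bytes = encodedSize(size)
        val decoded = readCompactSize(DataInputStream(ByteArrayInputStream(bytes)))
        println("$size -> ${bytes.size} byte(s), decoded as $decoded")
    }
}
```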

Now everything is ready to perform serialization of some byte arrays.

@Serializable
data class Project(val name: String, val attachment: ByteArray)

fun main() {
    val data = Project("kotlinx.serialization", byteArrayOf(0x0A, 0x0B, 0x0C, 0x0D))
    val output = ByteArrayOutputStream()
    encodeTo(DataOutputStream(output), data)
    val bytes = output.toByteArray()
    println(bytes.toAsciiHexString())
    val input = ByteArrayInputStream(bytes)
    val obj = decodeFrom<Project>(DataInputStream(input))
    println(obj)
}

You can get the full code here.

As we can see, our custom byte array format is being used, with the compact encoding of its size in one byte.

{00}{15}kotlinx.serialization{04}{0A}{0B}{0C}{0D}
Project(name=kotlinx.serialization, attachment=[10, 11, 12, 13])

This chapter concludes the Kotlin Serialization Guide.