Applying Kafka to blockchain
by Luc Russell
Blockchain technology and Apache Kafka share characteristics which suggest a natural affinity. For instance, both share the concept of an ‘immutable append only log’. In the case of a Kafka partition:
Each partition is an ordered, immutable sequence of records that is continually appended to — a structured commit log. The records in the partitions are each assigned a sequential id number called the offset that uniquely identifies each record within the partition.
Whereas a blockchain can be described as:
a continuously growing list of records, called blocks, which are linked and secured using cryptography. Each block typically contains a hash pointer as a link to a previous block, a timestamp and transaction data.
Clearly, these technologies share the parallel concepts of an immutable sequential structure, with Kafka being particularly optimized for high throughput and horizontal scalability, and blockchain excelling in guaranteeing the order and structure of a sequence.
By integrating these technologies, we can create a platform for experimenting with blockchain concepts.
Kafka provides a convenient framework for distributed peer to peer communication, with some characteristics particularly suitable for blockchain applications. While this approach may not be viable in a trustless public environment, there could be practical uses in a private or consortium network. See for further ideas on how this could be implemented.
Additionally, with some experimentation, we may be able to draw on concepts already implemented in Kafka (e.g. sharding by partition) to explore solutions to blockchain challenges in public networks (e.g. scalability problems).
The purpose of this experiment is therefore to take a simple blockchain implementation and port it to the Kafka platform; we'll take Kafka's concept of a sequential log and guarantee immutability by chaining the entries together with hashes. The blockchain topic on Kafka will become our distributed ledger. Graphically, it will look like this:
Kafka is a streaming platform designed for high-throughput, real-time messaging, i.e. it enables publication and subscription to streams of records. In this respect it is similar to a message queue or a traditional enterprise messaging system. Some of the characteristics are:
High throughput: Kafka brokers can absorb gigabytes of data per second, translating into millions of messages per second. You can read more about the scalability characteristics in .
Competing consumers: Simultaneous delivery of messages to multiple consumers, typically expensive in traditional messaging systems, is no more complex than for a single consumer. This means we can design for competing consumers, guaranteeing that each consumer will receive only one of the messages and achieving a high degree of horizontal scalability.
In Kafka, each topic is split into partitions, where each partition is a sequence of records which is continually appended to. This is similar to a text log file, where new lines are appended to the end. The entries in the partition are each assigned a sequential id, called an offset, which uniquely identifies the record.
The Kafka broker can be queried by offset, i.e. a consumer can reset its offset to some arbitrary point in the log to retrieve records from that point forward.
Full source code is available.
Some understanding of blockchain concepts: The tutorial below is based on two existing implementations, both excellent practical introductions. The following tutorial builds heavily on these concepts, while using Kafka as the message transport. In effect, we'll port a Python blockchain to Kafka, while maintaining most of the current implementation.
Docker: docker-compose is used to run the Kafka broker.
kafkacat: This is a useful tool for interacting with Kafka (e.g. publishing messages to topics).
On startup, our Kafka consumer will try to do three things: initialize a new blockchain if one has not yet been created; build an internal representation of the current state of the blockchain topic; then begin reading transactions in a loop:
The initialization step looks like this:
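The code listing for this step did not survive in this copy. As a sketch, the startup flow can be expressed with the Kafka interactions passed in as callables; the function names here are assumptions for illustration, not the article's actual code:

```python
def startup(get_latest_offset, publish_genesis, read_and_validate_chain,
            consume_transactions):
    """Startup flow: initialize if needed, rebuild chain state, then consume."""
    # 1. Initialize a new blockchain if one has not yet been created
    if get_latest_offset() is None:  # nothing ever published to the topic
        publish_genesis()
    # 2. Build an internal representation of the blockchain topic's state
    chain, latest_offset = read_and_validate_chain()
    # 3. Begin reading transactions in a loop
    consume_transactions(chain, latest_offset)
```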
First, we find the highest available offset on the blockchain topic. If nothing has ever been published to the topic, the blockchain is new, so we start by creating and publishing the genesis block:
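The original listing is missing here. A minimal sketch of genesis-block creation, assuming blocks are dicts hashed over their sorted JSON form (the field names follow the Python blockchain tutorials this article builds on, and are assumptions rather than the article's exact code):

```python
import hashlib
import json
import time

def hash_block(block):
    # Deterministic SHA-256 over the block's canonical JSON encoding
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def create_genesis_block():
    # The first block has no predecessor, so previous_hash is a fixed placeholder
    return {
        'index': 0,
        'timestamp': time.time(),
        'transactions': [],
        'previous_hash': '1',
    }
```

The genesis block would then be serialized (e.g. with json.dumps) and published to the blockchain topic.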
In read_and_validate_chain(), we'll first create a consumer to read from the blockchain topic:
Some notes on the parameters we’re creating this consumer with:
Setting the consumer group to the blockchain group allows the broker to keep a reference to the offset the consumers have reached, for a given partition and topic
auto_offset_reset=OffsetType.EARLIEST indicates that we'll begin downloading messages from the start of the topic.
auto_commit_enable=True periodically notifies the broker of the offset we've just consumed (as opposed to manually committing)
reset_offset_on_start=True is a switch which activates the auto_offset_reset for the consumer
consumer_timeout_ms=5000 will cause the consumer to return from the method if no new messages are read for five seconds (i.e. we've reached the end of the chain)
Then we begin reading block messages from the blockchain topic:
For each message we receive:
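The per-message steps were a listing in the original. As a sketch, each block message can be deserialized and accepted only if its previous_hash links to the last accepted block; the helper names and block fields here are assumptions:

```python
import hashlib
import json

def hash_block(block):
    # Deterministic SHA-256 over the block's canonical JSON encoding
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()

def accept_block(chain, message_value):
    """Append the decoded block if it chains onto the last accepted block.

    Invalid blocks are discarded; returns True if the block was accepted."""
    block = json.loads(message_value)
    if not chain or block['previous_hash'] == hash_block(chain[-1]):
        chain.append(block)
        return True
    return False
```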
At the end of this process, we’ll have downloaded the whole chain, discarding any invalid blocks, and we’ll have a reference to the offset of the latest block.
At this point, we're ready to create a consumer on the transactions topic:
Our example topic has been created with two partitions, to demonstrate how partitioning works in Kafka. The partitions are set up in the docker-compose.yml file, with this line:
KAFKA_CREATE_TOPICS=transactions:2:1,blockchain:1:1
transactions:2:1 specifies the topic name, the number of partitions, and the replication factor (i.e. how many brokers will maintain a copy of the data on this partition).
This time, our consumer will start from OffsetType.LATEST, so we only get transactions published from the current time onwards.
By pinning the consumer to a specific partition of the transactions topic, we can increase the total throughput of all consumers on the topic. The Kafka broker will evenly distribute incoming messages across the two partitions of the transactions topic, unless we specify a partition when we publish to the topic. This means each consumer will be responsible for processing 50% of the messages, doubling the potential throughput of a single consumer.
Now we can begin consuming transactions:
As transactions are received, we'll add them to an internal list. Every three transactions, we'll create a new block and call mine():
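The mine() listing itself is not reproduced in this copy. Its proof-of-work core might look like the following, using the "hash of the two proofs must start with four zeros" scheme common to the Python blockchain tutorials this port is based on; the difficulty and exact scheme are assumptions, and publishing the mined block back to the blockchain topic is omitted:

```python
import hashlib

def valid_proof(last_proof, proof, difficulty='0000'):
    # The SHA-256 of the two proofs concatenated must start with `difficulty`
    guess = f'{last_proof}{proof}'.encode()
    return hashlib.sha256(guess).hexdigest().startswith(difficulty)

def mine(last_proof):
    """Simple proof-of-work: increment the candidate proof until one is valid."""
    proof = 0
    while not valid_proof(last_proof, proof):
        proof += 1
    return proof
```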
If new blocks have already been appended, we'll make use of the read_and_validate_chain from before, this time supplying our latest known offset to retrieve only the newer blocks.
1. Start the Kafka broker:

docker-compose up -d
2. Run a consumer on partition 0:
python kafka_blockchain.py 0
3. Publish 3 transactions directly to partition 0:
4. Check the transactions were added to a block on the blockchain topic:
kafkacat -C -b kafka:9092 -t blockchain
You should see output like this:
To balance transactions across two consumers, start a second consumer on partition 1, and remove -p 0 from the publication script above.
Kafka can provide the foundation for a simple framework for blockchain experimentation. We can take advantage of features built into the platform, and associated tools like kafkacat, to experiment with distributed peer to peer transactions.
While scaling transactions in a public setting presents one set of issues, within a private network or consortium, where real-world trust is already established, transaction scaling might be achieved via an implementation which takes advantage of Kafka concepts.