Consuming Kafka data and then querying the database: how to deal with this scenario?

Here is the scenario: a consumer subscribes to a Kafka topic to get messages in real time, uses each message's content to query the database for the fields needed to build a document, and then processes the constructed document.

The big problem is that consumption throughput can't keep up with the incoming messages. Is there a good solution for this scenario?


Is the consumer currently using multithreading? You can also deploy multiple instances. Each instance is a process, that is, a consumer, and each consumer reuses a thread pool to consume messages.


Run multiple consumers asynchronously, and have each consumer use a thread pool to process messages asynchronously.
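The thread-pool approach suggested above might look roughly like the sketch below, using kafka-python. It assumes each message value is JSON with an "id" field, and that `lookup` (the blocking database query) and `sink` (whatever handles the finished document) are supplied by you; all of those names are hypothetical and not from the original question.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def build_document(message, lookup):
    """Merge one message with the extra fields fetched from the database."""
    extra = lookup(message["id"])  # blocking database call (hypothetical helper)
    return {**message, **extra}


def process_batch(messages, lookup, pool):
    """Build documents for a whole batch in parallel; input order is kept."""
    return list(pool.map(lambda m: build_document(m, lookup), messages))


def run(topic, servers, group_id, lookup, sink, workers=16):
    """Poll Kafka, enrich each batch in the thread pool, then commit offsets."""
    from kafka import KafkaConsumer  # kafka-python, imported only when run() is called

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=servers,
        group_id=group_id,            # instances sharing a group_id split the partitions
        enable_auto_commit=False,     # commit manually, after the batch is handled
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            batches = consumer.poll(timeout_ms=1000, max_records=500)
            for records in batches.values():
                for doc in process_batch([r.value for r in records], lookup, pool):
                    sink(doc)
            if batches:
                consumer.commit()     # at-least-once: offsets move only after processing
```

Starting several copies of this process with the same `group_id` gives the multi-instance deployment. Note that Kafka assigns each partition to only one consumer in a group, so instances beyond the topic's partition count stay idle.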


I also ran into low consumption throughput when consuming Kafka with Python. I use kafka-python with multiprocessing, but every process in this scheme is still blocked by synchronous calls. As I understand it, aiokafka can be used instead. One more thing to share:
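For reference, an aiokafka-based consumer could be sketched as follows. This is only an illustration: `lookup` and `sink` are hypothetical coroutines (an async database query and an async document handler), and `max_in_flight` is an assumed cap on concurrent lookups, not something from the original post.

```python
import asyncio


async def handle(value, lookup, sink, limit):
    """Enrich one message; `limit` caps how many lookups run at once."""
    async with limit:
        extra = await lookup(value)   # non-blocking database call (hypothetical)
    await sink({"value": value, **extra})


async def consume(topic, servers, group_id, lookup, sink, max_in_flight=100):
    """Fetch batches with aiokafka and process each batch concurrently."""
    from aiokafka import AIOKafkaConsumer  # imported only when consume() is called

    consumer = AIOKafkaConsumer(topic, bootstrap_servers=servers, group_id=group_id)
    limit = asyncio.Semaphore(max_in_flight)
    await consumer.start()
    try:
        while True:
            batches = await consumer.getmany(timeout_ms=1000, max_records=500)
            tasks = [
                handle(record.value, lookup, sink, limit)
                for records in batches.values()
                for record in records
            ]
            await asyncio.gather(*tasks)  # finish the batch before fetching more
    finally:
        await consumer.stop()
```

Because the event loop switches to another message whenever a lookup is waiting on the database, a single process can keep many queries in flight without one thread per message.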

There are two ways to avoid blocking calls:
1. Run each blocking call in a separate thread.
2. Convert each blocking operation to a non-blocking asynchronous call.
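The two options can be compared side by side in a small sketch. Here `blocking_lookup` stands in for a synchronous database query and `lookup_async` for a natively asynchronous one; both are placeholders that only sleep, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def blocking_lookup(key):
    """Placeholder for a synchronous database query."""
    time.sleep(0.05)
    return {"key": key}


async def lookup_in_thread(key):
    """Option 1: keep the blocking call, but run it in a worker thread."""
    return await asyncio.to_thread(blocking_lookup, key)


async def lookup_async(key):
    """Option 2: a natively non-blocking call (placeholder for an async driver)."""
    await asyncio.sleep(0.05)
    return {"key": key}


async def fetch_all(keys, lookup):
    """Run all lookups concurrently; results come back in input order."""
    return await asyncio.gather(*(lookup(k) for k in keys))
```

Calling `asyncio.run(fetch_all(keys, lookup_in_thread))` or `asyncio.run(fetch_all(keys, lookup_async))` returns the same results. Option 1 is the smaller change when the database driver is synchronous; option 2 avoids the thread overhead but needs an asynchronous driver.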
