Are logs out of order after filebeat outputs them directly to es?

I mainly want to use filebeat to output logs directly to es, without logstash, for log collection.

problem: after filebeat ships the logs to es, the entries appear out of order when viewing them, which makes it very hard for developers to read the logs.
for example:

        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 871 DEBUG xxx
        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 871 DEBUG 
        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 872 INFO 
        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 869 INFO 
        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 871 DEBUG
        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 871 INFO 
        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 869 INFO  
        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 869 DEBUG 
        April 12th 2018, 15:44:29.443   2018-04-12 15:44:25 870 DEBUG

the millisecond parts above are jumbled; the entries were not indexed into es in their original order.

could anyone familiar with filebeat help answer this?

Thank you!

Mar.02,2021

it depends on whether your logs are collected from one machine or from several, and whether filebeat and es run on the same machine or on different ones.

as I understand it, if the publish_async option (https://www.elastic.co/guide/en/beats/filebeat/1.1/configuration-filebeat-options.html#_publish_async) is not enabled in the filebeat configuration, then a single filebeat instance always publishes events in the order they appear in the file.
but if you deploy multiple filebeat instances on multiple servers, the system clocks may not agree exactly (so the recorded log times can differ between servers), and the events travel over the network, so there is no guarantee that es receives the data in the order indicated by the logs.
even with a single filebeat instance, if it runs on a different server from es, that only removes the clock-skew problem; network issues can still cause later logs to reach es first.

if log ordering matters, it is best to parse the time written in the log line itself into the es @timestamp field, for example through logstash.
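
Since the original goal was to go from filebeat straight to es without logstash, an es ingest pipeline with grok and date processors can do the same parsing (this needs es 5.0 or later, plus a filebeat new enough to support the pipeline setting of its elasticsearch output). A minimal sketch, assuming log lines shaped like the example above; the pipeline name, the app_time and log_level field names, and the Asia/Shanghai timezone are placeholders, not anything from the original post:

    # hypothetical pipeline; field names and timezone are placeholders
    PUT _ingest/pipeline/app-log-timestamp
    {
      "description": "copy the timestamp written by the application into @timestamp",
      "processors": [
        {
          "grok": {
            "field": "message",
            "patterns": ["^(?<app_time>%{TIMESTAMP_ISO8601} %{INT}) %{LOGLEVEL:log_level}"]
          }
        },
        {
          "date": {
            "field": "app_time",
            "formats": ["yyyy-MM-dd HH:mm:ss SSS"],
            "timezone": "Asia/Shanghai"
          }
        }
      ]
    }

Because the date processor writes to @timestamp by default, kibana would then sort by the application's own time rather than by the moment filebeat shipped the event. A logstash grok plus date filter pair achieves the same result if logstash is put back into the chain.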


I recently solved this problem. Our scenario uses the json-file log driver in a docker environment: the driver records timestamps with nanosecond precision, while the application (java, for example) logs with millisecond precision. In es, the timestamp field is a date, which means millisecond precision, so even if you pass nanosecond timestamps to es, the precision is lost because of the data type. Our scenario is even trickier, because many nanosecond-timestamped lines fall within the same millisecond: if you simply let es sort them by timestamp, the precision loss leaves those logs out of order.
solution:

  1. use the source field to identify which file or container each log line came from, so that logs from different sources are not mixed up with each other
  2. keep the nanosecond precision: when the event is sent to es, store the nanosecond timestamp separately as a string-typed field via an ingest pipeline (for the reasons above); a mapping sketch follows this list
  3. when sorting by timestamp in es shows disorder, apply a secondary sort on this nanosecond string (a query sketch appears at the end of this answer)
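
As a concrete illustration of item 2, here is a minimal sketch of the mapping this approach implies, assuming es 7.x (older versions also need a mapping type) and placeholder index and field names such as app-logs and log_time_nanos; @timestamp stays a normal millisecond-precision date, while the full nanosecond timestamp from the docker json-file driver is kept as a keyword string:

    # index and field names are placeholders
    PUT app-logs
    {
      "mappings": {
        "properties": {
          "@timestamp":     { "type": "date"    },
          "log_time_nanos": { "type": "keyword" },
          "source":         { "type": "keyword" },
          "message":        { "type": "text"    }
        }
      }
    }

Because the nanosecond timestamp has a fixed width and format (for example 2018-04-12T15:44:25.871123456Z), its lexicographic order as a keyword matches its chronological order, which is what makes the string-based tiebreak in item 3 work.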

the solution has been tested and is already running in production.
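
And for item 3, once both fields are indexed, a query that applies the secondary sort could look like the sketch below (same placeholder names as above): es sorts by the millisecond @timestamp first and breaks ties with the nanosecond string.

    # placeholder index/field names; ties on @timestamp are broken by the nanosecond string
    GET app-logs/_search
    {
      "sort": [
        { "@timestamp":     { "order": "asc" } },
        { "log_time_nanos": { "order": "asc" } }
      ]
    }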
