Why is the file size reported by ls -l on Linux larger than the size reported by du?

Viewing the files under a Kafka topic:

[root@localhost TOPIC_QUEUE_ID-0]# ls -l
total 2932
-rw-r--r-- 1 root root 10485760 Oct 30 2017 00000000000003771019.index
-rw-r--r-- 1 root root  2985451 Oct 30 2017 00000000000003771019.log
-rw-r--r-- 1 root root 10485756 Oct 26 2017 00000000000003771019.timeindex
[root@localhost TOPIC_QUEUE_ID-0]# du -m 00000000000003771019.log
3       00000000000003771019.log
[root@localhost TOPIC_QUEUE_ID-0]# du -m 00000000000003771019.index
1       00000000000003771019.index
[root@localhost TOPIC_QUEUE_ID-0]# du -k 00000000000003771019.index
8       00000000000003771019.index
[root@localhost TOPIC_QUEUE_ID-0]# du -k 00000000000003771019.log
2920    00000000000003771019.log

The index file shows as 10 MB under ls -l, but only 8 KB under du -k. What's going on?

Mar.28,2021

Because what you are looking at are sparse files.

A virtual machine's disk image is a typical example of a sparse file. Kafka's official documentation also states that each log segment's index is created as a sparse file. For example, the following is excerpted from the official documentation:
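The effect is easy to reproduce by hand. A minimal sketch, assuming GNU coreutils; the file name is purely illustrative:

```shell
# Pre-allocate 10 MiB without writing any data blocks: this creates
# a sparse file (only metadata is stored, no blocks are allocated).
truncate -s 10M sparse_demo.index

# Apparent size, which ls -l reports: the full 10485760 bytes.
ls -l sparse_demo.index

# Allocated blocks, which du reports: essentially zero.
du -k sparse_demo.index

rm sparse_demo.index
```

ls -l shows the logical length of the file, while du counts only the disk blocks the filesystem has actually allocated, which is why the two numbers can differ so drastically.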

PROPERTY:    log.index.size.max.bytes
DEFAULT:     10 * 1024 * 1024
DESCRIPTION: The maximum size in bytes we allow for the offset index for each log segment. Note that we will always pre-allocate a sparse file with this much space and shrink it down when the log rolls. If the index fills up we will roll a new log segment even if we haven't reached the log.segment.bytes limit.

Note the wording: the offset index for each log segment is pre-allocated as a sparse file of log.index.size.max.bytes (10 MB by default), which matches the 10485760-byte apparent size seen above.
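The exact 10 MB vs. 8 KB gap from the question can be imitated: pre-allocate the index sparsely, then write a few kilobytes of real entries at the front. A hedged sketch assuming GNU coreutils; demo.index is a hypothetical stand-in for the real segment index:

```shell
# Pre-allocate 10 MiB sparsely, as Kafka does for each segment index.
truncate -s 10M demo.index

# Write ~8 KiB of real data at the start; conv=notrunc preserves the
# pre-allocated 10 MiB logical length.
dd if=/dev/urandom of=demo.index bs=1024 count=8 conv=notrunc 2>/dev/null

ls -l demo.index   # apparent size: 10485760 bytes
du -k demo.index   # allocated: roughly 8 KiB (filesystem-dependent)

rm demo.index
```

Only the 8 KiB that was actually written occupies disk blocks; the remaining ~10 MiB is a hole that the filesystem materializes as zeros on read.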
