On the implementation of Linux pipes

A question about Linux pipes.
Take, for example, the command cat a.log | head
If a.log is 1 TB in size, how is this command executed?

What happens to cat after head prints the first ten lines?
Where does the data that was never written to the pipe go? How does cat end, and what makes it end?

Jun.22,2021

This actually has something to do with the shell you use. Take the commonly used bash as an example:

  1. Executing the pipeline: bash creates an unnamed pipe with the pipe system call, starts child process 1 to run cat a.log and child process 2 to run head, redirects the standard output of process 1 to the write end of the pipe and the standard input of process 2 to the read end of the pipe, and then waits for all commands in the pipeline to finish before returning.
  2. After head has printed ten lines it exits, which closes the last reference to the read end of the pipe. cat does not go on to read the whole file: its next write to the pipe fails, and the kernel sends it SIGPIPE, which terminates it by default.
  3. Not all of the data is ever written to the pipe. cat can only run ahead of head by as much as the pipe buffer holds, because a write to a full pipe blocks. Whatever is still in the buffer when the last reference to the pipe is closed is discarded by the kernel, and the rest of a.log is simply never read.
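How the writer ends can be observed from the shell. The following is a minimal sketch, not a definitive test: it uses yes as a stand-in for cat on a huge file, and the status file is a hypothetical helper used only to carry the exit status of the first stage out of the pipeline.

```shell
# Temporary file that will hold the exit status of the first stage.
status_file=$(mktemp)

# Stage 1 is an endless producer; stage 2 stops after ten lines.
# When head exits, the producer's next write to the pipe fails.
{ yes "log line"; echo "$?" > "$status_file"; } | head -n 10 > /dev/null
head_status=$?

producer_status=$(cat "$status_file")
rm -f "$status_file"

echo "head exit status:     $head_status"
echo "producer exit status: $producer_status"
```

On a typical Linux system the producer status is 141, that is 128 + 13, where 13 is the signal number of SIGPIPE. If SIGPIPE is ignored in the environment, the producer instead sees a write error and exits with an ordinary non-zero status. In bash, ${PIPESTATUS[0]} reports the same value without the helper file.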

I recommend taking a look at the contents of the /proc/${pid}/ directory; it will quickly give you a clearer picture of how Linux implements pipes.

After you start the program, find a way to get its PID (for example, by opening another terminal and using ps aux | grep ), and then look in the directory for that PID to see how the pipe is wired up.
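The same information can be collected without hunting for a PID. The sketch below assumes Linux with /proc mounted and lets each stage of a pipeline inspect its own descriptors through /proc/self/fd. The temporary file is only a hypothetical side channel, so that the first stage's answer does not travel through the pipe being examined.

```shell
tmp=$(mktemp)

# Left stage: fd 3 becomes a copy of stdout (the write end of the pipe),
# then stdout itself is pointed at $tmp. readlink shows what fd 3 refers to.
# Right stage: stdin is the read end of the same pipe.
reader_end=$(readlink /proc/self/fd/3 3>&1 >"$tmp" | readlink /proc/self/fd/0)
writer_end=$(cat "$tmp")
rm -f "$tmp"

echo "stage 1 stdout -> $writer_end"
echo "stage 2 stdin  -> $reader_end"
```

Both lines should show the same pipe:[inode] entry, which is how you can tell that the two processes are attached to the same pipe. For a pipeline that is already running, ls -l /proc/${pid}/fd gives the same view for each PID.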

The design of the pipe is the most basic form of stream processing. If you look at the stream processing frameworks that are popular today, they basically borrow from the design of the pipe.

A Linux pipe is implemented as a buffer in the kernel. It behaves much like a real pipe: it is FIFO (first in, first out) and has a limited capacity. While the buffer is not full, reads and writes do not interfere with each other, but once the buffer is full a write blocks until a reader has freed some space. The size of the buffer depends on the system implementation: it differs between systems, and may even change dynamically on the same system.

Under bash, you can test the buffer size with the following script:

M=0; while printf A; do >&2 printf "\r$((++M)) B"; done | sleep 999
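A variant that finishes on its own may be more convenient in a script. This is only a sketch under a few assumptions: the 1 KiB block size, the 2 second wait and the count file are arbitrary choices, and the result is a lower bound rounded down to whole blocks.

```shell
count_file=$(mktemp)
echo 0 > "$count_file"

{
  n=0
  # Each iteration writes one 1 KiB block of spaces into the pipe.
  # Nobody reads from the pipe, so the write blocks once the buffer is full.
  while printf '%1024s' ''; do
    n=$((n + 1))
    echo "$n" > "$count_file"
  done
} 2>/dev/null | sleep 2

# When sleep exits, the blocked writer is terminated (or sees a write error).
blocks=$(cat "$count_file")
rm -f "$count_file"

echo "the pipe accepted $blocks KiB before the writer blocked"
```

On Linux with the default pipe capacity this typically reports 64 KiB, but as noted above the value depends on the system.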