Linux CPU load is high, but top shows no CPU-intensive processes.

The load average on a machine with 16 CPU cores is stable at about 12.
Checking with top shows no processes with high CPU usage.
How can I find out what the problem is?
In three consecutive screenshots you can see that php-fpm and nginx take up the most CPU, but for a web server their usage is not high, only about 1%. I also sorted by CPU with P in top, and never found a matching process. The server has been rebooted, and the problem was not solved. The top screenshots were taken as root. Is it possible that the process is hidden and the server has been hacked?

System version:
Linux web02 2.6.18-407.el5xen #1 SMP Wed Nov 11 08:54:02 EST 2015 x86_64 x86_64 x86_64 GNU/Linux

[Remaining text lost in extraction. Surviving terms: perf, yum, vmstat, io, iotop, iostat, centos5.6, cpu 16, 75%us]

That's right, it is a web server. I turned off the web service, and the load did not change; it stays this high all the time. The system is really old, and the CentOS 5.5 mirrors are no longer supported, mainly because I have several other web servers and 4-5 database servers on hand running the same system. Their configuration is almost the same, but only this machine's CPU usage is always high. I can't see the process in top, so I really don't know where to start looking for the problem. Network traffic? The firewall? Or something else?


The load average also counts processes in uninterruptible sleep. Use iostat to see whether the machine is doing heavy disk I/O.
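A rough sketch of how to check this without iostat: count tasks per scheduler state and look for many in state D (uninterruptible sleep). The function name count_states is made up for illustration; it only assumes the usual ps column layout with a header line.

```shell
# count_states: read `ps -eo state,pid,comm` style output on stdin and
# print how many tasks are in each scheduler state (R, S, D, ...).
# The function name is illustrative, not a standard tool.
count_states() {
    awk 'NR > 1 { n[substr($1, 1, 1)]++ }
         END { for (s in n) print s, n[s] }' | sort
}

# Typical use on a live system (load average first, then the state counts):
#   cat /proc/loadavg
#   ps -eo state,pid,comm | count_states
```

If the D count stays near zero while the load is around 12, disk I/O is probably not what is driving the load.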


The CPU is mainly occupied in user mode. You can look at the hot functions with perf top; from the hot functions you can then infer which process or program they belong to. The perf tool is available in current mainstream distributions, but it may not be available on older versions. The asker had better post the system's version information, similar to the following:

hidtak@hidtak:~$ uname -a
Linux hidtak 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:16:15 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Intel's VTune is another option (the trial version is fine); older versions of it can also be used.


Update, May 31, 2018:
The asker's kernel (2.6.18) is too old for perf, which depends on the perf_events subsystem that was merged in kernel 2.6.31.

This kind of problem is troublesome to analyze, because it may be a bug or design flaw in low-level code, and older kernels lack some of the analysis tools, which makes it even harder.

I think the only option is the crude method of elimination, since the asker mentioned that the problem reappears after a restart. (Does it reappear immediately? After some action? After a business request?)
A series of services and applications are started during system boot. Turn them off and rule them out one by one.


In addition, in principle, on 2.6 kernels the CPU (per-core) usage statistics are read from /proc/stat, and per-process usage is read from /proc/<pid>/stat.
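As a rough illustration of how such a figure is derived, the sketch below takes two samples of the aggregate "cpu" line from /proc/stat and prints the user-mode and idle share of the ticks that elapsed between them. The function name cpu_pct is made up for this example, and it is not how top itself is implemented.

```shell
# cpu_pct: given two samples of the aggregate "cpu" line from /proc/stat,
# print the share of time spent in user mode (user + nice) and idle
# between the samples. Fields after "cpu": user nice system idle iowait
# irq softirq steal. Later fields (guest time) are already included in
# user/nice, so only the first eight counters are summed.
cpu_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        { total = 0
          for (i = 2; i <= NF && i <= 9; i++) total += $i
          user = $2 + $3; idle = $5 }
        NR == 1 { t0 = total; u0 = user; i0 = idle }
        NR == 2 { dt = total - t0
                  if (dt <= 0) { print "no elapsed ticks"; exit 1 }
                  printf "user %.1f%% idle %.1f%%\n",
                         100 * (user - u0) / dt, 100 * (idle - i0) / dt }'
}

# Live use: take two samples one second apart.
#   a=$(head -n 1 /proc/stat); sleep 1; b=$(head -n 1 /proc/stat)
#   cpu_pct "$a" "$b"
```

Because this reads the kernel's own counters, it gives a second opinion on total CPU usage that does not depend on any process being visible under /proc/<pid>.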

Then there are several situations I can think of:

1) A bug in top: does it not show everything? Or is it sorted the wrong way? The default is to sort by CPU!
This is easy to confirm: just check whether all the PIDs under /proc are displayed in top. Note that top is paged; you can use top's batch option to output to a file.

2) A bug in the kernel or low-level code.
Search Google for related cases; you may find similar bug reports.

3) The kernel was attacked and malicious code was injected. As a result, no entry for the PID is created under /proc, and top cannot read the process information.
An attacker usually doesn't just burn CPU; the malware also sends data out, otherwise it has no practical value to him. So you can monitor network activity for traces, for example by capturing packets and looking for abnormal remote addresses or abnormal send/receive behavior. Of course, it could also be a virus that simply consumes CPU.
Whether it is a virus or a Trojan, it survives a reboot, which means it has an executable on disk and is set to start automatically. That must show up in the startup configuration, so it can also be found by elimination.
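For case 1), the comparison between /proc and top could be scripted roughly as follows. This is only a sketch: the function name missing_from_top and the /tmp file names are made up, and it assumes top's batch output has the PID in the first column.

```shell
# missing_from_top: print PIDs that appear in the first list (one per line)
# but not in the second. Both arguments are file names; sorted copies are
# written next to them and removed afterwards.
missing_from_top() {
    sort -u "$1" > "$1.sorted"
    sort -u "$2" > "$2.sorted"
    comm -23 "$1.sorted" "$2.sorted"
    rm -f "$1.sorted" "$2.sorted"
}

# Live use (run as root). A few mismatches are normal because processes
# start and exit between the two snapshots.
#   ls /proc | grep -E '^[0-9]+$' > /tmp/proc_pids
#   top -b -n 1 | awk '$1 ~ /^[0-9]+$/ { print $1 }' > /tmp/top_pids
#   missing_from_top /tmp/proc_pids /tmp/top_pids
```

A PID that shows up as missing on every run, rather than just once, is the one worth looking at more closely.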

The above is for reference.


ps H -eo user,pid,ppid,tid,time,%cpu,cmd --sort=%cpu
I found this command, and in its output I saw that the threads occupying the CPU belong to a system process:
root 30849 1 30858 03:16:37 99.7 [kacpi_notify]
root 30849 1 30859 03:16:35 99.7 [kacpi_notify]
root 30849 1 30860 03:16:36 99.7 [kacpi_notify]
root 30849 1 30861 03:16:37 99.7 [kacpi_notify]
root 30849 1 30863 03:16:37 99.7 [kacpi_notify]
root 30849 1 30864 03:16:38 99.7 [kacpi_notify]
root 30849 1 30865 03:16:36 99.7 [kacpi_notify]
root 30849 1 30866 03:16:38 99.7 [kacpi_notify]
You can see this process in iotop, but it shows no I/O there, and [kacpi_notify], the ACPI notification process, cannot be seen in top. ACPI is the power management interface, so this is probably because the machine is too old; it has been in use for more than ten years.
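Since that ps output has one line per thread, it can help to add the threads up per process. A small sketch, with a made-up function name cpu_by_cmd, assuming the column order user,pid,ppid,tid,time,%cpu,cmd used in the command above:

```shell
# cpu_by_cmd: read the output of
#   ps H -eo user,pid,ppid,tid,time,%cpu,cmd
# on stdin and print the total %CPU per PID and command name (first word
# of cmd only), highest first. The header line is skipped because its
# second field is not numeric.
cpu_by_cmd() {
    awk '$2 ~ /^[0-9]+$/ { sum[$2 " " $7] += $6 }
         END { for (k in sum) printf "%.1f %s\n", sum[k], k }' | sort -rn
}

# Live use:
#   ps H -eo user,pid,ppid,tid,time,%cpu,cmd | cpu_by_cmd | head
```

For the eight threads listed above this adds up to roughly 800% for PID 30849, that is, about eight cores kept busy.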
