The CMS young GC frequency suddenly skyrocketed and the CPU was maxed out. How do I locate the cause of this?

1. Yesterday at noon, server monitoring alerted: on machine xx.xx, CPU utilization exceeded 20% for 36 consecutive checks.
2. Basic server information: a Java web project providing query, export, and other functions.
3. The monitoring information is as follows (screenshots of the young GC, heap, CPU, and full GC charts are not reproduced here), together with the machine's GC log at that time.
Because a colleague manually triggered a full GC at 14:32, the memory was reclaimed and things returned to normal.
Startup parameters at the time:
-Xms4g -Xmx4g -XX:PermSize=300m -XX:MaxPermSize=300m -XX:NewSize=2g -XX:MaxNewSize=2g -XX:SurvivorRatio=20 -XX:MaxTenuringThreshold=10 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:ParallelGCThreads=4 -XX:+CMSScavengeBeforeRemark -XX:+CMSParallelRemarkEnabled -XX:+ScavengeBeforeFullGC -XX:+UseCMSCompactAtFullCollection -XX:CMSInitiatingPermOccupancyFraction=70 -XX:+ExplicitGCInvokesConcurrent -XX:CMSFullGCsBeforeCompaction=5 -XX:+CMSClassUnloadingEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=30 -Xloggc:/tomcat7-gc-cms.log -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintReferenceGC -XX:+PrintTenuringDistribution
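A quick sanity check on these flags (my own arithmetic, not from the original post): with -XX:NewSize=2g and -XX:SurvivorRatio=20, HotSpot divides the young generation into 20 + 2 equal parts (eden plus two survivor spaces), which matches the `eden space 1906560K` / `from space 95296K` figures in the heap dumps quoted below.

```java
// Approximate the young-generation layout implied by the startup flags.
public class YoungGenLayout {

    // SurvivorRatio=N means eden is N times the size of ONE survivor space,
    // so the young gen splits into N + 2 parts (eden + from + to).
    static long survivorK(long newSizeK, int survivorRatio) {
        return newSizeK / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long newSizeK = 2L * 1024 * 1024;        // -XX:NewSize=2g, in KB
        long survK = survivorK(newSizeK, 20);    // ~95325K; the log shows 95296K
        long edenK = newSizeK - 2 * survK;       // ~1906502K; the log shows 1906560K
        // The small difference comes from HotSpot aligning the space boundaries.
        System.out.println("survivor=" + survK + "K eden=" + edenK + "K");
    }
}
```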
Analysis from tomcat7-gc-cms.log. The full GC (CMS cycle) around 14:23 is located here:
**fullgc-start**

2018-04-16T14:23:45.747+0800: 356595.145: [CMS-concurrent-mark: 1.508/1.538 secs] [Times: user=12.08 sys=0.00, real=1.54 secs] 
2018-04-16T14:23:45.748+0800: 356595.145: [CMS-concurrent-preclean-start]
2018-04-16T14:23:45.748+0800: 356595.145: [Preclean SoftReferences, 0.0005960 secs]2018-04-16T14:23:45.748+0800: 356595.145: [Preclean WeakReferences, 0.0001810 secs]2018-04-16T14:23:45.748+0800: 356595.146: [Preclean FinalReferences, 0.0000650 secs]2018-04-16T14:23:45.748+0800: 356595.146: [Preclean PhantomReferences, 0.0000160 secs]2018-04-16T14:23:45.756+0800: 356595.153: [CMS-concurrent-preclean: 0.008/0.009 secs] [Times: user=0.07 sys=0.00, real=0.01 secs] 
2018-04-16T14:23:45.756+0800: 356595.154: [CMS-concurrent-abortable-preclean-start]
{Heap before GC invocations=652 (full 1):
 par new generation   total 2001856K, used 1915155K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K, 100% used [0x00000006ed400000, 0x00000007619e0000, 0x00000007619e0000)
  from space 95296K,   9% used [0x00000007619e0000, 0x0000000762244da8, 0x00000007676f0000)
  to   space 95296K,   0% used [0x00000007676f0000, 0x00000007676f0000, 0x000000076d400000)
 concurrent mark-sweep generation total 2097152K, used 629223K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143890K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
2018-04-16T14:23:45.976+0800: 356595.373: [GC2018-04-16T14:23:45.976+0800: 356595.373: [ParNew2018-04-16T14:23:45.988+0800: 356595.385: [SoftReference, 0 refs, 0.0000510 secs]2018-04-16T14:23:45.988+0800: 356595.385: [WeakReference, 25 refs, 0.0000140 secs]2018-04-16T14:23:45.988+0800: 356595.385: [FinalReference, 45 refs, 0.0000290 secs]2018-04-16T14:23:45.988+0800: 356595.385: [PhantomReference, 0 refs, 0.0000100 secs]2018-04-16T14:23:45.988+0800: 356595.385: [JNI Weak Reference, 0.0000100 secs]
Desired survivor size 48791552 bytes, new threshold 10 (max 10)
- age   1:    3537312 bytes,    3537312 total
- age   2:    1668536 bytes,    5205848 total
- age   3:     337152 bytes,    5543000 total
- age   4:      14496 bytes,    5557496 total
- age   5:      16456 bytes,    5573952 total
- age   6:      10904 bytes,    5584856 total
- age   7:      12664 bytes,    5597520 total
- age   8:      14056 bytes,    5611576 total
- age   9:      11808 bytes,    5623384 total
- age  10:      18184 bytes,    5641568 total
: 1915155K->8115K(2001856K), 0.0122020 secs] 2544378K->637347K(4099008K), 0.0124670 secs] [Times: user=0.04 sys=0.00, real=0.01 secs] 
Heap after GC invocations=653 (full 1):
 par new generation   total 2001856K, used 8115K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K,   0% used [0x00000006ed400000, 0x00000006ed400000, 0x00000007619e0000)
  from space 95296K,   8% used [0x00000007676f0000, 0x0000000767edcca0, 0x000000076d400000)
  to   space 95296K,   0% used [0x00000007619e0000, 0x00000007619e0000, 0x00000007676f0000)
 concurrent mark-sweep generation total 2097152K, used 629232K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143890K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
}
2018-04-16T14:23:46.346+0800: 356595.743: [CMS-concurrent-abortable-preclean: 0.274/0.590 secs] [Times: user=4.61 sys=0.00, real=0.59 secs] 
2018-04-16T14:23:46.351+0800: 356595.748: [GC[YG occupancy: 1249269 K (2001856 K)]{Heap before GC invocations=653 (full 1):
 par new generation   total 2001856K, used 1249269K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K,  65% used [0x00000006ed400000, 0x00000007390109d8, 0x00000007619e0000)
  from space 95296K,   8% used [0x00000007676f0000, 0x0000000767edcca0, 0x000000076d400000)
  to   space 95296K,   0% used [0x00000007619e0000, 0x00000007619e0000, 0x00000007676f0000)
 concurrent mark-sweep generation total 2097152K, used 629232K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143890K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
2018-04-16T14:23:46.351+0800: 356595.748: [GC2018-04-16T14:23:46.351+0800: 356595.748: [ParNew2018-04-16T14:23:46.363+0800: 356595.760: [SoftReference, 0 refs, 0.0000360 secs]2018-04-16T14:23:46.363+0800: 356595.760: [WeakReference, 24 refs, 0.0000140 secs]2018-04-16T14:23:46.363+0800: 356595.760: [FinalReference, 48 refs, 0.0000200 secs]2018-04-16T14:23:46.363+0800: 356595.760: [PhantomReference, 0 refs, 0.0000100 secs]2018-04-16T14:23:46.363+0800: 356595.760: [JNI Weak Reference, 0.0000090 secs]
Desired survivor size 48791552 bytes, new threshold 10 (max 10)
- age   1:    2807512 bytes,    2807512 total
- age   2:    1240272 bytes,    4047784 total
- age   3:     487568 bytes,    4535352 total
- age   4:      86712 bytes,    4622064 total
- age   5:       8384 bytes,    4630448 total
- age   6:      12944 bytes,    4643392 total
- age   7:       9152 bytes,    4652544 total
- age   8:      10872 bytes,    4663416 total
- age   9:      11560 bytes,    4674976 total
- age  10:       8456 bytes,    4683432 total
: 1249269K->5485K(2001856K), 0.0116790 secs] 1878501K->634726K(4099008K), 0.0119130 secs] [Times: user=0.04 sys=0.00, real=0.02 secs] 
Heap after GC invocations=654 (full 1):
 par new generation   total 2001856K, used 5485K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K,   0% used [0x00000006ed400000, 0x00000006ed400000, 0x00000007619e0000)
  from space 95296K,   5% used [0x00000007619e0000, 0x0000000761f3b450, 0x00000007676f0000)
  to   space 95296K,   0% used [0x00000007676f0000, 0x00000007676f0000, 0x000000076d400000)
 concurrent mark-sweep generation total 2097152K, used 629241K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143890K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
}
2018-04-16T14:23:46.363+0800: 356595.760: [Rescan (parallel) , 0.0037940 secs]2018-04-16T14:23:46.367+0800: 356595.764: [weak refs processing2018-04-16T14:23:46.367+0800: 356595.764: [SoftReference, 14978 refs, 0.0015960 secs]2018-04-16T14:23:46.368+0800: 356595.766: [WeakReference, 449 refs, 0.0000620 secs]2018-04-16T14:23:46.369+0800: 356595.766: [FinalReference, 1472 refs, 0.0051820 secs]2018-04-16T14:23:46.374+0800: 356595.771: [PhantomReference, 1 refs, 0.0000130 secs]2018-04-16T14:23:46.374+0800: 356595.771: [JNI Weak Reference, 0.0000100 secs], 0.0069140 secs]2018-04-16T14:23:46.374+0800: 356595.771: [class unloading, 0.0269340 secs]2018-04-16T14:23:46.401+0800: 356595.798: [scrub symbol table, 0.0164990 secs]2018-04-16T14:23:46.417+0800: 356595.815: [scrub string table, 0.0036960 secs] [1 CMS-remark: 629241K(2097152K)] 634726K(4099008K), 0.0799200 secs] [Times: user=0.12 sys=0.00, real=0.08 secs] 
2018-04-16T14:23:46.431+0800: 356595.828: [CMS-concurrent-sweep-start]
2018-04-16T14:23:46.802+0800: 356596.199: [CMS-concurrent-sweep: 0.363/0.371 secs] [Times: user=2.89 sys=0.00, real=0.37 secs] 
2018-04-16T14:23:46.802+0800: 356596.199: [CMS-concurrent-reset-start]
2018-04-16T14:23:46.835+0800: 356596.233: [CMS-concurrent-reset: 0.033/0.033 secs] [Times: user=0.27 sys=0.00, real=0.04 secs] 
**fullgc-end**
The first young GC after the full GC:
{Heap before GC invocations=654 (full 1):
 par new generation   total 2001856K, used 1912541K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K, 100% used [0x00000006ed400000, 0x00000007619e0000, 0x00000007619e0000)
  from space 95296K,   6% used [0x00000007619e0000, 0x0000000761fb7548, 0x00000007676f0000)
  to   space 95296K,   0% used [0x00000007676f0000, 0x00000007676f0000, 0x000000076d400000)
 concurrent mark-sweep generation total 2097152K, used 259145K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143429K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
2018-04-16T14:23:47.037+0800: 356596.434: [GC2018-04-16T14:23:47.037+0800: 356596.434: [ParNew2018-04-16T14:23:47.049+0800: 356596.446: [SoftReference, 0 refs, 0.0000370 secs]2018-04-16T14:23:47.049+0800: 356596.446: [WeakReference, 26 refs, 0.0000130 secs]2018-04-16T14:23:47.049+0800: 356596.446: [FinalReference, 43 refs, 0.0000470 secs]2018-04-16T14:23:47.049+0800: 356596.446: [PhantomReference, 0 refs, 0.0000090 secs]2018-04-16T14:23:47.049+0800: 356596.446: [JNI Weak Reference, 0.0000100 secs]
Desired survivor size 48791552 bytes, new threshold 10 (max 10)
- age   1:    4083168 bytes,    4083168 total
- age   2:    1029504 bytes,    5112672 total
- age   3:     362720 bytes,    5475392 total
- age   4:      28200 bytes,    5503592 total
- age   5:      56616 bytes,    5560208 total
- age   6:       8384 bytes,    5568592 total
- age   7:      12944 bytes,    5581536 total
- age   8:       7328 bytes,    5588864 total
- age   9:       8472 bytes,    5597336 total
- age  10:      11560 bytes,    5608896 total
: 1912541K->8400K(2001856K), 0.0122020 secs] **2171686K->267554K**(4099008K), 0.0124480 secs] [Times: user=0.05 sys=0.00, real=0.01 secs] 
Heap after GC invocations=655 (full 1):
 par new generation   total 2001856K, used 8400K [0x00000006ed400000, 0x000000076d400000, 0x000000076d400000)
  eden space 1906560K,   0% used [0x00000006ed400000, 0x00000006ed400000, 0x00000007619e0000)
  from space 95296K,   8% used [0x00000007676f0000, 0x0000000767f24158, 0x000000076d400000)
  to   space 95296K,   0% used [0x00000007619e0000, 0x00000007619e0000, 0x00000007676f0000)
 concurrent mark-sweep generation total 2097152K, used 259153K [0x000000076d400000, 0x00000007ed400000, 0x00000007ed400000)
 concurrent-mark-sweep perm gen total 307200K, used 143429K [0x00000007ed400000, 0x0000000800000000, 0x0000000800000000)
}

The old generation has indeed come down: 2171686K -> 267554K.
But why is the young generation being collected so frequently?
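One way to quantify "frequently" from the quoted log excerpt (a rough calculation of my own, using the two quoted timestamps): eden was emptied by the scavenge at 356595.760 and was 100% full again (1906560K) by 356596.434, so the application was allocating on the order of gigabytes per second during the spike.

```java
// Estimate the allocation rate between two consecutive ParNew collections.
public class AllocRate {

    // MB allocated per second between two GC timestamps (seconds since JVM start).
    static double allocRateMBperSec(long allocatedK, double startSec, double endSec) {
        return (allocatedK / 1024.0) / (endSec - startSec);
    }

    public static void main(String[] args) {
        // Eden capacity from the log, and the two timestamps between which it refilled.
        double rate = allocRateMBperSec(1906560L, 356595.760, 356596.434);
        System.out.printf("~%.0f MB/s%n", rate);  // on the order of 2.7 GB/s
    }
}
```

At that allocation rate a 1.8 GB eden fills in well under a second, which by itself explains the young GC frequency.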
Complete gclog link: https://pan.baidu.com/s/1jz5I. password: ddrz

For the time being, my own plan is to expand the young generation to 3G.
Has anyone encountered this kind of problem before?

Mar.11,2021

1. Judging from the curves in the first and third graphs, the high CPU is the result of the frequent young GCs.
2. Usually the benefit of tuning SurvivorRatio is not that noticeable; the overall gain is smaller than adjusting the size of the young generation itself. From the GC log, the old generation only uses a bit more than 600 MB at its peak, so you can try enlarging the young generation and see whether the problem persists.
3. The frequent young GCs are related to the nature of your service, which "provides query, export and other functions". For the export function, look at the implementation mechanism. If it loads everything into memory at once, computes, and only writes to the response at the very end, then that memory stays occupied for the whole request; if several similar export requests arrive at the same time, the pressure on young-generation collection becomes heavy. Judging from the peak usage of the old generation there is no memory leak: these objects are basically reclaimed after a few young GCs. So the young generation keeps producing objects, keeps reclaiming part of them, and keeps promoting some to the old generation. The frequent young GC is mainly because garbage collection cannot keep up with the speed at which new objects are created.
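To illustrate point 3 with a hypothetical sketch (the post does not show the actual export code; `exportAllAtOnce` and `exportStreaming` are names of my own): a one-shot export keeps every row reachable until the final write, while a streaming export lets each row become garbage immediately, which bounds the pressure concurrent exports put on eden.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.List;

public class StreamingExport {

    // One-shot: the whole CSV is built in memory, so every row stays
    // reachable until the end of the request.
    static String exportAllAtOnce(List<String[]> rows) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            sb.append(String.join(",", row)).append('\n');
        }
        return sb.toString();
    }

    // Streaming: each row is formatted, written, and immediately becomes
    // unreachable, so young-generation pressure stays bounded.
    static void exportStreaming(List<String[]> rows, Writer out) throws IOException {
        for (String[] row : rows) {
            out.write(String.join(",", row));
            out.write('\n');
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        List<String[]> rows = List.of(new String[]{"id", "name"}, new String[]{"1", "foo"});
        StringWriter sw = new StringWriter();  // stands in for the servlet response writer
        exportStreaming(rows, sw);
        System.out.print(sw);
    }
}
```

In a real servlet you would pass `response.getWriter()` instead of the `StringWriter`, ideally fetching rows from the database with a cursor rather than a fully materialized list.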


It is normal for the young generation to be collected frequently. Look at this parameter:
-XX:SurvivorRatio=20
Isn't the survivor space a little small?


The survivor spaces are deliberately set small so that objects that cannot be reclaimed are promoted into the old generation sooner.
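A small cross-check of that trade-off against the log (my own arithmetic, not from the post): the "Desired survivor size 48791552 bytes" lines are exactly one survivor space (95296K) times the default -XX:TargetSurvivorRatio of 50, and the tenuring tables quoted above top out around 5.6 MB per collection, far below that limit; so in this log the small survivor spaces were not actually overflowing.

```java
// Verify that the log's "Desired survivor size 48791552 bytes" is one
// survivor space times the default -XX:TargetSurvivorRatio=50 (percent).
public class SurvivorCheck {

    static long desiredSurvivorBytes(long survivorK, int targetSurvivorRatioPct) {
        return survivorK * 1024L * targetSurvivorRatioPct / 100L;
    }

    public static void main(String[] args) {
        System.out.println(desiredSurvivorBytes(95296L, 50)); // prints 48791552
    }
}
```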


It is almost impossible to find the cause just by looking at the memory log. You have to combine the access log and the business log to narrow down which piece of business logic the problem may be in, and then troubleshoot in the code.
