tomcat 中TIMED_WAITING 100 个线程,导致它在线程总数超过 200 时停止

最近,我们的一个生产tomcat服务器变得无响应,因为tomcat的繁忙线程激增至200个。当我们在重新启动之前进行线程转储时,我们得到了100个处于TIMED_WAITING状态的线程,例如以下3个线程:

""http-bio-7007"-exec-241" daemon prio=10 tid=0x00002aaab107b000 nid=0x59df waiting on condition [0x0000000051239000]
java.lang.Thread.State: TIMED_WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x0000000580d877d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
   at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:424)
   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:86)
   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:32)
   at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
   at java.lang.Thread.run(Thread.java:662)

""http-bio-7007"-exec-237" daemon prio=10 tid=0x00002aaab186e000 nid=0x596d waiting on condition [0x000000004d1f9000]
java.lang.Thread.State: TIMED_WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x0000000580d877d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
   at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:424)
   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:86)
   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:32)
   at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
   at java.lang.Thread.run(Thread.java:662)

""http-bio-7007"-exec-236" daemon prio=10 tid=0x00002aaab1118000 nid=0x596c waiting on condition [0x000000004e50c000]
java.lang.Thread.State: TIMED_WAITING (parking)
   at sun.misc.Unsafe.park(Native Method)
   - parking to wait for  <0x0000000580d877d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:198)
   at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2025)
   at java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:424)
   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:86)
   at org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:32)
   at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:945)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
   at java.lang.Thread.run(Thread.java:662)

我们有4个应用程序的线程池(例如pool-4-thread-20等),每个线程也有20个线程,所以我不确定这100个线程在哪个阻塞队列上等待?我们使用带有休眠状态的c3P0连接池,这似乎不是导致这种情况的原因。

任何想法 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject 是什么?


答案 1

当我们修复由c3p0管理的泄漏数据库连接的代码时,这个问题得到了修复。在我们的代码中,在最终块中关闭实体管理器之前,我们没有专门在 catch 块中调用 rollback() 的流,因此,如果异常连接没有返回到池,并且如果异常频率很高(超过超时间隔内的池大小),则所有其他进程线程将堆积起来以获取连接。


答案 2

任何想法 java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject 是什么?

条件对象在队列内部用于同步不同线程对队列的访问。

它是标准的堆栈跟踪,当执行程序池的线程处于空闲状态并等待新任务时。