Thursday, July 15, 2010

Java ThreadPool and Large Training

A little shocking discovery we had on the Java ThreadPoolExecutor implementation: our large training cluster would somehow never filled up with a specific kind of training task. The culprit was eventually identified in the ThreadPoolExecutor, which gracefully queued up all tasks. I always thought new threads will be created when the queue is not empty and the number of threads is less than the maxPoolSize. Obviously, I did not read the fine-print in JavaDoc. There were quite a few different strategies described in ThreadPoolExecutor: Direct Handoffs (never queue, always create new thread till reach max), Unbounded queues (always queue, and never reach max threads) and Bounded queues (queue to a certain size, then start ceating threads till max threads, then rejects).

My assumption, Direct Handoffs+Unbounded Queue (use as many threads till reach maxPoolSize, then queue up unbounded), was obviously not one of them, even though I would think this is definitely favourable strategy in most cases. Anyway, we implemented this using a customized LinkedBlockingQueue.

Lessons learned: (like dealing with the bank,) always read the fine-print;).

3 comments:

  1. Do your guys use a pure java trainer for AMs or make system calls to binaries? That may make a quite big difference on the design of the parallelization platform. However I wonder if there exists toolkits for AM training in pure java...

    ReplyDelete