Friday, May 21, 2010

GPU and Speech Processing

We have been planning this for ages: utilizing GPU in nexiwave's speech processing. Now, we finally did it. Before the official release to our production decoding environment, our test environment shows dramatic speed improvement with GPU. A few things that are very worth noting:
  • do follow best practices from nividia:0
  • Critical best practices:
  • find those loops
  • memory coalesce
The first is quite obvious. The second is subtle. Anyway, our speed data as follows:

with GPU:
[java] Score 2134 0.0000s 0.0000s 0.0570s 0.0084s 17.9790s
[java] Grow 11238 0.0000s 0.0000s 0.1950s 0.0040s 45.2970s
[java] Scoring-Cuda 2124 0.0010s 0.0000s 0.0020s 0.0010s 2.1580s

without GPU (64 bit):
[java] Score 2134 0.0000s 0.0000s 0.4000s 0.0488s 104.1230s
[java] Grow 11238 0.0000s 0.0000s 0.1980s 0.0032s 36.0170s

without GPU (32 bit):
[java] Score 9607 0.0000s 0.0000s 0.2150s 0.0343s 329.9720s
[java] Prune 57599 0.0000s 0.0000s 0.0910s 0.0005s 30.7650s
[java] Grow 57642 0.0000s 0.0000s 0.7900s 0.0059s 339.4380s



Quite amazing, isn't it? Acoustic scoring time reduced from 104s -> 17s. Also, note the GPU only took 2s.

edit: If you must know how much was shipped to GPU: that's 15 million loops per 0.1s of audio

0 comments:

Post a Comment