Showing posts with label speech recognition. Show all posts

Friday, May 21, 2010

GPU and Speech Processing

We have been planning this for ages: using the GPU in Nexiwave's speech processing. Now we have finally done it. It has not yet been released to our production decoding environment, but our test environment already shows a dramatic speed improvement with the GPU. A few things are well worth noting:
  • Do follow the best practices from NVIDIA.
  • Critical best practices:
      • find the loops worth moving to the GPU
      • coalesce memory accesses
The first is quite obvious; the second is subtle. Anyway, our timing data is as follows:

with GPU:
[java] Score 2134 0.0000s 0.0000s 0.0570s 0.0084s 17.9790s
[java] Grow 11238 0.0000s 0.0000s 0.1950s 0.0040s 45.2970s
[java] Scoring-Cuda 2124 0.0010s 0.0000s 0.0020s 0.0010s 2.1580s

without GPU (64 bit):
[java] Score 2134 0.0000s 0.0000s 0.4000s 0.0488s 104.1230s
[java] Grow 11238 0.0000s 0.0000s 0.1980s 0.0032s 36.0170s

without GPU (32 bit):
[java] Score 9607 0.0000s 0.0000s 0.2150s 0.0343s 329.9720s
[java] Prune 57599 0.0000s 0.0000s 0.0910s 0.0005s 30.7650s
[java] Grow 57642 0.0000s 0.0000s 0.7900s 0.0059s 339.4380s



Quite amazing, isn't it? In the 64-bit runs, acoustic scoring time (the Score line) dropped from about 104s to about 18s. Also note that the GPU itself (the Scoring-Cuda line) took only about 2s.

Edit: if you must know how much work was shipped to the GPU, that's 15 million loop iterations per 0.1s of audio.
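
Since memory coalescing is the subtle one, here is a minimal sketch of what it means in practice. This is not our production kernel. It assumes a simplified scoring loop (squared distance between one feature vector and each Gaussian mean, one GPU thread per Gaussian), and every name in it (score_strided, score_coalesced, max_layout_difference) is hypothetical. The only difference between the two kernels is how the means are laid out in memory: in the first, neighbouring threads read addresses `dims` floats apart; in the second, neighbouring threads read adjacent floats, which lets the hardware combine those reads into far fewer memory transactions.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <vector>

// Gaussian-major layout: mean[g * dims + d].
// Threads g and g+1 read addresses `dims` floats apart, so the accesses
// of one warp are scattered (not coalesced).
__global__ void score_strided(const float* mean, const float* x,
                              float* out, int n, int dims) {
    int g = blockIdx.x * blockDim.x + threadIdx.x;
    if (g >= n) return;
    float acc = 0.0f;
    for (int d = 0; d < dims; ++d) {
        float diff = x[d] - mean[g * dims + d];
        acc += diff * diff;
    }
    out[g] = acc;
}

// Dimension-major (transposed) layout: mean_t[d * n + g].
// Threads g and g+1 read adjacent floats, so the accesses of one warp
// fall in contiguous memory (coalesced).
__global__ void score_coalesced(const float* mean_t, const float* x,
                                float* out, int n, int dims) {
    int g = blockIdx.x * blockDim.x + threadIdx.x;
    if (g >= n) return;
    float acc = 0.0f;
    for (int d = 0; d < dims; ++d) {
        float diff = x[d] - mean_t[d * n + g];
        acc += diff * diff;
    }
    out[g] = acc;
}

// Hypothetical host helper: fills both layouts with the same made-up
// values, runs both kernels, and returns the largest difference between
// their outputs (or -1 if any CUDA call fails).
float max_layout_difference(int n, int dims) {
    std::vector<float> mean(n * dims), mean_t(n * dims), x(dims);
    std::vector<float> a(n), b(n);
    for (int d = 0; d < dims; ++d) x[d] = 0.01f * d;
    for (int g = 0; g < n; ++g) {
        for (int d = 0; d < dims; ++d) {
            float v = 0.001f * ((g * 31 + d * 7) % 101);  // arbitrary test data
            mean[g * dims + d] = v;
            mean_t[d * n + g] = v;
        }
    }

    float *d_mean = nullptr, *d_mean_t = nullptr, *d_x = nullptr;
    float *d_a = nullptr, *d_b = nullptr;
    size_t pbytes = mean.size() * sizeof(float);
    size_t obytes = n * sizeof(float);
    bool ok = cudaMalloc(&d_mean, pbytes) == cudaSuccess
           && cudaMalloc(&d_mean_t, pbytes) == cudaSuccess
           && cudaMalloc(&d_x, dims * sizeof(float)) == cudaSuccess
           && cudaMalloc(&d_a, obytes) == cudaSuccess
           && cudaMalloc(&d_b, obytes) == cudaSuccess;
    if (ok) {
        cudaMemcpy(d_mean, mean.data(), pbytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_mean_t, mean_t.data(), pbytes, cudaMemcpyHostToDevice);
        cudaMemcpy(d_x, x.data(), dims * sizeof(float), cudaMemcpyHostToDevice);
        int threads = 128;
        int blocks = (n + threads - 1) / threads;
        score_strided<<<blocks, threads>>>(d_mean, d_x, d_a, n, dims);
        score_coalesced<<<blocks, threads>>>(d_mean_t, d_x, d_b, n, dims);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(a.data(), d_a, obytes, cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(b.data(), d_b, obytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_mean); cudaFree(d_mean_t); cudaFree(d_x);
    cudaFree(d_a); cudaFree(d_b);
    if (!ok) return -1.0f;

    float worst = 0.0f;
    for (int g = 0; g < n; ++g) worst = std::fmax(worst, std::fabs(a[g] - b[g]));
    return worst;
}
```

Calling max_layout_difference(1024, 39) from a small main() checks that both kernels compute the same scores (the 1024 Gaussians and 39 feature dimensions are illustrative values, not our model sizes). To see the speed difference between the two layouts, run each kernel under a profiler; how large the gap is depends on the GPU and on the problem size.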