Ben's Tech Blog @ nexiwave.com: speech recognition

We have been planning this for ages: using the GPU in nexiwave's speech processing. Now we have finally done it. It has not yet been officially released to our production decoding environment, but our test environment already shows a dramatic speed improvement with the GPU. A few things are well worth noting:

Do follow the best practices from NVIDIA.
The critical ones:
find those loops
coalesce memory accesses
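
To make the coalescing point concrete, here is a minimal sketch in Python. It is not our CUDA code, only a toy model under stated assumptions: a warp of 32 threads, each reading one 4-byte float, with memory fetched in aligned 128-byte segments. The function name `segments_touched` and the 39-float row length are made up for illustration.

```python
# Simplified model of GPU memory coalescing (illustration only, not CUDA code).
# Assumptions: a warp is 32 threads, each reads one 4-byte float, and the
# hardware fetches memory in aligned 128-byte segments.

WARP_SIZE = 32
FLOAT_BYTES = 4
SEGMENT_BYTES = 128


def segments_touched(stride_in_floats, base_index=0):
    """Count the distinct 128-byte segments one warp reads.

    Thread t reads the float at index base_index + t * stride_in_floats.
    Fewer segments means fewer memory transactions for the same work.
    """
    segments = set()
    for thread in range(WARP_SIZE):
        byte_address = (base_index + thread * stride_in_floats) * FLOAT_BYTES
        segments.add(byte_address // SEGMENT_BYTES)
    return len(segments)


# Neighbouring threads read neighbouring floats: the warp fits in one segment.
coalesced = segments_touched(stride_in_floats=1)

# Each thread walks its own row of a hypothetical 39-float feature vector,
# so neighbouring threads are 39 floats apart in memory.
strided = segments_touched(stride_in_floats=39)

print(coalesced, strided)  # prints: 1 32
```

In this model the same 32 reads cost one segment fetch when neighbouring threads read neighbouring floats, and 32 fetches when each thread strides through its own row. The loop body is identical in both cases; only the data layout differs.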

The first is quite obvious. The second is subtle. Anyway, here is our timing data (the last column of each row is the total time):


with GPU:
     [java] Score           2134    0.0000s   0.0000s   0.0570s   0.0084s   17.9790s
     [java] Grow            11238   0.0000s   0.0000s   0.1950s   0.0040s   45.2970s
     [java] Scoring-Cuda    2124    0.0010s   0.0000s   0.0020s   0.0010s   2.1580s

without GPU (64 bit):
     [java] Score           2134    0.0000s   0.0000s   0.4000s   0.0488s   104.1230s
     [java] Grow            11238   0.0000s   0.0000s   0.1980s   0.0032s   36.0170s

without GPU (32 bit):
     [java] Score                9607    0.0000s   0.0000s   0.2150s   0.0343s   329.9720s
     [java] Prune                57599   0.0000s   0.0000s   0.0910s   0.0005s   30.7650s
     [java] Grow                 57642   0.0000s   0.0000s   0.7900s   0.0059s   339.4380s

Quite amazing, isn't it? In the 64-bit runs, acoustic scoring time (Score) dropped from 104s to about 18s, roughly a 5.8x speedup. Also note that the GPU part itself (Scoring-Cuda) took only about 2s.
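
For anyone who wants to check the arithmetic, here is a small sketch using the totals from the 64-bit rows. The variable names are made up for illustration, and the share calculation assumes the Scoring-Cuda time is counted inside Score, which the timing output does not state.

```python
# Total times (last column) copied from the timing rows, in seconds.
score_cpu_64bit = 104.1230   # Score, without GPU (64 bit)
score_with_gpu = 17.9790     # Score, with GPU
scoring_cuda = 2.1580        # Scoring-Cuda, with GPU

speedup = score_cpu_64bit / score_with_gpu

# Assumption: Scoring-Cuda is nested inside Score; the post does not say so.
gpu_fraction = scoring_cuda / score_with_gpu

print(f"Score speedup: {speedup:.1f}x")                    # Score speedup: 5.8x
print(f"Scoring-Cuda share of Score: {gpu_fraction:.0%}")  # 12%
```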

Edit: if you must know how much work was shipped to the GPU, that's 15 million loop iterations per 0.1s of audio.

Friday, May 21, 2010

GPU and Speech Processing
