Roadmap Phases

1 · Phrase completion

Our first milestone is to deliver on the core premise behind our latency innovation: that LLMs, like native speakers, generally know what is about to be said before each word is uttered. This is a significant capability in its own right, even before applying it to translation.
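To make the premise concrete, here is a deliberately toy sketch of phrase completion: a bigram counter that greedily extends a phrase with the most frequent next word seen in its training text. All names here are illustrative; a real LLM does this with learned token probabilities over a vast vocabulary, not raw counts.

```python
from collections import defaultdict

def train_bigrams(corpus: str):
    """Count which word follows which in the training text."""
    counts = defaultdict(lambda: defaultdict(int))
    words = corpus.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def complete(counts, start: str, max_words: int = 5):
    """Greedily extend `start` with the most frequent follower each step."""
    out = [start]
    for _ in range(max_words):
        followers = counts.get(out[-1])
        if not followers:
            break  # no continuation known for this word
        out.append(max(followers, key=followers.get))
    return " ".join(out)

model = train_bigrams("to be or not to be that is the question")
print(complete(model, "to"))  # → "to be or not to be"
```

Even this crude model "knows what comes next" for highly predictable phrases, which is the property the roadmap builds on.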

2 · Real-time text translation

The next step in our product development is a streaming service that translates text between languages, using a novel confidence measure to dynamically adjust the degree of prediction, optimising for accuracy anywhere on the spectrum from completely predictable quotations to lists of random words. Think of translating subtitles into any language with minimal lag.
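The confidence measure itself is proprietary, but the gating pattern it enables can be sketched as follows. All names and the scores are placeholders, not BabelBit's actual method: predicted tokens are emitted only while confidence stays above a threshold, otherwise the system waits for more source text.

```python
def gated_predict(candidates, threshold=0.8):
    """Emit predicted tokens only while confidence stays above threshold.

    `candidates` is a list of (token, confidence) pairs from some
    predictive model; the gate stops at the first low-confidence token,
    trading speculation for accuracy.
    """
    emitted = []
    for token, confidence in candidates:
        if confidence < threshold:
            break  # fall back to waiting for more source input
        emitted.append(token)
    return emitted

# Highly predictable text lets many tokens through early...
print(gated_predict([("to", 0.99), ("be", 0.97), ("or", 0.95)]))
# ...while a list of random words emits nothing and simply waits.
print(gated_predict([("aardvark", 0.02), ("kettle", 0.01)]))
```

Raising the threshold slides the service toward the accurate-but-slower end of the spectrum; lowering it favours latency.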

3 · Low-latency audio ingest; text output

This is well-trodden ground for our Chief Scientist, who has been working on low-latency speech processing models since 2020. His team at Neurence succeeded in reducing the latency of an end-to-end morphing process (including vocoding) to 50 ms. This will further speed up our ability to offer translated subtitles for live speech, e.g. in meetings.
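A minimal sketch of what "low latency" means operationally: process audio in small chunks and check each chunk's processing time against a 50 ms budget (the figure cited above). The chunk size and the stand-in processing function are assumptions for illustration only.

```python
import time

BUDGET_S = 0.050  # 50 ms end-to-end budget, per the text above

def process_chunk(chunk: bytes) -> str:
    """Placeholder for a real speech-to-text model call."""
    return f"<{len(chunk)} bytes transcribed>"

def stream(chunks):
    """Yield (text, within_budget) for each incoming audio chunk."""
    for chunk in chunks:
        start = time.perf_counter()
        text = process_chunk(chunk)
        elapsed = time.perf_counter() - start
        yield text, elapsed <= BUDGET_S

# Three hypothetical 20 ms frames of 8 kHz 16-bit mono audio.
results = list(stream([b"\x00" * 320] * 3))
print(all(ok for _, ok in results))
```

In a real pipeline the budget check would drive backpressure or model-size fallback rather than a simple boolean.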

4 · Speech-to-speech

The final stage is to output speech. The biggest latency gain is likely to come from a true multi-modal, single-shot transformation, but there are further areas where speech-mode LLMs outperform legacy models. For example, LLMs can already handle the divergence in how emotions are expressed across languages, without needing explicit supervised training.

© BabelBit 2025