Product Gap

Babelfish

Impact: Huge
Project Size: Large

What

Real-time translation infrastructure for multi-language deliberative processes that captures all audio and written inputs (microphone, documents, handwritten notes) and provides instantaneous translation to each participant’s selected language through on-screen display or in-ear devices.
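One way to picture the per-participant delivery described above is as a fan-out step: each captured utterance is translated once per distinct target language and then routed to everyone who selected that language. The sketch below is illustrative only. The function name `fan_out`, the participant IDs, and the `translate` callable are all assumptions, standing in for whatever translation backend is eventually chosen.

```python
from collections import defaultdict
from typing import Callable, Dict


def fan_out(
    text: str,
    source_lang: str,
    participant_langs: Dict[str, str],
    translate: Callable[[str, str, str], str],
) -> Dict[str, str]:
    """Return the text each participant should see, keyed by participant ID.

    `translate(text, source_lang, target_lang)` is a hypothetical backend hook.
    """
    # Group participants by selected language so each language is translated once.
    by_lang = defaultdict(list)
    for participant, lang in participant_langs.items():
        by_lang[lang].append(participant)

    delivered = {}
    for lang, participants in by_lang.items():
        # Participants who share the speaker's language get the original text.
        rendered = text if lang == source_lang else translate(text, source_lang, lang)
        for participant in participants:
            delivered[participant] = rendered
    return delivered
```

Grouping by language first means translation cost grows with the number of languages in the room rather than the number of participants, which matters for the per-participant cost target.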

Why

Enable language access across diverse geographies, reduce human translator costs, increase representation and scale, improve deliberation quality by allowing participants to speak in their native language, and enhance equity by removing language barriers.

Problem Definition

There is growing interest in, and need for, deliberations that span multiple languages, so that people across wider geographies can take part without relying on a common exchange language such as English or French. However, reliable live translation is extremely resource intensive and normally requires each participant (or small cluster of participants) to have their own human translator. This creates access and scalability problems.

Definition of Success

Capture all important inputs (voice and written) and translate them into each participant’s selected language in under 3 seconds with at least 90% accuracy, at a cost below $500 per participant. Save roughly $50k-100k on human translation services, plus $50k on process time currently lost to translation-friendly group formats and related downtime.
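The three quantitative targets (latency, accuracy, cost per participant) can be checked mechanically against pilot data. The following is a minimal sketch, assuming a hypothetical `PilotResult` record. The field names, and the choice to judge latency by the slowest segment rather than an average, are assumptions and not part of the brief.

```python
from dataclasses import dataclass
from typing import Dict, List

# Thresholds taken from the Definition of Success.
MAX_LATENCY_S = 3.0
MIN_ACCURACY = 0.90
MAX_COST_PER_PARTICIPANT_USD = 500.0


@dataclass
class PilotResult:
    """Hypothetical summary of one pilot session (field names are assumptions)."""
    latencies_s: List[float]  # capture-to-display delay for each translated segment
    accuracy: float           # fraction of segments judged correct, 0.0 to 1.0
    total_cost_usd: float
    participants: int


def meets_success_criteria(result: PilotResult) -> Dict[str, bool]:
    """Report pass/fail for each target separately."""
    # Assumption: the latency target applies to the slowest segment.
    worst_latency = max(result.latencies_s)
    cost_per_participant = result.total_cost_usd / result.participants
    return {
        "latency": worst_latency < MAX_LATENCY_S,
        "accuracy": result.accuracy >= MIN_ACCURACY,
        "cost": cost_per_participant < MAX_COST_PER_PARTICIPANT_USD,
    }
```

Reporting each target separately, rather than a single pass/fail, makes it easier to see which dimension a given milestone still falls short on.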

Requirements

  • Real-time translation.
  • Understanding of all languages, including accents (start with the 20 most popular and build towards all that have enough data).
  • Optical Character Recognition and live document scanning, including high-quality translation of handwritten text.
  • Connection with microphone and/or in room audio collecting devices.
  • Pre-loaded with process-specific terms, drawing on agreed definitions or contextualisation when necessary.
  • Adapt locally contextual elements to the closest possible equivalent, flagged as an approximate translation.
  • Easy usability, e.g. camera access and navigating information. On-screen text in an accessible typeface, size, and colour (adjustable).
  • Challenges to address: must distinguish between multiple voices and work in rooms where several conversations happen at once.
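The glossary and approximate-translation requirements above could be prototyped with a simple lookup structure. This is a sketch under stated assumptions: the `GlossaryEntry` type, the `glossary_hits` helper, and the sample entries are illustrative placeholders, not agreed definitions, and plain substring matching stands in for proper term detection.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class GlossaryEntry:
    translation: str
    approximate: bool = False  # True when only a closest equivalent exists


# Hypothetical pre-loaded glossary: source term -> target language -> entry.
# The entries are illustrative examples only.
GLOSSARY = {
    "citizens' assembly": {
        "fr": GlossaryEntry("assemblée citoyenne"),
        "es": GlossaryEntry("asamblea ciudadana"),
    },
    "town hall": {
        "fr": GlossaryEntry("réunion publique", approximate=True),
    },
}


def glossary_hits(text: str, target_lang: str) -> List[Tuple[str, str, bool]]:
    """Return (term, agreed translation, approximate flag) for glossary terms in text."""
    lowered = text.lower()
    hits = []
    for term, by_lang in GLOSSARY.items():
        entry = by_lang.get(target_lang)
        # Naive case-insensitive substring match; a real system would need tokenisation.
        if entry is not None and term in lowered:
            hits.append((term, entry.translation, entry.approximate))
    return hits
```

The hits could be passed to the translation step as constraints and the `approximate` flag surfaced in the UI, so participants can see when a term has no exact equivalent in their language.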

Existing Limitations

Right now, either an exchange language (e.g. English) is selected for all participants to deliberate in, or each participant (or small cluster of participants) is provided with their own human translator. Written documents are translated with a general-purpose tool such as Google Translate.

Milestones

  1. On-device laptop translation across 5 languages that captures audio and translates it on screen in real time
  2. Laptop and mobile app that translates audio on screen in real time across 20 languages and translates images captured via download or camera (e.g. PDFs or post-it notes)
  3. Laptop and mobile app that translates every possible language verbally in real time to in-ear devices and connects with any shared data collection devices or repos

Starting Points

Explore integrating existing technologies (e.g. Google Translate, ChatGPT, OpenAI Whisper) into a usable UI to test needs.