Built-in AI in the wild: A Mastodon translation success story
By Chrome for Developers
Key Concepts
- Mastodon: A decentralized social network where posts are called "toots."
- Toots: Posts on Mastodon.
- Language Detection: The process of identifying the language of a given text.
- Translation API: A service that translates text from one language to another.
- Progressive Web App (PWA): A type of application software delivered through the web, built using common web technologies including HTML, CSS, and JavaScript.
- Chrome DevTools: A set of web developer tools built into the Google Chrome browser.
- JavaScript: A programming language commonly used for web development.
- Pull Request (PR): A mechanism in software development for submitting changes to a codebase.
- Grapheme: The smallest functional unit of a writing system.
- Debouncing: A programming technique used to control how often a function is called.
- LibreTranslate: An open-source translation library.
- Freeform Prompt API: An API that allows users to interact with language models using natural language prompts.
- Prompt Engineering: The process of designing and refining prompts for language models.
Built-in AI in the Wild: A Mastodon Translation Success Story
This presentation by Thomas Steiner details a successful implementation of built-in AI features within the Mastodon client "Elk," focusing on improving language detection and translation for users.
The Problem: Language Mismatches and Translation Costs on Mastodon
- Language Detection Inefficiency: Mastodon's default language detection is often triggered late in the typing process, and the UI indicator is subtle. This leads to many "toots" being incorrectly labeled with the default language (often English when the UI is in English), even if the content is in another language.
- Example: A Ukrainian musician's post asking for the German word for a specific social awkwardness was initially detected as English.
- Server-Side Translation Costs: Mastodon relies on server-side translation APIs. While free tiers exist, larger instances with many users can incur significant costs, which are often a burden for community-supported server administrators.
- Privacy Concerns with Server-Side Translation: Translating private messages (DMs) requires sending data to a remote server, raising privacy concerns.
The Solution: Elk Client and Built-in AI
Thomas Steiner, a developer on the Chrome team, decided to address these challenges by integrating built-in AI capabilities into the Mastodon client "Elk" (elk.zone). Elk is an open-source PWA built with Vue.
1. Enhanced Language Detection
- Leveraging the Chrome Language Detection API: The Chrome team has developed a "well-paved road" for language detection, making the API simple to use.
- Technical Detail: The API can be demonstrated with two lines of JavaScript in Chrome DevTools:
const detector = await LanguageDetector.create();
const result = await detector.detect(text);
The detect function returns an array of language candidates with their confidence scores.
- Elk's Implementation: A Pull Request (PR) was submitted to Elk to integrate this API.
- Key Features:
- Automatic Detection: Detects the composition language as the user types.
- Real-time Updates: The detected language updates dynamically.
- Reliability: Detection becomes more reliable with a minimum of six "graphemes" (smallest functional units of a writing system).
- User Experience Improvement: Significantly helps users who compose toots in different languages and forget to update the language picker.
- Graceful Degradation: On browsers that don't support the API, nothing happens, ensuring no disruption.
- Code Details: The implementation involves approximately 40 lines of Vue code, including API setup, feature detection, and grapheme counting.
- Addressing an Initial Flaw: An early version of the PR triggered language detection on every keyup event, which could be inefficient, especially on less powerful devices.
- Community Fix: A follow-up PR introduced low-level debouncing to optimize performance and prevent excessive API calls.
- Low-Performance Optimization: Elk includes a setting to turn off language detection entirely for very low-performance devices.
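The feature detection, grapheme counting, and detection steps listed above can be sketched roughly as follows. This is a minimal illustration and not Elk's actual Vue code: the function names `countGraphemes` and `detectCompositionLanguage` and the injectable `api` parameter are assumptions made for this example, and the sketch assumes the `LanguageDetector` global that current Chrome versions expose.

```javascript
// Count user-perceived characters (graphemes) rather than UTF-16 code units.
function countGraphemes(text) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  return [...segmenter.segment(text)].length;
}

// Threshold mentioned in the talk: detection is more reliable from six graphemes on.
const MIN_GRAPHEMES = 6;

// Returns the most likely language tag, or null when the API is unavailable
// or the text is still too short. `api` is a hypothetical injection point that
// defaults to the browser global, so the function degrades gracefully elsewhere.
async function detectCompositionLanguage(text, api = globalThis.LanguageDetector) {
  if (!api) return null; // unsupported browser: do nothing
  if (countGraphemes(text.trim()) < MIN_GRAPHEMES) return null;
  const detector = await api.create();
  const [best] = await detector.detect(text);
  return best ? best.detectedLanguage : null;
}
```

Creating a detector on every call keeps the sketch short; a real client would probably create it once and reuse it.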
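Debouncing, the technique the community fix introduced, can be illustrated with a generic helper. This is a sketch of the general pattern only: the `debounce` helper and the wait time in the usage comment are assumptions, not the code from the follow-up PR.

```javascript
// Trailing-edge debounce: `fn` runs once, `waitMs` milliseconds after the
// most recent call, so a burst of keyup events triggers a single detection.
function debounce(fn, waitMs) {
  let timer;
  return (...args) => {
    clearTimeout(timer); // cancel the pending run scheduled by the previous call
    timer = setTimeout(() => fn(...args), waitMs);
  };
}

// Hypothetical usage in a browser:
// const onKeyup = debounce((event) => runDetection(event.target.value), 300);
// editor.addEventListener('keyup', onKeyup);
```

With this pattern the detector only runs once the user pauses typing, which matters most on less powerful devices.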
2. Efficient and Cost-Effective Translation
- Leveraging the Chrome Translator API: Similar to language detection, the translator API is also a "well-paved road."
- Technical Detail: Demonstrable in DevTools:
const translator = await Translator.create({ sourceLanguage: "de", targetLanguage: "en" }); // German to English
const translatedText = await translator.translate("Guten Morgen"); // "Good morning"
- Elk's Translation Stack: Elk utilizes its own infrastructure powered by the open-source translation library LibreTranslate.
- Benefits:
- No API Quota Costs: Eliminates the per-translation costs associated with commercial APIs.
- Caveat: Self-hosting removes API quota costs, but server maintenance and operational costs remain.
- Implementation Details:
- Progressive Enhancement: The translator API is added as a progressive enhancement, meaning it's available if supported by the browser.
- Vue Front-end Integration: The implementation primarily hooks up the Vue front-end code to the translation logic.
- Pre-Translation Language Detection: A crucial detail is that language detection is performed before translation. This allows Elk to catch mislabeled toots and translate them correctly, rather than attempting to translate based on an incorrect source language.
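The detect-then-translate flow with progressive enhancement can be sketched as below. This is an illustration under stated assumptions rather than Elk's implementation: `translatePost`, the injectable `apis` object, and the `fallback` callback (standing in for a server-side path such as one backed by LibreTranslate) are hypothetical names. The sketch assumes the `Translator` and `LanguageDetector` globals that current Chrome versions expose.

```javascript
// Translates `text` into `targetLanguage`, on-device when the browser allows it.
// `labeledLanguage` is the language tag attached to the post, which may be wrong.
// `fallback` is a hypothetical server-side translate function.
async function translatePost(text, labeledLanguage, targetLanguage, fallback, apis = {
  Translator: globalThis.Translator,
  LanguageDetector: globalThis.LanguageDetector,
}) {
  let sourceLanguage = labeledLanguage;

  // Detect first, so a mislabeled post is translated from its real language.
  if (apis.LanguageDetector) {
    const detector = await apis.LanguageDetector.create();
    const [best] = await detector.detect(text);
    if (best && best.detectedLanguage !== 'und') {
      sourceLanguage = best.detectedLanguage;
    }
  }

  // Progressive enhancement: use the built-in translator only if it exists
  // and supports this language pair.
  if (apis.Translator) {
    const availability = await apis.Translator.availability({ sourceLanguage, targetLanguage });
    if (availability !== 'unavailable') {
      const translator = await apis.Translator.create({ sourceLanguage, targetLanguage });
      return translator.translate(text);
    }
  }

  // Otherwise keep the existing server-side behavior.
  return fallback(text, sourceLanguage, targetLanguage);
}
```

Passing the detected language to the fallback as well means the server-side path also benefits from the corrected label.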
3. Availability of Changes
- The merged PRs are not yet in the latest core release of Elk (January).
- Users can access the latest features by switching to the "I'm feeling lucky" release at main.elk.zone, which is deployed directly from the main branch.
Experimentation with Freeform Prompt API
- The Challenge of Exploration: While the Freeform Prompt API is also a "paved road," understanding its potential applications is the challenge.
- Applying to the "Weird Hobby": Thomas Steiner experimented with the API to find German words for specific concepts.
- Process:
- Create a Session: Initialize a session with the language model, specifying input and output types as "text" and setting the language to English. (Note: The output language was not set to German because German output was still pending security approval.)
- Prompt Engineering: Craft a prompt to guide the model. For example: "What is the German word for when you wave back at someone who wasn't waving at you? Respond with just one word. It can be absurdly long and ridiculous. No dashes."
- Outcome: The model provided a response, demonstrating the potential of prompt engineering to elicit specific outputs, even for humorous or niche requests. The example highlights the model's ability to generate creative, albeit sometimes absurd, compound words, a characteristic of the German language.
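The session setup and prompt described above might look roughly like this. It is a sketch under assumptions, not the code shown in the talk: `buildPrompt`, `askForGermanWord`, and the injectable `api` parameter are hypothetical names. The sketch assumes the `LanguageModel` global and the `expectedInputs` and `expectedOutputs` options of Chrome's Prompt API.

```javascript
// Builds the prompt quoted in the talk for an arbitrary description.
function buildPrompt(description) {
  return `What is the German word for ${description}? ` +
    'Respond with just one word. It can be absurdly long and ridiculous. No dashes.';
}

// Returns the model's one-word answer, or null when the Prompt API is missing.
// `api` is a hypothetical injection point that defaults to the browser global.
async function askForGermanWord(description, api = globalThis.LanguageModel) {
  if (!api) return null;
  const session = await api.create({
    // Text in, text out, both declared as English, as described in the talk.
    expectedInputs: [{ type: 'text', languages: ['en'] }],
    expectedOutputs: [{ type: 'text', languages: ['en'] }],
  });
  try {
    const answer = await session.prompt(buildPrompt(description));
    return answer.trim();
  } finally {
    session.destroy(); // free the model resources held by the session
  }
}
```

Keeping the prompt in its own function makes it easy to iterate on the wording, which is most of the work in prompt engineering.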
Conclusion and Key Takeaway
- Leveraging AI for Specific Tasks: Artificial intelligence is excellent for tasks it's designed for, such as language detection and translation.
- The Importance of Human Intelligence: However, it's crucial not to forget the "human touch" and human intelligence, especially when dealing with nuanced or creative applications.
- Resources:
- Learn more about built-in AI on developer.chrome.com/docs/ai/builtin.
- Sign up for the early preview program for updates on new APIs and features.
Thomas Steiner concludes by offering his assistance for anyone needing the German word for something, reinforcing the personal and engaging nature of his presentation.