Revolutionizing Typing Experiences: Google's AI-Driven Proofread Feature

Proofread, a Gboard feature powered by a server-side Large Language Model, corrects whole sentences and paragraphs with a single tap.

A Google research team has introduced Proofread, a feature powered by a server-side Large Language Model (LLM) that corrects entire sentences and paragraphs with a single tap. It has launched on Pixel 8 devices, where it serves thousands of users daily.

Figure: AI-driven typing accuracy in one tap

Gboard, Google’s keyboard for mobile devices, uses statistical decoding to offer a smooth typing experience and supports both automatic and manual error correction. Proofread adds LLM-based correction at the sentence and paragraph level, which makes fixing longer passages more efficient.

The System Behind Proofread

The system comprises four key components: data generation, metrics design, model tuning, and model serving. For data generation, an error synthesis framework builds datasets by injecting common keyboard errors into text to simulate real user input. Additional steps keep the data distribution close to the Gboard domain.
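
The idea of error synthesis can be illustrated with a minimal Python sketch. It is not Google's framework: the adjacency map `QWERTY_NEIGHBORS`, the four error types, and the function names `synthesize_errors` and `make_pairs` are all assumptions chosen for illustration. The sketch corrupts clean text with keyboard-style mistakes and returns (noisy, clean) pairs that could serve as training examples.

```python
import random

# Hypothetical, partial QWERTY adjacency map (illustrative only, not Gboard's).
QWERTY_NEIGHBORS = {
    "a": "qwsz", "e": "wsdr", "i": "ujko", "o": "iklp",
    "t": "rfgy", "n": "bhjm", "s": "awedxz", "r": "edft",
}

ERROR_KINDS = ["substitute", "omit", "duplicate", "transpose"]


def synthesize_errors(text: str, error_rate: float, rng: random.Random) -> str:
    """Corrupt letters in `text` with probability `error_rate` each."""
    chars = list(text)
    out = []
    i = 0
    while i < len(chars):
        ch = chars[i]
        if ch.isalpha() and rng.random() < error_rate:
            kind = rng.choice(ERROR_KINDS)
            if kind == "substitute" and ch.lower() in QWERTY_NEIGHBORS:
                # Replace the key with a neighbouring key, keeping the case.
                sub = rng.choice(QWERTY_NEIGHBORS[ch.lower()])
                out.append(sub.upper() if ch.isupper() else sub)
            elif kind == "omit":
                pass  # drop the character
            elif kind == "duplicate":
                out.extend([ch, ch])
            elif kind == "transpose" and i + 1 < len(chars):
                # Swap with the next character and skip over it.
                out.extend([chars[i + 1], ch])
                i += 1
            else:
                out.append(ch)  # chosen error not applicable here
        else:
            out.append(ch)
        i += 1
    return "".join(out)


def make_pairs(clean_sentences, error_rate=0.1, seed=0):
    """Return (noisy_input, clean_target) pairs, reproducible via `seed`."""
    rng = random.Random(seed)
    return [(synthesize_errors(s, error_rate, rng), s) for s in clean_sentences]
```

Calling `make_pairs(["this is a test sentence"], error_rate=0.2)` yields one pair whose second element is the untouched sentence. A production pipeline would presumably draw error types and rates from observed typing data rather than from a uniform choice.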

Figure: Error synthesis framework for data generation

Several metrics evaluate the model from different angles. Because a longer text can have many acceptable corrections, the key metrics are LLM-based checks of whether the output still contains grammatical errors and whether it preserves the intended meaning.
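
One way to turn these two checks into a single number is sketched below in Python. The rule that an output counts as bad when either check fails is an assumption made for illustration, not a definition taken from the work itself. The two judge functions are passed in as parameters: in the real system they would be LLM-based, while here any callable with the same shape will do. The name `bad_ratio` is hypothetical.

```python
from typing import Callable, Iterable, Tuple


def bad_ratio(
    examples: Iterable[Tuple[str, str]],
    has_grammar_error: Callable[[str], bool],
    same_meaning: Callable[[str, str], bool],
) -> float:
    """Fraction of (reference, output) pairs that fail either check.

    Assumption: an output is "bad" if it still contains a grammatical
    error or if its meaning differs from the reference text.
    """
    pairs = list(examples)
    if not pairs:
        raise ValueError("need at least one example")
    bad = sum(
        1
        for reference, output in pairs
        if has_grammar_error(output) or not same_meaning(reference, output)
    )
    return bad / len(pairs)
```

Because the judges are injected, the same function can be exercised with simple stand-in checks during testing and with model-based judges during evaluation.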

Model Tuning and Serving

The model is first trained with Supervised Fine-Tuning and then tuned with Reinforcement Learning (RL). The RL stage uses two techniques, Global Reward and Direct Reward, to improve performance further. The reported results indicate that RL tuning reduces grammatical errors, yielding a 5.74% relative reduction in the Bad ratio of the PaLM2-XS model.
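
To make the 5.74% figure concrete, the short Python sketch below reads it as a relative reduction, which is how it is interpreted here, and shows how such a value is computed. The helper name `relative_reduction` is hypothetical and the ratios used in the usage note are made up for the arithmetic; they are not reported results.

```python
def relative_reduction(before: float, after: float) -> float:
    """Relative reduction of a ratio: (before - after) / before."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before
```

With hypothetical values, a Bad ratio that falls from 0.20 to 0.18 is an absolute drop of 2 percentage points but a relative reduction of 10%, since `relative_reduction(0.20, 0.18)` is approximately 0.10. A relative figure therefore says nothing about the absolute level of the metric on its own.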

Figure: Model tuning techniques for improved performance

The model is served on TPU v5 in the cloud. Latency is reduced through quantization, bucketing, input segmentation, and speculative decoding; speculative decoding alone cuts median latency by 39.4%.
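
Two of these optimizations, input segmentation and bucketing, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the production serving code: the bucket sizes in `BUCKETS` are invented, token counts are approximated by whitespace-separated words, and the function names are hypothetical. Segmentation splits a paragraph into sentences that can be processed separately, and bucketing rounds each segment up to the nearest fixed length so that the accelerator only sees a small set of input shapes.

```python
import bisect
import re

# Hypothetical bucket sizes (in tokens); real values are not given in the text.
BUCKETS = [16, 32, 64, 128]


def segment(text: str) -> list:
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def bucket_for(length: int) -> int:
    """Return the smallest bucket that can hold `length` tokens."""
    idx = bisect.bisect_left(BUCKETS, length)
    if idx == len(BUCKETS):
        raise ValueError(f"input of {length} tokens exceeds the largest bucket")
    return BUCKETS[idx]


def plan_requests(text: str) -> list:
    """Pair each segment with the padded length it would be served at.

    Token count is approximated by the number of whitespace-separated
    words, which is an assumption made to keep the sketch self-contained.
    """
    return [(seg, bucket_for(len(seg.split()))) for seg in segment(text)]
```

For example, `plan_requests("Hello there. How are you?")` produces two segments, each assigned to the 16-token bucket. Shorter segments keep per-request work small, and a fixed set of padded lengths avoids handling a new input shape for every request.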

The Future of Typing Experiences

This work shows that LLMs can improve the typing experience by delivering high-quality sentence and paragraph corrections. It also points to a broader role for LLMs in how users enter text on their devices.