Navigate your inbox and calendar in a flash, with an AI-powered search assistant for emails

Introduction

Ask away with Ask AI

Superhuman’s Ask AI product has users declaring: “I can’t live without it!”

Searching through the 45,873 emails in your inbox and finding yourself unable to recall the right keyword or fumbling with Gmail tags is an all-too-common frustration for busy people who spend their days in email and calendars. Superhuman set out to solve this challenge with Ask AI, its AI-powered search assistant. Designed to transform how users navigate their inboxes and calendars, Ask AI delivers instant, context-aware answers to even the most complex queries – such as “When did I meet the founder of that series A startup for lunch?”

Problem

Who, what, when, where, why use email keyword search?

Superhuman noticed there was one area where users were spending a significant amount of time – email and calendar search. Users spent up to 35 minutes per week trying to recall exact phrases and sender names using traditional keyword search in their email clients.

The team realized that a semantic search experience could improve productivity and help users spend less time searching.

In the few months since the release of Ask AI, Superhuman has already seen users cut search time by 5 minutes every week, for a 14% time savings.

Cognitive architecture

Transforming queries into insightful responses

When initially designing the Ask AI architecture, the Superhuman team used a single-prompt LLM that performed retrieval augmented generation (RAG). The goal was to empower users to query their inboxes and calendars and retrieve relevant tasks, events, or messages.

The diagram below shows their first version, which generated retrieval parameters using JSON mode; these parameters were passed through hybrid search and heuristic reranking before the LLM produced an answer.
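
As a rough illustration of that v1 flow, here is a minimal Python sketch – the function names, prompt, and heuristics are hypothetical stand-ins, not Superhuman's actual code:

```python
import json
from dataclasses import dataclass

@dataclass
class Email:
    sender: str
    date: str   # ISO date, e.g. "2024-05-03"
    body: str

PARAM_PROMPT = (
    "Extract search parameters from the user's query. Respond with JSON: "
    '{"keywords": [...], "sender": str | null, "after": str | null}'
)

def call_llm(system: str, user: str) -> str:
    """Placeholder for a real JSON-mode LLM call."""
    return json.dumps({"keywords": ["lunch", "founder"], "sender": None, "after": None})

def hybrid_search(params: dict, inbox: list[Email]) -> list[Email]:
    # Stand-in for combined semantic + keyword retrieval (keyword-only here).
    keywords = [k.lower() for k in params.get("keywords", [])]
    return [e for e in inbox if any(k in e.body.lower() for k in keywords)]

def heuristic_rerank(hits: list[Email]) -> list[Email]:
    # One plausible heuristic: prefer more recent messages.
    return sorted(hits, key=lambda e: e.date, reverse=True)

def answer(query: str, inbox: list[Email]) -> str:
    params = json.loads(call_llm(PARAM_PROMPT, query))     # 1. JSON-mode params
    hits = heuristic_rerank(hybrid_search(params, inbox))  # 2. retrieve + rerank
    context = "\n".join(e.body for e in hits[:5])
    return call_llm(f"Answer using only this context:\n{context}", query)  # 3. answer
```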

However, the single-prompt design had a few shortcomings. First, the LLM did not always follow task-specific instructions reliably. The team also found that it struggled to reason about dates accurately (e.g. identifying upcoming deadlines). Finally, the system handled certain search types well – such as finding flights or summarizing company updates – but struggled with others, such as calendar availability or complex multi-step searches.

These limitations pushed the Superhuman team to transition to a more complex cognitive architecture. Their new agent architecture (as shown in the diagram below) could understand user intent and provide more accurate responses. It worked as follows:

1. Query classification and parameter generation

When a user submits a query, two parallel processes occur for the Ask AI agent: 

  • Tool classification: The system classifies the query based on user intent to determine which tools or data sources to activate. The classifier identifies whether the query requires:
    1) Email search only
    2) Email + calendar event search
    3) Checking availability
    4) Scheduling an event
    5) Direct LLM response without tools
  • Metadata extraction: Simultaneously, the system extracts relevant tool parameters such as time filters, sender names, or relevant attachments. These will be used in retrieval to narrow the scope of search to improve accuracy. 

This tool classification ensures that only relevant tools are invoked, which improves response quality. The classification is also used in the response generation step to select which prompts to use.
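
A hedged sketch of this step in Python, assuming a thread pool for the two parallel LLM calls – the intent labels mirror the five categories above, but the functions themselves are placeholders:

```python
from concurrent.futures import ThreadPoolExecutor
from enum import Enum

class Intent(Enum):
    EMAIL_SEARCH = 1
    EMAIL_AND_CALENDAR_SEARCH = 2
    CHECK_AVAILABILITY = 3
    SCHEDULE_EVENT = 4
    DIRECT_ANSWER = 5

def classify_intent(query: str) -> Intent:
    """Placeholder for an LLM classifier over the five intents."""
    return Intent.EMAIL_SEARCH

def extract_metadata(query: str) -> dict:
    """Placeholder for LLM extraction of time filters, senders, attachments."""
    return {"after": "2024-01-01", "sender": None, "has_attachment": False}

def plan(query: str) -> tuple[Intent, dict]:
    # The two LLM calls are independent, so they can run concurrently
    # to help stay within a tight latency budget.
    with ThreadPoolExecutor(max_workers=2) as pool:
        intent_future = pool.submit(classify_intent, query)
        metadata_future = pool.submit(extract_metadata, query)
        return intent_future.result(), metadata_future.result()
```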

2. Task-specific tool use

Once the query was classified, the appropriate tools would be called. If the task required search, the query would be passed to the search tool (a hybrid semantic + keyword search), with reranking algorithms prioritizing the most relevant information.
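
The post doesn't specify how the semantic and keyword result lists are combined, but reciprocal rank fusion is one common approach; treat this small sketch as an assumption rather than Superhuman's actual method:

```python
def rrf_merge(semantic_ids: list[str], keyword_ids: list[str], k: int = 60) -> list[str]:
    """Merge two ranked lists of document ids via reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for ranking in (semantic_ids, keyword_ids):
        for rank, doc_id in enumerate(ranking):
            # Documents ranked highly by either retriever accumulate score.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Example: the two retrievers rank overlapping documents differently.
merged = rrf_merge(["a", "b", "c"], ["c", "a", "d"])
print(merged)  # ['a', 'c', 'b', 'd']
```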

3. Response generation

Based on the classification in step 1, the system would select different prompts and preferences. These prompts contained context-specific instructions, query-specific examples, and encoded user preferences. Guided by this system prompt, the LLM would then synthesize the retrieved information into a tailored response.

Instead of relying on one large, all-encompassing prompt, the Ask AI agent used task-specific guidelines during post-processing. This allowed the agent to maintain consistent quality across diverse tasks.
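
To make this concrete, here is a sketch of routing the classifier's label to a task-specific prompt – the templates and preference fields are invented for illustration:

```python
# Hypothetical task-specific prompt templates, one per classified intent.
TASK_PROMPTS = {
    "email_search": (
        "You answer questions about the user's email. Cite the message you used. "
        "Example: Q: 'When did I meet the founder for lunch?' ..."
    ),
    "check_availability": (
        "You report free/busy slots from the user's calendar. Never invent events. "
        "Example: Q: 'Am I free Thursday at 2pm?' ..."
    ),
    "direct_answer": "Answer directly without referencing email or calendar.",
    # ...plus templates for the remaining intents.
}

def build_system_prompt(intent: str, preferences: dict) -> str:
    # Combine the task-specific guidelines with encoded user preferences.
    guidelines = TASK_PROMPTS[intent]
    prefs = "\n".join(f"- {key}: {value}" for key, value in preferences.items())
    return f"{guidelines}\n\nUser preferences:\n{prefs}"

print(build_system_prompt("email_search", {"timezone": "US/Pacific"}))
```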

By transitioning to this parallel, multi-process architecture, Superhuman created a more reliable agent and also hit these RAG expectations:
  • Sub-2-second responses to maintain a smooth user experience
  • Reduced hallucinations through post-processing layers and brief follow-up questions

Prompt engineering

Double dipping

To ensure consistent quality across responses, Superhuman implemented a few different prompt engineering strategies. First, they structured their prompts by adding in chatbot rules to define system behavior, task-specific guidelines, and semantic few-shot examples to guide the LLM. This nesting of rules helped the LLM reliably follow instructions. 

The most interesting technique the Superhuman team adopted was "double dipping" instructions. By repeating key instructions in both the initial system prompt and final user message, they ensured that essential guidelines were rigorously followed. This dual reinforcement of instructions helped maintain clarity and consistency, leading to more reliable outputs.
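
A minimal sketch of double dipping when assembling the chat messages – the rule text is invented, but the structure (key instructions repeated in both the system prompt and the final user message) is the technique described above:

```python
# Hypothetical key rules; in practice these would be the instructions the
# model most often failed to follow from the system prompt alone.
KEY_RULES = (
    "Only cite emails returned by search. If the answer is uncertain, "
    "ask a brief follow-up question instead of guessing."
)

def build_messages(context: str, query: str) -> list[dict]:
    return [
        {"role": "system", "content": f"You are an inbox assistant.\n{KEY_RULES}"},
        {"role": "user", "content": (
            f"Context:\n{context}\n\n"
            f"Question: {query}\n\n"
            f"Remember: {KEY_RULES}"  # repeated to reinforce compliance
        )},
    ]
```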

Evaluation

Validating results with feedback

When they began testing Ask AI's performance, Superhuman first evaluated it against a static dataset of questions and answers. They measured retrieval accuracy on this test set and compared how changes to their prompts impacted accuracy.
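
A simple version of this kind of evaluation loop might look like the following, with a placeholder retrieve() and a toy dataset standing in for Superhuman's actual test set:

```python
def retrieve(question: str, prompt_version: str) -> list[str]:
    """Placeholder: run the pipeline with a given prompt, return document ids."""
    return ["email_123"]

# Toy static dataset: each question is paired with the email that answers it.
DATASET = [
    {"question": "When is the board meeting?", "expected_id": "email_123"},
    {"question": "Who sent the Q3 report?", "expected_id": "email_456"},
]

def retrieval_accuracy(prompt_version: str) -> float:
    # Count how often the expected email appears in the retrieved set.
    hits = sum(
        example["expected_id"] in retrieve(example["question"], prompt_version)
        for example in DATASET
    )
    return hits / len(DATASET)

# Compare prompt variants against the same fixed test set.
for version in ("prompt_v1", "prompt_v2"):
    print(version, retrieval_accuracy(version))
```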

The team also adopted a "launch and learn" approach, systematically rolling out Ask AI to more users. First, they collected thumbs up / thumbs down feedback from internal pod stakeholders. Then, they launched the feature to the whole company with the same method.

Once they received enough positive feedback, Ask AI was launched to a dedicated AI beta group, then to their community champions, and eventually a beta waitlist. This strategy allowed the Superhuman team to identify the most pressing user needs and prioritize improvements accordingly – leading to a four-month testing process that culminated in a GA launch.

UX

Dual power: Integrating Ask AI for email search flexibility

Ask AI integrates into Superhuman's email app interface in two key ways:

1. Within the search bar, where users can toggle between traditional search and Ask AI.

2. As a chat-like interface, where users can ask follow-up questions and see the conversation history.

The team deliberated a lot on whether to integrate Ask AI solely in search, solely as an agent, or both. Ultimately, through user feedback and testing, they found that there was value to users in both options – so they kept both interfaces available. 

With Ask AI, users also have the flexibility to choose between semantic or regular search, offering greater control over their search experience. To avoid incorrect answers, Ask AI also validates uncertain results with the user before providing a final answer. As such, the Superhuman team paid careful attention to response speed, aiming to provide answers as quickly as possible while maintaining accuracy.
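
One way such validation could be gated, sketched with an assumed confidence score and threshold – the post doesn't detail how Ask AI decides a result is uncertain:

```python
def respond(best_answer: str, confidence: float) -> str:
    THRESHOLD = 0.8  # hypothetical cutoff for "uncertain"
    if confidence < THRESHOLD:
        # Ask a brief follow-up question rather than risk a wrong answer.
        return f"I found this, but I'm not certain: {best_answer!r}. Is that what you meant?"
    return best_answer

print(respond("Lunch with Alex on May 3", 0.6))
```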

Conclusion

Smarter searches, happier users

Superhuman's Ask AI represents a thoughtful approach to transforming email search through AI. By homing in on user needs, iterating quickly, and employing clever prompting techniques like "double dipping" instructions, they've created a tool that slashes search time and improves the overall email experience.

As AI continues to advance, tools like Ask AI pave the way for more capable assistants that seamlessly blend into our everyday workflows.
