Designing a Conversational AI experience on the Miko Mini robot
Conversational AI UX | Design for Personality
Team
- UX Lead – Sharang
- Visual Design – Dakshita, Channy
- Product Lead – Ayush, Akshat Adani
- Engineering Lead – Omkar

Background & context
Miko is a brand of educational robots designed for children aged 6–10, engaging them through interactive conversations, storytelling, and educational games.
Miko Mini is a voice-first device launched in 2023 to give young users a hands-free experience.
As Principal Designer, I led the design of the initial voice experience. This case study highlights key components of building a Conversational AI experience.
The industrial design was crafted in collaboration with Pentagram Studio, while both software and hardware development were handled in-house by the Miko team.
Table of contents
- Design research
- Voice navigation, components and triggers
- Design for errors and safety
- Design for personality
- User testing and takeaways

1. Design Research
This section covers the design research that informed Miko Mini's voice experience.


2. Voice navigation, Agent states and triggers
This section covers the key user flows for handling voice interactions, keyword spotting, intent matching, and various scenarios where system states overlap.

This voice flow illustrates key user-facing agent states on Miko Mini—such as idle, listening, thinking, acting, and responding. These states signal the AI agent’s behavior, helping users interpret its actions, anticipate responses, and maintain a sense of clarity and control throughout the interaction.

This is the backend logic that runs in sync with the user-facing voice flow. It accounts for various agent states as well as system-level error conditions such as server downtime, connectivity issues, ASR failures, and backend processing errors.
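The user-facing states described above can be sketched as a small state machine. This is a hypothetical, simplified model, not Miko's actual implementation: the state names come from the flow above, while the transition table and the "any state can fall back to error" rule are illustrative assumptions.

```python
from enum import Enum, auto

class AgentState(Enum):
    IDLE = auto()
    LISTENING = auto()
    THINKING = auto()
    ACTING = auto()
    RESPONDING = auto()
    ERROR = auto()

# Assumed happy-path transitions; the real flow may differ.
TRANSITIONS = {
    AgentState.IDLE: {AgentState.LISTENING},
    AgentState.LISTENING: {AgentState.THINKING, AgentState.IDLE},
    AgentState.THINKING: {AgentState.ACTING, AgentState.RESPONDING},
    AgentState.ACTING: {AgentState.RESPONDING},
    AgentState.RESPONDING: {AgentState.IDLE},
    AgentState.ERROR: {AgentState.IDLE},
}

class VoiceAgent:
    def __init__(self):
        self.state = AgentState.IDLE

    def transition(self, new_state):
        # Any state may fall to ERROR (server downtime, ASR failure, etc.).
        if new_state is AgentState.ERROR or new_state in TRANSITIONS[self.state]:
            self.state = new_state
        else:
            raise ValueError(f"Illegal transition: {self.state} -> {new_state}")
```

Making illegal transitions raise loudly, rather than silently coercing state, is what lets the UI layer trust that "thinking" never jumps straight back to "listening" without an intervening user turn.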
User-driven triggers

User-triggered voice navigation on Miko Mini includes:
- Wake word activation: Saying “Hey Miko” initiates interaction.
- Explicit commands: Users follow with requests like “Tell me a story” or “Play a song.”
- Follow-up prompts: Continued dialogue is possible without repeating the wake word—for example, “Play story of Red Riding Hood.”
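The follow-up-prompt behavior above implies some session logic: commands are accepted without the wake word only while a conversation is "warm". A minimal sketch, assuming a fixed follow-up window (the window length and the `WakeWordSession` helper are hypothetical, not Miko's real values):

```python
import time

WAKE_WORD = "hey miko"
FOLLOW_UP_WINDOW_S = 8.0  # assumed window; the real value is a product decision

class WakeWordSession:
    """Decides whether an utterance can be accepted without the wake word."""

    def __init__(self):
        self.last_interaction = None

    def accepts(self, utterance, now=None):
        now = time.monotonic() if now is None else now
        text = utterance.lower().strip()
        # Wake word always opens (or refreshes) a session.
        if text.startswith(WAKE_WORD):
            self.last_interaction = now
            return True
        # Follow-up prompts are allowed while the session is still warm.
        if self.last_interaction is not None and now - self.last_interaction < FOLLOW_UP_WINDOW_S:
            self.last_interaction = now
            return True
        return False
```

Refreshing the timestamp on every accepted follow-up is what lets a child hold a multi-turn conversation without re-triggering "Hey Miko" each time.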
System-driven triggers

These triggers enable the system to perform automated tasks like slot-filling, content alerts, scheduled actions or even expressive behaviors that reflect the robot’s personality.
These system-driven triggers fall under two broad categories:
- Content engagement: Triggers like first session of the day, incoming calls from parents, alarms or reminders, and expressions like “I’m feeling bored” prompt the system to initiate relevant content or suggestions. A personalized recommendation system was designed to surface relevant content. These suggestions adapt over time based on demographics and user behavior.
- System maintenance: Events such as firmware updates, low battery alerts, or loss of internet connectivity prompt fallback responses or maintenance-related actions.
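The two-category split above can be expressed as a simple trigger-routing table. The event names and actions here are illustrative stand-ins; the real system's taxonomy is certainly richer.

```python
CONTENT_ENGAGEMENT = "content_engagement"
SYSTEM_MAINTENANCE = "system_maintenance"

# Hypothetical registry mapping events to (category, action) pairs.
SYSTEM_TRIGGERS = {
    "first_session_of_day": (CONTENT_ENGAGEMENT, "suggest_content"),
    "incoming_parent_call": (CONTENT_ENGAGEMENT, "announce_call"),
    "alarm_or_reminder":    (CONTENT_ENGAGEMENT, "play_reminder"),
    "firmware_update":      (SYSTEM_MAINTENANCE, "schedule_update"),
    "low_battery":          (SYSTEM_MAINTENANCE, "voice_alert"),
    "no_connectivity":      (SYSTEM_MAINTENANCE, "offline_fallback"),
}

def route_trigger(event):
    # Unknown events default to a harmless maintenance no-op.
    category, action = SYSTEM_TRIGGERS.get(event, (SYSTEM_MAINTENANCE, "ignore"))
    return {"event": event, "category": category, "action": action}
```

Keeping the routing declarative makes it easy to audit which system events can proactively interrupt a child, which matters for the state-overlap questions discussed later.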
Design for keyword spotting and user input scenarios
The system recognizes voice commands using keyword spotting, a technique that continuously listens in the background to detect specific trigger words or phrases.
Based on entity identification, the system can respond by serving the request, surfacing the recommendation system, or initiating slot filling.
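In spirit, the intent-matching step looks like the sketch below. This is a toy approximation: real keyword spotting runs on acoustic models over audio, not regexes over transcripts, and the intent names and patterns here are invented for illustration.

```python
import re

# Toy patterns standing in for trained keyword-spotting / intent models.
INTENT_PATTERNS = {
    "play_story": re.compile(r"\b(story|tale)\b"),
    "play_song": re.compile(r"\b(song|music)\b"),
    "set_brightness": re.compile(r"\bbrightness\b"),
}

def match_intent(transcript):
    text = transcript.lower()
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(text):
            # Entity/slot extraction would follow here (e.g. *which* story),
            # triggering slot-filling questions when a slot is missing.
            return intent
    # No confident match: fall back to the recommendation system.
    return "fallback_recommendation"
```

The fallback branch is the design-relevant part: rather than a dead-end "I didn't understand", an unmatched utterance surfaces recommendations, keeping the child engaged.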


Navigating across content
Since Miko Mini is a voice-first device, the user uses the wake word trigger "Hey Miko" and follow-up prompts like "Play Riddles", "Increase Brightness", or "Call Parent" to navigate across functionalities.

System state overlaps
In voice-first devices, the transition between system states is not always linear and certain states can take precedence over others.
For example, when the system is in a Listening state, should it disregard or respond to incoming calls from a parent?
This dynamic interplay between states is crucial for ensuring effective interaction, as the system must determine how to prioritize various inputs in real-time.
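One concrete way to model this interplay is a priority table consulted whenever two states collide. The ordering below is an illustrative assumption answering the example above (a parent call outranks listening), not Miko's actual policy.

```python
# Assumed priority ordering: higher number wins on overlap.
PRIORITY = {
    "content_suggestion": 1,
    "active_playback": 2,
    "listening": 3,
    "low_battery_alert": 4,
    "incoming_parent_call": 5,
}

def resolve(current, incoming):
    """Return the state that should take over when two states overlap."""
    return incoming if PRIORITY[incoming] > PRIORITY[current] else current
```

Under this policy, a parent call preempts listening, but a proactive content suggestion never interrupts active playback, which matches the intuition that user-initiated activity should outrank system-initiated nudges.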

3. Designing for errors and safety
AI outputs are inherently probabilistic and subject to errors ranging from hallucinations and bias to contextual misalignments. We built safeguards to handle edge cases, uncertainty, and misrecognition gracefully.

User input errors
These errors occur when Miko can’t understand what was said, perhaps because it was too quiet or unclear, or because of false positives and false negatives.

System level errors
These errors happen when something inside Miko isn’t working, such as no internet, low battery, or a needed update. Miko lets the user know gently so they can help fix it.
Designing for child safety
Safety was a foundational principle while developing Miko Mini, a companion learning robot for children. Given the high stakes of child-AI interaction, we prioritized minimizing harmful content and inappropriate behavior.
This involved optimizing the underlying language model to filter profanity, bias, and harmful outputs.
We collaborated with an AI linguist designer to fine-tune responses around sensitive and developmental-age-appropriate keywords.
Key safeguards:
- Multi-layered content filtering: Combined prompt hygiene, keyword blacklisting, and response post-processing to catch and filter offensive or unsafe outputs in real time.
- Bias and harm audits: We stress-tested the model using simulated child interactions to uncover implicit biases or unsafe edge cases.
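The multi-layered filtering described above can be sketched as a pipeline: clean the input, screen it against a blocklist, then post-process the model's output with a safe fallback. This is a minimal illustration; the production safeguards are model-based, and the blocklist words, function names, and fallback phrase here are all invented.

```python
# Stand-in keyword blacklist; the real list is curated and far larger.
BLOCKLIST = {"unsafe_example", "blocked_example"}
SAFE_FALLBACK = "Hmm, let's talk about something else!"

def prompt_hygiene(user_text):
    # Layer 1: normalize input before it reaches the model.
    return user_text.strip().lower()

def passes_keyword_filter(text):
    # Layer 2: reject text containing blocklisted keywords.
    return not any(word in BLOCKLIST for word in text.split())

def postprocess(response):
    # Layer 3: screen the model output too; fall back to a safe reply.
    return response if passes_keyword_filter(response) else SAFE_FALLBACK

def safe_respond(user_text, model):
    cleaned = prompt_hygiene(user_text)
    if not passes_keyword_filter(cleaned):
        return SAFE_FALLBACK
    return postprocess(model(cleaned))
```

Filtering both the input and the output is the point of layering: even if an unsafe prompt slips past the first screen, the response still passes through a final check before reaching the child.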
4. Designing for Personality
Designing for personality for Miko Mini refers to crafting a consistent, emotionally resonant character that feels alive, relatable, and engaging to children.
It combines external expressiveness (through voice, visuals, and motion) with internal consistency such as values, emotional intelligence, and knowledge boundaries.
Here is the personality framework for a social robot like Miko Mini:

Designing for internal personality traits
To a child, Miko isn’t just a robot, it's a friend. It has an internal personality: it understands emotions, stays curious, and speaks with warmth and care.
🌈 Backstory and worldview: Miko’s personality is intentionally designed as an ENFP — imaginative, enthusiastic, and empathetic. It brings a curious, optimistic worldview, helping children see challenges as playful opportunities to learn.
💖 Emotional intelligence: Miko adapts its tone and responses to mirror and validate children's feelings. (E.g., “That must feel frustrating. Want to talk about it?”).
This emotional responsiveness creates trust and models healthy emotional expression.
🛡️ Knowledge and values boundaries: Miko is trained to stay within child-appropriate topics, drawing on third-party content like Wolfram alongside in-house child-appropriate content.
🗣️ Communication style and tone: Miko’s language reflects its personality: warm, encouraging, playful. It avoids slang or sarcasm and instead uses child-friendly phrases like “Bot-tastic!” or “Oops-bloops!”. Its grammar is carefully crafted to model correct speech while also being fun and emotionally expressive.
Design for expressiveness or Multi-modal feedback
Miko Mini employs multi-modal feedback mechanisms to enhance user interactions and provide a seamless experience. These mechanisms include LED states, a GUI interface, motion, SFX (sound effects), and voice-over.
🙂 GUI: The graphical user interface (GUI) displays expression states, voice skill thumbnails, and system errors. It helps users understand the robot's current status and interact effectively.
🗣️ Voice-over: Voice-over provides clear auditory feedback and guidance to users. It ensures accessibility and enhances user understanding of system states or errors. (E.g., When the battery is low: “My battery is running low, please recharge me soon.”)
💡LED States: The LED system provides visual cues based on the robot’s status or activity. Different colors and animations are used to convey specific information.
🛞 Motion: Miko Mini uses dynamic facial expressions and physical movements to make interactions more engaging. These motions complement other feedback modes.
🎵 SFX (sound effects): Sound effects are integrated to reinforce feedback and add personality to interactions. For example, a cheerful chime plays when a voice skill is activated.
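The channels above have to stay in sync, which suggests a single mapping from system state to coordinated feedback across modalities. The state names, LED patterns, and GUI/SFX identifiers below are invented for illustration; only the low-battery voice-over line comes from the example earlier in this section.

```python
# Hypothetical state-to-feedback mapping coordinating all channels at once.
FEEDBACK = {
    "listening": {"led": "blue_pulse", "gui": "listening_face", "sfx": "soft_chime"},
    "thinking": {"led": "white_spin", "gui": "thinking_face", "sfx": None},
    "low_battery": {
        "led": "red_blink",
        "gui": "battery_icon",
        "sfx": "alert_tone",
        "voice_over": "My battery is running low, please recharge me soon.",
    },
}

def feedback_for(state):
    # Unknown states degrade gracefully to a neutral idle presentation.
    return FEEDBACK.get(state, {"led": "idle_glow", "gui": "neutral_face", "sfx": None})
```

Driving every modality from one table prevents the contradictory cues (e.g. a listening LED with a thinking face) that most confuse young users.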





User testing
To shape a child-centered and globally resonant experience, we conducted over 60 qualitative beta sessions with families across India and the U.S.
This included online interviews with users aged 4–10 in the U.S., segmented into key developmental stages and in-person observational studies with children in Mumbai and Bangalore.
These sessions revealed nuanced patterns in how kids interact with Miko based on age and context. Beyond live testing, we also extracted valuable insights from post-purchase JTBD interviews with parents, customer support tickets, Mixpanel data, and Amazon reviews, allowing us to capture both in-the-moment behaviors and longer-term sentiment across diverse touchpoints.
Learnings & Takeaways
Designing for any voice-first device involves understanding voice-based triggers, system states, and the importance of multimodal feedback.
As technology continues to evolve, so too do the design principles guiding these innovations, which remain a work in progress.
By prioritizing voice interaction and optimizing user experience, we can create engaging and educational tools for children. Embracing these evolving principles will lead to even more intuitive devices that cater to young users’ needs.