
Conversation Design & Multi-Modal Feedback in a Voice-First Device

Conversation Design | Navigation | Multi-Modal Feedback

Miko is a brand of educational robots designed to engage children through interactive conversations, storytelling, and educational games.

Miko Mini, designed for children aged 7–10 years, engages them through conversation, storytelling, and educational games.
 

Team

  • UX Lead – Sharang

  • Visual Design – Dakshita

  • Product Management – Akshat Adani

  • Engineering Lead – Omkar


Table of Contents

  • Conversation IA & Components

  • Navigation & System States

  • Multi-Modal Feedback


Difference between Voice-First & Voice-Based Devices

Voice-first devices like Miko Mini prioritize voice interaction as the main way for users to initiate actions and receive responses. Other examples in this category include smart speakers like Amazon Echo and Google Nest.

In contrast, voice-based devices such as smartphones and tablets may incorporate voice functionality but also include other input methods, such as touch or visual interfaces.

Essentially, all voice-first devices are voice-based, but not all voice-based devices are considered voice-first.


Design Research


Conversation Design Flow & Components

This is a comprehensive framework for handling user interactions, providing appropriate feedback, and managing various error scenarios to maintain a smooth user experience.

[Diagram: Conversation design flow covering the listening experience, recommendations, and no-internet / backend-error handling]
User-driven triggers

User-triggered voice navigation allows users to interact with Miko Mini in various ways:
 

a. Wake-word trigger: Activates when a user says a specific phrase, such as ‘Hey Miko,’ to initiate a conversation.
 

b. Explicit voice commands: After the wake word, users issue specific commands or questions. For example, "Tell me a story" or "Play a song."
 

c. Follow-up prompts: Some systems allow continued conversation after an initial trigger, without repeating the wake word. For instance, after asking "Tell me a story," the user can follow up with "Play the story of Red Riding Hood" directly.
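To make these trigger types concrete, here is a minimal sketch of how an utterance could be routed. The wake-phrase set, the follow-up window length, and the `TriggerHandler` class are illustrative assumptions, not Miko's actual implementation.

```python
import time

WAKE_WORDS = ("hey miko",)   # assumed wake phrase
FOLLOW_UP_WINDOW_S = 8.0     # hypothetical window for wake-word-free follow-ups


class TriggerHandler:
    """Routes an utterance according to the three user-driven trigger types."""

    def __init__(self):
        self.last_interaction = float("-inf")  # time of the last completed turn

    def handle(self, utterance):
        text = utterance.lower().strip()
        now = time.monotonic()

        # a. Wake-word trigger: opens a conversation session.
        for wake in WAKE_WORDS:
            if text.startswith(wake):
                self.last_interaction = now
                # b. An explicit command may follow the wake word directly.
                command = text[len(wake):].lstrip(" ,")
                return f"command: {command}" if command else "listening"

        # c. Follow-up prompt: accepted without the wake word while the
        # window opened by the previous turn is still active.
        if now - self.last_interaction < FOLLOW_UP_WINDOW_S:
            self.last_interaction = now
            return f"command: {text}"

        return "ignored"  # no session open and no wake word heard
```

Calling `handle("Hey Miko, tell me a story")` and then `handle("Play the story of Red Riding Hood")` within the window exercises paths a, b, and c in turn.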

System-driven triggers

System-triggered voice navigation initiates actions based on predetermined conditions or events.

These conditions can include time-based events, the first user session of the day, or a system value crossing a threshold.

This automated approach can include tasks like slot-filling in an ongoing conversation, providing content notifications, executing scheduled commands, or expressing the robot's personality.
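As a sketch, such condition checks might look like the following; the threshold value, greeting line, and function signature are assumptions (the low-battery phrase is borrowed from the voice-over example later in this case study).

```python
from datetime import date, datetime

BATTERY_LOW_PCT = 20  # hypothetical threshold value (percent)


def system_triggers(battery_pct, last_session, scheduled):
    """Collects system-driven prompts from predetermined conditions.

    `last_session` is the date of the previous session (or None);
    `scheduled` is a list of (datetime, text) pairs for timed commands.
    """
    prompts = []

    # First user session of the day: greet proactively.
    if last_session is None or last_session < date.today():
        prompts.append("Good morning! Ready for today's adventure?")

    # Threshold crossing: a system value dropped below its threshold.
    if battery_pct < BATTERY_LOW_PCT:
        prompts.append("My battery is running low, please recharge me soon.")

    # Time-based events: scheduled commands whose time has arrived.
    now = datetime.now()
    prompts += [text for when, text in scheduled if when <= now]
    return prompts
```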

Error & Edge-Case Handling Framework

The framework covers various error conditions, including server unavailability, internet connectivity issues, ASR failures, and backend processing problems:
 

  • "Server Down Fallback": Handles server unavailability

  • "No Internet Fallback": Manages situations without connectivity

  • "Reprompt Fallback": Likely asks users to repeat their input

  • "Close Fallback": Gracefully ends interactions when recovery isn't possible
     

Consecutive Error Counter: Tracks repeated errors, with specific handling when three or more consecutive errors occur.
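A minimal sketch of how the four fallbacks and the counter could fit together; the error-kind strings and the reset-on-success rule are my assumptions.

```python
MAX_CONSECUTIVE_ERRORS = 3  # escalation point named in the framework


class FallbackManager:
    """Selects a fallback response and tracks consecutive errors."""

    def __init__(self):
        self.consecutive_errors = 0

    def on_success(self):
        self.consecutive_errors = 0  # a successful turn resets the counter

    def on_error(self, kind):
        self.consecutive_errors += 1

        # Three or more consecutive errors: end the interaction gracefully.
        if self.consecutive_errors >= MAX_CONSECUTIVE_ERRORS:
            return "Close Fallback"
        if kind == "server_down":
            return "Server Down Fallback"
        if kind == "no_internet":
            return "No Internet Fallback"
        return "Reprompt Fallback"  # recoverable ASR / understanding errors
```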


Below are some common edge cases that are critical for ensuring a robust and user-friendly conversational experience:
 

  • Invalid response: the user's input cannot be understood or processed by the system due to ambiguous language, incomplete information, or out-of-scope queries.

  • No response: the user does not provide any input after initiating an interaction.

  • Profanity: inappropriate or offensive language used by the user during an interaction.

  • Gibberish: nonsensical or meaningless text or speech, such as random characters, incoherent phrases, or intentionally confusing responses.
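These buckets can be separated by a cheap first-pass classifier before full language understanding runs. The heuristics below, a placeholder profanity list and a vowel-ratio gibberish check, are purely illustrative.

```python
import re

PROFANITY = {"badword"}  # placeholder lexicon, not a real word list


def classify_input(utterance):
    """Buckets an utterance into the edge cases listed above."""
    if not utterance or not utterance.strip():
        return "no_response"

    if any(token in PROFANITY for token in utterance.lower().split()):
        return "profanity"

    # Gibberish heuristic: no letters at all, or almost no vowels.
    letters = re.sub(r"[^a-z]", "", utterance.lower())
    vowels = sum(ch in "aeiou" for ch in letters)
    if not letters or vowels / len(letters) < 0.2:
        return "gibberish"

    # Anything else proceeds to NLU, where it may still prove invalid.
    return "candidate"
```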

Speech Processing Components

Speech Detection: After wake word activation, the system monitors for user speech input.
 

ASR (Automatic Speech Recognition) converts spoken language into text, with potential for errors or timeouts as shown in the error handling paths.
 

Filler Statements: Verbal responses used to maintain engagement while the system processes information, covering the wait until the backend response is ready.
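Put together, the post-wake-word path might look like the sketch below; `mic` and `asr_client` are hypothetical interfaces standing in for the real audio stack, and the timeout value is an assumption.

```python
def detect_and_transcribe(mic, asr_client):
    """Speech detection followed by ASR, with the error paths named above."""
    frames = []
    for frame in mic.stream():        # audio frames after wake-word activation
        if mic.is_speech(frame):      # VAD-style speech detection
            frames.append(frame)
        elif frames:                  # trailing silence closes the utterance
            break

    if not frames:
        return None, "no_response"    # edge case from the previous section

    try:
        # Convert speech to text, bounded so a hang becomes a timeout.
        return asr_client.transcribe(frames, timeout_s=5.0), None
    except TimeoutError:
        return None, "Reprompt Fallback"  # ASR error path of the framework
```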

Keyword spotting and response types

The system recognizes voice commands through keyword spotting. Keyword spotting operates in the background and uses NLP techniques to process user input (utterances), identifying specific words or phrases known as intents and entities.

 

Based on intent and entity identification, the system can serve the request directly, surface the recommendation system, or initiate slot filling, as sketched below.
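A toy illustration of keyword spotting with two intents; the regex grammar is a stand-in for the production NLP models, and the intent and entity names are made up.

```python
import re

# Toy grammar: each intent pattern can also capture a "title" entity.
INTENT_PATTERNS = {
    "play_story": re.compile(r"\b(?:tell|play)\b.*\bstory(?: of (?P<title>.+))?"),
    "play_song": re.compile(r"\bplay\b.*\bsong(?: (?P<title>.+))?"),
}


def spot_keywords(utterance):
    """Returns the first matching intent and any extracted entities."""
    text = utterance.lower().strip()
    for intent, pattern in INTENT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            entities = {k: v for k, v in match.groupdict().items() if v}
            return intent, entities
    return None, {}


# Entity present → serve the request directly.
print(spot_keywords("play the story of red riding hood"))
# ('play_story', {'title': 'red riding hood'})

# Entity missing → slot-fill ("Which story?") or surface a recommendation.
print(spot_keywords("tell me a story"))
# ('play_story', {})
```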

 

A more comprehensive explanation of the concept is available in Conversation systems & Maxims.

Response Management

Response Timing: The system has a 2.5-second (fixed) threshold for response delivery, with different paths for responses received within or beyond this timeframe.


Thinking Mode: Visual feedback (LED change + Eye expression change) indicating the system is processing the request.


Thinking Extended Mode: Extended feedback that covers for longer backend latency.


Recommendation System: Features lingual audio and a "Daily Adventure" push that suggest content or activities based on user interactions.
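A sketch of how this timing logic could be wired up. Only the 2.5-second threshold comes from the flow above; the `robot` facade, its method names, and the filler lines are assumptions.

```python
import concurrent.futures
import random

RESPONSE_THRESHOLD_S = 2.5  # fixed threshold from the flow above
FILLERS = ["Hmm, let me think...", "Give me a second..."]  # sample fillers


def deliver_response(backend_call, robot):
    """Runs the backend call and escalates feedback as latency grows."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(backend_call)
        try:
            # Fast path: the response arrives within the threshold.
            return future.result(timeout=RESPONSE_THRESHOLD_S)
        except concurrent.futures.TimeoutError:
            # Thinking mode: LED change + eye expression while processing.
            robot.set_led("thinking")
            robot.set_eyes("thinking")
        try:
            return future.result(timeout=RESPONSE_THRESHOLD_S)
        except concurrent.futures.TimeoutError:
            # Thinking extended mode: a filler statement covers the latency.
            robot.say(random.choice(FILLERS))
            return future.result()  # block until the backend completes
```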

Multimodal Feedback 

In any conversational experience, whether an app, robot, or game, voice is rarely used in isolation.

Effective system feedback is essential to offer users context, clarity and guidance during voice interactions.

Users benefit from knowing:

  • when they can start speaking,

  • when the system is listening,

  • when to take action based on their input,

  • and how they are informed about system errors or fallbacks.


Other interfaces are also leveraged, such as physical buttons, touch screens, sensors, images, sounds, video or physical motion.


More details in the last section


System Navigation

Since Miko Mini is a voice-first device, the primary mode of interaction is a voice trigger that transitions the system from one state to another. The user can say the wake word "Hey Miko" followed by prompts like "Play Riddles," "Increase Brightness," or "Call Parent" to navigate across functionalities.

[Diagram: Miko Mini voice information architecture]
System state overlaps

In voice-first devices, the transition between system states is not always linear; certain states can take precedence over others.

For example, when the system is in a Listening state, it may either disregard or respond to incoming calls from a parent, depending on its current operational status.

This dynamic interplay between states is crucial for ensuring effective interaction, as the system must determine how to prioritize various inputs in real-time.
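One simple way to encode this precedence is an ordered ranking in which an incoming event takes over only if it outranks the current state. The states and their relative priorities below are illustrative, not Miko's actual state table.

```python
from enum import IntEnum


class State(IntEnum):
    """Higher value = higher precedence when states overlap."""
    IDLE = 0
    SPEAKING = 1
    LISTENING = 2
    PARENT_CALL = 3  # assumed here to outrank normal conversation states


def resolve(current, incoming):
    """Non-linear transitions: preempt only when the incoming state
    outranks the current one; otherwise keep the current state."""
    return incoming if incoming > current else current


# A parent call preempts Listening, but not the other way around.
assert resolve(State.LISTENING, State.PARENT_CALL) is State.PARENT_CALL
assert resolve(State.PARENT_CALL, State.LISTENING) is State.PARENT_CALL
```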


Multi-modal feedback

Miko Mini employs multi-modal feedback mechanisms to enhance user interactions and provide a seamless experience. These mechanisms include LED states, a GUI, motion, SFX (sound effects), and voice-over.

GUI: The graphical user interface (GUI) displays intermediate states, voice skill thumbnails, and system errors. It helps users understand the robot's current status and interact effectively.


Voice-over: Voice-over provides clear auditory feedback and guidance to users. It ensures accessibility and enhances user understanding of system states or errors.

When the battery is low: “My battery is running low, please recharge me soon.”


LED States: The LED system provides visual cues based on the robot’s status or activity. Different colors and animations are used to convey specific information.

Motion: Miko Mini uses dynamic facial expressions and physical movements to make interactions more engaging. These motions complement other feedback modes.
 

SFX (Sound Effects): Sound effects are integrated to reinforce feedback and add personality to interactions. For example, a cheerful chime when a voice skill is activated.
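Conceptually, each system event maps to a bundle of channels that fire together. The mapping below is an illustrative sketch rather than Miko's shipped configuration, and `robot` is a hypothetical facade over the LED, display, speaker, and motor APIs.

```python
# Illustrative event → feedback-channel mapping (all values are assumptions).
FEEDBACK_MAP = {
    "listening": {"led": "pulsing_blue", "gui": "mic_overlay",
                  "motion": "head_tilt"},
    "skill_activated": {"led": "solid_green", "gui": "skill_thumbnail",
                        "sfx": "cheerful_chime"},
    "low_battery": {"led": "blinking_red", "gui": "battery_warning",
                    "voice": "My battery is running low, please recharge me soon."},
}


def give_feedback(event, robot):
    """Fires every channel configured for an event in one coordinated burst."""
    for channel, value in FEEDBACK_MAP.get(event, {}).items():
        getattr(robot, channel)(value)  # e.g. robot.led("pulsing_blue")
```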


Learnings & Takeaways

Designing for any voice-first device involves understanding voice-based triggers, system states, and the importance of multimodal feedback.

 

As technology continues to evolve, so too do the design principles guiding these innovations, which remain a work in progress.

 

By prioritizing voice interaction and optimizing user experience, we can create engaging and educational tools for children. Embracing these evolving principles will lead to even more intuitive devices that cater to young users’ needs.

Related resources

For further insights, check out my other two articles:
Conversation design principles: Elements of a conversation system for crafting voice experiences for chat interfaces, voice-first devices, and social robots

GPT-powered prototyping template for conversation design

Further related reading on this topic

Conversational Design by Erika Hall
Intents and entities in chatbots
10 Essential Chatbot Analytics Metrics to Track Performance


