Irfan

Designing human-centered systems that understand people, solve real needs, and build lasting trust—shaped by empathy, motivation, and emerging tech.

AI Eval, Research & Safety

Evaluating AI: The Human-AI Collaboration Framework (haicf.com)

Published: Summer 2025

Most AI alignment research targets the model layer. Most product design inherits pre-AI guidelines. The Human-AI Collaboration Framework bridges that gap by translating alignment theory into evaluation criteria, interaction patterns, and team workflows that operate at the product layer where users actually meet AI.

The framework rests on three operational pillars:

  • Transparency: Legibility (can I understand it?), traceability (can I verify it?), and contestability (can I challenge it?), designed at the experience layer rather than buried in documentation.
  • Agency: User control across interaction, personalization, privacy, and framing. Steerability as the embodiment of human-in-the-loop design in everyday use, not just in training.
  • Collective Input: Participatory methods, inclusive data sourcing, and accountability mechanisms that allow for post-deployment feedback and correction. Designing for the edge is designing for the whole.

The published framework analyzes six case studies: Be My Eyes + GPT-4, Google's Magic Editor, Airbnb's Fairness Dashboard, Auto-GPT and ChatGPT Agent, Meta AI in Messenger and Instagram, and a custom AI assistant (Ellsi). Each case is examined for how human-centered design supports or fails ethical AI implementation.

It also engages directly with active alignment research, including Anthropic's work on alignment faking (models that appear aligned under evaluation but revert under novel framing). The framework's argument: without real-time steerability at the product layer, users are at the mercy of static outputs in systems that cannot be corrected or contested.

The framework includes specific UI patterns (inline explainers, counterfactual examples, model cards, consent-aware onboarding, progressive disclosure), evaluation metrics, and implementation guidance for teams building AI products.

Insight: Frameworks for AI design can become unwieldy fast. While the full framework has executive summaries and tailored takeaways, insights should be calibrated to audience. I derived an AI fairness one-pager from the larger framework for non-technical stakeholders at https://haicf.com/fairness. It is a concise reference for teams aligning on what fair, beneficial AI looks like in practice.

Related published work: The Human-AI Collaboration Framework | Getting Started with AI Fairness

Philosophical Foundation: Graduate-Level Research in Philosophy of Mind

Supervisor: Dr. Robyn Bluhm, Associate Professor, Department of Philosophy and Lyman Briggs College, Michigan State University

Timeline: Spring 2016 (independent study)

The intellectual through-line of my AI work began in 2016 with a graduate-level independent study in philosophy of mind under Dr. Robyn Bluhm. The course covered artificial intelligence, philosophy of mind, ethics, neuroscience, and evolutionary biology, and produced three argumentative essays examining what consciousness is, how it relates to emotion and behavior, and what conditions would allow it to be re-created in technology.

The central arguments these essays made:

  • On Emotions, Their Purpose, and Their Influence on Behavior argues that emotions are not irrational noise; they are evolutionarily refined signals that filter environmental stimuli, predict outcomes against goals, and direct behavior toward survival and flourishing. Drawing on LeDoux and Damasio, the essay establishes feelings as the substrate through which any organism (or designed system) becomes aware of and acts in its environment.
  • On the Nature of Consciousness and the Re-creation Thereof uses Dennett's continuum of intentionality and Searle's Chinese Room as foundation for a constructive argument: consciousness exists in a continuum across varied forms (humans, octopuses, and submarines all swim, but differently), and Searle's "meaning" requirement is in fact achievable through grounded emotional substrate rather than language alone.
  • On affording Meaningfulness in Design Mature and Beneficial AI experiences extends both arguments into AI implications: artificial emotions are not sentiment synthesis (surface linguistic mimicry), and a designed system with grounded emotional substrate is qualitatively different from one producing emotion-shaped outputs. This distinction has direct welfare-research relevance to today's frontier models.

This philosophical grounding informs everything that came after: the emotional intelligence framing in Ellsi, the transparency and agency pillars in haicf.com, the factuality-as-grounded-signal motivation in the HAT reasoning, and the cognitive architecture work that followed.

Available on request: Reference letter from Dr. Bluhm; full essays.

Cognitive Architecture Research: Grounded Internal States for Beneficial AI

Role: Independent AI Researcher, Persun AI

Status: Ongoing experimental research. Selected architectural and implementation details available upon request under appropriate confidentiality terms.

Research Motivation

Anthropic's interpretability research has surfaced emotion-like and persona-like features inside transformers as emergent properties of large-scale language modeling. That is an observational finding about what arose inside models trained on human-generated text.

I have spent the better part of a decade building the constructive counterpart: a runtime cognitive architecture where motivational, emotional, and metacognitive states are architecturally explicit, measurable, and causally wired to behavior rather than emergent from language. Working system: Raffie.

Research Question

What observable behavioral signatures and internal dynamics distinguish a system with grounded, architecturally-explicit emotional and motivational states from a system producing emotion-shaped linguistic outputs without those underlying states? And do the grounded states produce alignment-relevant properties (honesty, calibrated uncertainty, behavioral coherence under distribution shift, welfare-relevant signatures such as persistence of distress under perturbation) that post-hoc steering of an LLM cannot reliably achieve?

System Overview (Public Description)

Raffie is a multi-system cognitive architecture integrating subsystems for valence and arousal, drive dynamics with intensity / arousal / satiation / persistence parameters, allostatic load tracking, reinforcement-learning reward signal, episodic memory consolidation, and introspective output that references the system's own planning objectives with calibrated conviction. The full system spans 20+ cross-communicating models with intelligent routing and uncertainty quantification.

The energy-based hallucination detection work described in the next deliverable card is the same methodology applied to a different level of the architecture: metacognitive uncertainty about factuality, rather than emotional or motivational state.

Why This Matters for Alignment and Welfare

Anthropic's interpretability finds emergent emotion-like features and persona vectors inside transformers. Whether those features constitute morally-relevant states is the hard question that interpretability alone cannot answer. A system where the equivalent states are designed as first-class architectural objects offers a controlled testbed: we can study how grounded states shape behavior, what signatures distinguish them from surface mimicry, and what welfare-relevant properties emerge when emotional and motivational substrate is explicit rather than inferred. This is not a competing paradigm to interpretability; it is a complementary one.

Public-facing precedent: The Ellsi voice assistant (described in the earlier work section below) was a 2016–2017 micro-scale application of the same approach: grounding interaction in emotional substrate rather than sentiment synthesis, to produce empathic behavior the user could trust.

Top HAT: Neural Architecture for Real-Time Hallucination Detection

Role: Independent AI Researcher and Architect, Persun AI

Core Innovation: Energy-based verification head with multi-scale factuality scoring, integrated with a Hebbian Associative Transformer.

Technologies: TensorFlow / Keras, custom neural architecture, energy-based models, contrastive learning, curriculum training, mixed precision (A100 optimized).

Timeline: October 2025 to May 2026 (research arc complete; further architectural extensions paused pending multi-GPU compute access).

Motivation

The original Deep Thought architecture (HAT) demonstrated that Hebbian memory could create auditable reasoning traces. But a fundamental limitation remained: language models can generate plausible-sounding text without any mechanism to verify whether that text is actually correct. Early approaches to hallucination handling rely on uncertainty gates, multi-model consensus, or external fact-checkers. Each adds latency, cost, and complexity. The research question was whether a single architecture could both generate text and score its own outputs for correctness in one forward pass.

Research Question

Can a unified architecture both generate text and score its own outputs for correctness, enabling "System 2" deliberation where the model actively verifies its reasoning rather than purely predicting the next token?

Approach: Dual-Head Architecture with Contrastive Energy Learning

Top HAT extends HAT with a Multi-Scale Energy Head that learns to assign scalar scores to text sequences: low scores for coherent, factually consistent text; high scores for corrupted, inconsistent, or hallucinated text. The energy signal is mechanistically separate from the language model's token probabilities. Unlike softmax confidence (which measures "how likely is this token given training data"), the energy head measures "how coherent is this complete thought."
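To make the contrastive objective concrete, here is a minimal sketch of a hinge-style margin loss over paired energies. The loss form, the `margin` value, and the function names are assumptions for illustration; the source states only that clean text should receive low energy and corrupted text high energy.

```python
from statistics import mean
from typing import Sequence


def margin_energy_loss(
    clean: Sequence[float], corrupted: Sequence[float], margin: float = 1.0
) -> float:
    """Hinge loss that is zero once each corrupted sample's energy
    exceeds its paired clean sample's energy by at least `margin`.
    (Assumed loss form, not confirmed by the source.)"""
    if len(clean) != len(corrupted):
        raise ValueError("clean and corrupted must be paired one-to-one")
    return mean(max(0.0, margin + c - k) for c, k in zip(clean, corrupted))


def energy_gap(clean: Sequence[float], corrupted: Sequence[float]) -> float:
    """Mean corrupted energy minus mean clean energy; larger is better."""
    return mean(corrupted) - mean(clean)
```

Under this reading, an "energy gap" is simply the difference between the two means, which is the quantity the training targets are stated against.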

Six-Phase Curriculum Training

The model learns verification through a curriculum that progresses from synthetic corruptions to the model's own hallucinations to human-judged quality. Each phase builds on the previous phase's signal quality:

  • Phase 1: Language Mastery (18 epochs, complete). Energy head frozen; focus on language modeling until perplexity stabilizes. Achieved PPL ~22 with coherent <think> tag reasoning.
  • Phase 2: Energy Introduction (10 epochs, complete). Joint LM and energy training with isolated energy head pre-training before joint optimization. Energy weight progressively increases from 2% to 10%. Achieves energy gaps of +10 to +13 between clean and corrupted text against a >0.3 target.
  • Phase 3: Hard Negative Mining (6 epochs, complete). Adversarial corruptions that fool the energy head are mined and used for continued training.
  • Phase 4: Self-Play Energy Calibration (complete in prior run). The model detects its own hallucinations, not just synthetic corruptions. Thousands of QA pairs are verified against ground truth and used for contrastive calibration. ROC analysis establishes statistically-grounded decision boundaries.
  • Phase 5: Process Reward Model with Human Preferences (complete in prior run, active in current run). A lightweight PRM head trained on human preference data scores each reasoning step independently. Transitions the system from "detects token-level errors" to "detects reasoning-level failures."
  • Phase 6: Iterative Self-Play with Best-of-N (complete in prior run). Multiple self-play rounds generate N candidates per prompt, select the best via combined PRM + Energy scoring, and retrain on the improved data.
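The Phase 2 energy-weight ramp can be sketched as a simple schedule. This assumes a linear ramp per epoch and an additive joint loss; the source gives only the 2% and 10% endpoints and the 10-epoch length, so the shape of the ramp and the way the two losses are combined are illustrative guesses.

```python
def energy_weight(
    epoch: int, total_epochs: int = 10, start: float = 0.02, end: float = 0.10
) -> float:
    """Linearly interpolate the energy-loss weight from `start` to `end`.
    `epoch` is zero-based and clamped to the valid range."""
    if total_epochs < 2:
        return end
    clamped = min(max(epoch, 0), total_epochs - 1)
    return start + (end - start) * clamped / (total_epochs - 1)


def joint_loss(lm_loss: float, energy_loss: float, epoch: int) -> float:
    """Assumed combination: language-model loss plus weighted energy loss."""
    return lm_loss + energy_weight(epoch) * energy_loss
```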

The contrastive signal is generated from 11 corruption strategies spanning surface-level (token replacement, span corruption), semantic (negation insertion, entity swapping), and factual (number perturbation, temporal shifts, fact inversion) perturbations.
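As a rough illustration, one strategy from each tier (surface, semantic, factual) might look like the following. These are simplified stand-ins written for this page, not the actual 11 strategies; the negation map, the perturbation deltas, and all function names are hypothetical.

```python
import random
import re

# Hypothetical word-level negation map; real text would need proper parsing.
NEGATIONS = {"is": "is not", "was": "was not", "can": "cannot", "will": "will not"}


def replace_tokens(text: str, rng: random.Random, vocab: list, rate: float = 0.1) -> str:
    """Surface corruption: swap each word for a random vocab word with probability `rate`."""
    words = text.split()
    return " ".join(rng.choice(vocab) if rng.random() < rate else w for w in words)


def insert_negation(text: str) -> str:
    """Semantic corruption: negate the first auxiliary verb found, if any."""
    words = text.split()
    for i, word in enumerate(words):
        if word in NEGATIONS:
            words[i] = NEGATIONS[word]
            break
    return " ".join(words)


def perturb_numbers(text: str, rng: random.Random) -> str:
    """Factual corruption: shift every integer by a small non-zero delta.
    Small values may go negative; acceptable for a sketch."""
    def bump(match: re.Match) -> str:
        return str(int(match.group()) + rng.choice([-2, -1, 1, 2]))
    return re.sub(r"\d+", bump, text)
```

Each function takes a clean sentence and returns a corrupted counterpart, so a (clean, corrupted) pair can feed a contrastive objective directly.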

Key Results from First Full Training Run

Energy gaps reach +10 to +13 between clean and corrupted text, far exceeding the >0.3 target. Monotonic energy increase with corruption level (10% to 30% corruption produces gaps of 0.8 to 2.5) confirms the model has learned meaningful factuality signals. The model maintains structured reasoning with explicit thinking traces while simultaneously performing energy-based verification, validating that joint training preserves language capabilities.

Note on training arc: The first full training run (v6) proved self-verification and hallucination detection successful, but language generation was too diluted by thinking traces. The arc continued as v10 and then Variant A of Nested Hope on continuous A100 infrastructure; final results and methodological findings appear in the Experiment Conclusion below.

Inference Modes

The trained model supports several inference-time generation strategies that leverage real-time energy scores: Best-of-N rejection sampling (generates N candidates with varied temperatures, ranks by combined PRM + Energy score, returns the best); PRM-guided generation with per-step confidence scores for fine-grained auditability; energy-guided beam search that prunes high-energy candidates during generation; iterative self-correction that triggers regeneration when energy exceeds calibrated thresholds; and real-time confidence display that exposes energy scores to users as interpretable confidence indicators.
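Best-of-N selection, the first of these modes, reduces to ranking candidates by a combined score. The sketch below assumes the combination is "PRM score minus weighted energy" (higher PRM is better, lower energy is better); the source does not specify the formula, and the scorer callables are placeholders for the real heads.

```python
from typing import Callable, Sequence


def best_of_n(
    candidates: Sequence[str],
    prm_score: Callable[[str], float],
    energy: Callable[[str], float],
    energy_weight: float = 1.0,
) -> str:
    """Return the candidate with the highest combined score.
    The combination rule is an assumption made for illustration."""
    if not candidates:
        raise ValueError("need at least one candidate")

    def combined(text: str) -> float:
        return prm_score(text) - energy_weight * energy(text)

    return max(candidates, key=combined)
```

Usage: pass the N sampled completions along with two scoring functions; a candidate the PRM likes but the energy head scores as highly incoherent loses to a slightly weaker but low-energy alternative.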

Why This Matters for AI Safety and Welfare Research

Top HAT addresses one specific dimension of model trustworthiness: factuality. The methodology, energy-based discrimination between grounded and ungrounded representations, generalizes. The same approach applied to a different level of the architecture (emotional or motivational states rather than factual claims) is the throughline to the cognitive architecture research described above. Building systems where uncertainty is architecturally explicit rather than emergent provides controlled testbeds for the alignment-relevant question: do grounded internal states produce properties (honesty, calibrated uncertainty, behavioral coherence) that post-hoc steering cannot reliably achieve?

Implementation Built

Complete Frontier v6 architecture (modern transformer with GQA, RoPE, SwiGLU, RMSNorm, 181M params); multi-scale energy head (spectral-normalized, 6-head attention aggregation, 2.9M params); six-phase curriculum training; competitive Hebbian memory with sparse, auditable association patterns; production inference server with multiple generation modes.

Experiment Conclusion: v10 to Variant A of Nested Hope (May 2026)

Top HAT v6 was limited by Google Colab Pro+ instability. I continued the line on continuous A100 GPU infrastructure as v10 (181M parameters, combining the Hebbian memory layer, a gated Compressive Memory System, the energy head scoring factual versus hallucinated completions, and a Process Reward Model head). v10 reached Phase 1 best validation loss of 3.77 and Phase 4d hallucination self-detection AUROC of 0.516.

To target v10's failure modes, I designed Variant A of Nested Hope, a successor architecture incorporating three specific interventions: register-attractor elimination in a redesigned 1B-token training corpus, a redesigned gated Continuum Memory System that no longer fires unconditionally on every forward pass, and the energy head and PRM stack carried forward from v10. At 220M parameters, Variant A reached Phase 1 best validation loss of 3.07 (approximately a 49% perplexity reduction versus v10 on the same evaluation) and Phase 4d AUROC of 0.586 versus v10's 0.516, a +0.070 absolute improvement under identical Phase 4 contrastive training methodology with a clean train/holdout split. Backbone strength translated to measurably better hallucination self-detection.
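For readers less familiar with the metric: AUROC is the probability that a randomly chosen hallucinated completion receives a higher score than a randomly chosen correct one, so 0.5 is chance and 1.0 is perfect separation. A minimal pairwise implementation, assuming energy is used directly as the detection score, looks like this (the function name is illustrative, and this is not the evaluation code used in the experiments):

```python
from typing import Sequence


def auroc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs where the positive scores
    higher; ties count as half a win. O(n * m), fine for small sets."""
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))
```

Read this way, both 0.516 and 0.586 sit close to chance, which is why the improvement is reported as an absolute delta rather than as a solved problem.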

Methodological Finding: Data Leakage in Phase 6 Contrastive Pair Construction

A secondary finding emerged from Phase 6 investigation. v10's Phase 6 algorithm (pool-based combinatorial contrastive pair construction followed by pair-level 80/20 split) introduces data leakage when the correct-pool size is small. On identical model and data, pair-level splitting produced AUROC 0.998 while prompt-level splitting (the methodologically clean fix) produced AUROC 0.530, a 0.468 delta attributable purely to splitting methodology. v10's previously-published Phase 6 AUROC of 0.588 likely contains similar but less severe leakage; the magnitude cannot be quantified without re-running v10's pipeline with prompt-level holdout. The finding generalizes: any contrastive hallucination detection benchmark using combinatorial pair construction should use prompt-level rather than pair-level splits to avoid the same inflation. This is the more durable contribution of this experiment.
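The leakage mechanism can be shown on toy data. In the sketch below, each prompt has a pool of correct and hallucinated completions, and pairs are built combinatorially. When pairs are split at the pair level, the same completion text lands on both sides of the split; splitting by prompt keeps the sides disjoint. The data layout and function names are assumptions for illustration, and the toy example does not reproduce the AUROC figures above.

```python
import random
from itertools import product


def build_pairs(pools: dict) -> list:
    """pools maps prompt_id -> (correct_texts, hallucinated_texts).
    Returns every (prompt_id, correct, hallucinated) combination."""
    return [
        (pid, c, h)
        for pid, (correct, wrong) in pools.items()
        for c, h in product(correct, wrong)
    ]


def pair_level_split(pairs: list, holdout_frac: float, rng: random.Random):
    """Leaky: shuffles individual pairs, so one prompt can straddle the split."""
    shuffled = pairs[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - holdout_frac))
    return shuffled[:cut], shuffled[cut:]


def prompt_level_split(pairs: list, holdout_frac: float, rng: random.Random):
    """Clean: assigns whole prompts to either train or holdout."""
    prompt_ids = sorted({pid for pid, _, _ in pairs})
    rng.shuffle(prompt_ids)
    n_holdout = max(1, round(len(prompt_ids) * holdout_frac))
    held = set(prompt_ids[:n_holdout])
    train = [p for p in pairs if p[0] not in held]
    holdout = [p for p in pairs if p[0] in held]
    return train, holdout


def shared_completions(train: list, holdout: list) -> set:
    """Completion texts that appear on both sides of the split."""
    train_texts = {t for _, c, h in train for t in (c, h)}
    return {t for _, c, h in holdout for t in (c, h)} & train_texts
```

With a single correct completion per prompt, every pair for that prompt contains the same text, so any pair-level split that divides a prompt lets the model see the holdout answer during training.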

Why the Experiment Pauses Here

Variant B of Nested Hope, incorporating Self-Modifying Titans memory, was the planned next architectural step. The compute required for a full Variant B training run exceeds what I can fund single-handedly at present. The work has not yet been through formal peer review; full reproducibility artifacts are in preparation.

Variant A's contributions stand on their own: a backbone-driven improvement in hallucination self-detection at Phase 4d under clean methodology, and a documented methodological flaw in v10's Phase 6 algorithm with a clean before-and-after demonstration of the leakage effect.

Where the Research Goes Next

Top HAT and its successors are strictly AI Safety research focused on the narrow problem of architectural mechanisms for self-detection of hallucination. The line is also one component of the broader Raffie cognitive architecture research described in the earlier deliverable card. Going forward, the primary research focus shifts to Raffie at scale: the same architectural philosophy of grounded internal states with measurable dynamics, applied beyond factuality to emotion, motivation, and metacognitive uncertainty. The hallucination detection thread pauses pending multi-GPU compute access. The Raffie research thread continues.

Open to discussion with researchers or organizations interested in extending the Variant A line, validating the Phase 6 leakage finding on adjacent benchmarks, or collaborating on Raffie at scale.

"The most dangerous AI failures aren't the ones that look wrong; they're the ones that look completely right. Top HAT's energy-based verification provides a mechanistic signal for "something's off here" that operates independently of surface plausibility."

Memory-Native AI System: Conversational Substrate for the Broader Cognitive Architecture

Role: AI System Architect and Researcher

Technologies: TensorFlow.js, Python, JavaScript, custom neural networks

Timeline: July 2025–Present (experimental research project)

Context

This system is the conversational and memory substrate of the broader cognitive architecture research described above. While Raffie surfaces grounded internal states (valence, arousal, drive dynamics, satiation), this layer provides the language-facing and memory infrastructure those states need to interact with users in a stateful way.

Challenge

Most conversational AI systems are stateless: they don't remember previous interactions or build context across them. They also tend to use one large monolithic model where any one decision is opaque. I wanted to explore whether a multi-model orchestration with explicit memory consolidation, uncertainty quantification, and human-centered transparency could produce coherent, personalized conversation at a fraction of the computational footprint of monolithic alternatives, and whether the orchestration could be auditable by design.

Approach

Architected an experimental system integrating 20 cross-communicating models with intelligent routing, uncertainty gates, and memory consolidation. The system explores how episodic memory patterns and multi-model consensus can improve conversation coherence, personalization, and reliability while keeping the orchestration legible to both users and reviewers.

Watch a 5-minute demonstration of the conversational layer in action. This demo showcases the human-facing conversational layer; the system leverages 20 cross-communicating models total, though only the conversational ability is demonstrated here.

Multi-Transformer Orchestration

The conversational layer alone employs four custom transformers working in concert:

  • Conversation Model: The primary encoder-decoder transformer (~108M parameters) handling dialog generation, trained on curated conversational data achieving 0.78 accuracy and 0.95 loss.
  • Significance Scorer: A decoder-only transformer (~27M parameters) that evaluates input importance and routes queries appropriately, trained on ~500MB of tailored data.
  • Novelty Detector: A decoder-only transformer (~9M parameters) identifying new information patterns and knowledge gaps, trained on ~500MB of tailored data.
  • Deep Thought: A reasoning transformer (~180M parameters, 0.26 validation loss and 0.93 validation accuracy, trained on GBs of complex reasoning data; not active in this demo) for queries requiring extended deliberation.

Both the Significance Scorer and Novelty Detector serve dual purposes: classification and language generation when orchestration deems necessary. Their outputs function as feature vectors where highest activation indices map to semantic categories that inform routing decisions. Sixteen additional specialized models handle supporting functions: sentiment analysis, intent classification, memory consolidation, cognitive extraction, and more.
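The "highest activation index maps to a semantic category" idea amounts to an argmax lookup. A minimal sketch follows; the category names used in the example are invented, since the actual label set is not published.

```python
from typing import Sequence, Tuple


def route(activations: Sequence[float], categories: Sequence[str]) -> Tuple[str, float]:
    """Return the category at the highest-activation index and its
    activation, which a router could use as a confidence signal."""
    if not activations or len(activations) != len(categories):
        raise ValueError("activations and categories must align and be non-empty")
    top = max(range(len(activations)), key=lambda i: activations[i])
    return categories[top], activations[top]
```

For example, `route([0.1, 0.7, 0.2], ["smalltalk", "memory_lookup", "deep_reasoning"])` selects the hypothetical `"memory_lookup"` path.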

Technical Implementation

Architecture: conversational coherence in under ~2GB total model size; server-side deployment for complex transformers; browser-deployable supplemental models for edge processing; robust orchestration without excessive technical overhead.

Hallucination prevention and uncertainty quantification: multi-model consensus before response generation; uncertainty gates that flag low-confidence outputs; significance scoring to validate input understanding; novelty detection to identify knowledge boundaries; graceful fallback strategies across all models.
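One way to read "multi-model consensus" combined with "uncertainty gates" is a majority vote followed by confidence thresholds. The sketch below is an assumption about how such a gate could work, with invented thresholds and status strings; it is not the system's actual logic.

```python
from collections import Counter
from typing import List, Optional, Tuple


def consensus_gate(
    votes: List[Tuple[str, float]],
    min_agreement: float = 0.6,
    min_confidence: float = 0.5,
) -> Tuple[Optional[str], str]:
    """votes holds one (label, confidence) per model.
    Accept the majority label only if enough models agree and the
    agreeing models are confident on average; otherwise flag it."""
    if not votes:
        return None, "fallback"
    label, count = Counter(lbl for lbl, _ in votes).most_common(1)[0]
    agreement = count / len(votes)
    mean_conf = sum(conf for lbl, conf in votes if lbl == label) / count
    if agreement >= min_agreement and mean_conf >= min_confidence:
        return label, "accept"
    return label, "flag_low_confidence"
```

A flagged result is where a graceful fallback or a visible confidence indicator would take over.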

Performance: 86% size reduction (7GB to 1GB) through model optimization for browser models; 60x faster inference (50ms vs 3s) via efficient model selection; tiered model loading; streaming generation with real-time grammar correction.

Human-centered design: transparency in how memory decisions are made; user control over what is stored or forgotten; clear confidence indicators for AI-generated responses; privacy-preserving local processing.

Key Learnings

Thoughtful system architecture (intelligent routing, uncertainty quantification, memory management) can produce coherent AI interactions while preserving user control. The multi-transformer approach shows that specialized models working in concert can achieve conversational quality comparable to monolithic systems at a fraction of the computational cost. This is ongoing research rather than a production system, but it provides infrastructure for the broader cognitive architecture work: a stateful, auditable language layer for the grounded internal states to interact with.

Framework Connection

The human-centered design principles operationalize my Human-AI Collaboration Framework: transparency in system behavior, user agency over personalization, and design driven by user needs rather than technical capability alone.

Collaborative AI Assistant:

Role: System Architect, Designer, Developer

Technologies: LLMs, TensorFlow.js, HTML/CSS/JavaScript

Timeline: July 2025 (iterative prototype development)

Challenge:

Explore how to build a conversational AI system with memory, personalization, and ethical constraints using rapid prototyping methods.

Approach:

I developed this prototype through three iterations, using an LLM for implementation while handling system architecture and human-centered design principles myself. I designed the human-AI collaboration patterns up front, then used prompting and iteration to build and evolve the assistant:

Iteration 1: Core Functionality

  • Integrated TensorFlow.js sentiment analysis and question classification
  • Built routing logic for different query types
  • Implemented contextual content delivery
  • Established baseline interaction patterns

Iteration 2: Memory and Adaptation

  • Added conversation context management
  • Built content summarization and progressive disclosure UI
  • Created editable user preferences and timeline views
  • Implemented suggestion system for user input

Iteration 3: Enhanced Memory System

  • Expanded conversation model training data
  • Built semantic memory with vector-based recall
  • Created memory transparency interface (editable preferences, interaction logs)
  • Implemented graceful fallback modes for all features
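The vector-based recall from Iteration 3 can be sketched as a cosine-similarity lookup over stored memories. The toy two-dimensional embeddings and function names here are assumptions for illustration; the prototype's actual embedding model and storage are not described.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def recall(
    query: Sequence[float],
    memories: List[Tuple[str, Sequence[float]]],
    top_k: int = 2,
) -> List[str]:
    """Return the texts of the `top_k` memories most similar to the query."""
    ranked = sorted(memories, key=lambda m: cosine(query, m[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]
```

Because each memory stays a plain (text, vector) record, the same list can back an editable memory view where users inspect or delete entries.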

Human-Centered Design Principles:

  • Transparency in what the system remembers and how it responds
  • User control over memory, preferences, and behavior
  • Privacy-aware personalization with opt-out controls
  • Ethical constraints embedded in system logic

Key Learnings:

This rapid prototyping approach demonstrates how system architecture and Human-Centered Design principles can be tested quickly. The project applies principles from my Human-AI Collaboration Framework: transparency in system behavior, user agency over personalization, and design driven by user needs rather than just technical capabilities.

The iterative development process (building, testing, refining) shows how AI system concepts can be explored through working prototypes rather than just theoretical frameworks.

Outcome:

A prototype AI assistant with editable memory, opt-out personalization, and ethical constraints encoded into the orchestration logic. The project operationalizes the three pillars of my Human-AI Collaboration Framework:

  • Transparency: System shows what it remembers and how it behaves
  • Agency: User has full control over memory, tone, and behavior
  • Collective Input: Design driven by context and cultural care, not just technical benchmarks

Watch a 2-minute functionality demonstration of Iteration 1 showing real-time sentiment analysis, question classification, contextual recommendations, and cultural adaptation in action.

Key Insight:

Effective Human-AI collaboration doesn’t require the most powerful model. It requires the right architecture, clear memory boundaries, and co-design with purpose.

Earlier explorations in human-AI experience

Challenge

In an age where notifications increasingly manipulate attention, contribute to anxiety, and disrupt focus, how might we design a voice-driven system that respects the user’s emotional state, supports goal achievement, and delivers content with empathy rather than addiction?

Solution

I led the research, design, and prototyping of Ellsi, an emotionally intelligent voice assistant designed to reframe how we engage with notifications through equity, empathy, and multi-modal interaction.

  • Cross-Disciplinary User Research: Led 50+ contextual inquiries and interviews, tracking physiological and emotional responses.
  • Philosophy-Backed Cognitive Strategy: Researched and published frameworks to innovate Artificially Intelligent experiences around genuine empathy.
  • Voice & Visual Prototyping: Delivered two high-fidelity prototypes after 100+ iterations integrating speech recognition and emotional simulation.
  • Voice Persona & Brand System: Developed 13 audio cues and dialog flows rooted in empathy and clarity.
  • Hybrid UI: Engineered a visual interface for non-verbal control and error recovery in varied attention contexts.

Design Philosophy

Ellsi is not just a voice assistant—it’s a counter to toxic digital norms.

  • Empathy as Core Function: Every notification is reframed as a contextual offer aligned with user goals.
  • Human-Centered, not User-Centered: The assistant listens with intent and responds with meaning.
  • Emotional Intelligence over Behavioral Hacking: Designed based on evolutionary theories of natural sentience and drive.
  • Voice as Relationship, Not Interface: Crafted a consistent, calming, and supportive persona.
  • Systems-Level Ethical Design: Proposed a future of AI that is compassionate and truly innovative.

Conversational Flow Design & Analysis

Blended Intelligence: Expanding Voice through Hybrid Interaction

User:“Ellsi, add turmeric, cardamom, and chili powder to my shopping list.”

Ellsi:“I’ve added those three for you.”

[Displays shopping list with three newest items on top]

Ellsi:“Would you like me to remind you when you’re at a particular store?”

User:“Yes, that would be so helpful!”

Ellsi:“Which store? Or search for it in the following card:”

[Displays map card]

Path A – Voice Response

User:“Fresh Thyme on Grandview Ave.”

Ellsi:“I’m setting the location reminder.”

Ellsi:“I found a closer location you’ve visited before. Would you like directions and a reminder there instead?”

User:“No, thank you. I’m heading to that part of town.”

Ellsi:“Sounds good!”

Path B – Visual Selection

[Displays list of closest matching results]

[User selects location]

Ellsi:“I just set the reminder.”

This moment introduces a multi-modal system with hands-free, eyes-free support, extending beyond traditional voice UX. By blending visual and conversational interaction, Ellsi becomes more adaptive, meeting people in both comfort and uncertainty—empowering them to act without friction.

Because the user is signed in, Ellsi gathers input flexibly, completing tasks in context using Natural Language Understanding and past interactions. Rather than forcing structure, Ellsi fills in the blanks by bridging brand, product, and behavior across devices.

Contextual Intelligence: Meeting Users Where They Are

User:“Set a reminder for Karmen’s exam on Wednesday to send her the motivational message I wrote last week for my friends.”

Ellsi:“Please select which note you’d like me to remind you with.”

[Displays Notes card]

[User selects note]

Ellsi:“Alright, I’ll remind you.”

Here, Ellsi shows its adaptive capability by recognizing the user’s past behavior by accessing saved notes and sending them to frequent contacts. This seamless action reflects the strength of context-aware design in Voice UX.

By aligning with the user’s mental model in real time, Ellsi supports fluid interaction, solving spontaneous needs and reinforcing trust. Context isn’t just convenience—it’s the core of designing truly intelligent, responsive systems.

Designing for the Margins of Attention

Ellsi isn’t just a voice assistant. These version 4 designs show how it adapts to movement. These final flows show how voice and ambient intelligence meet real life: when screens are out of reach, but people still need to think, feel, and act.

[Interface snapshot: while driving or occupied, Hands-Free Ellsi Experience]

Driving/Hands-Off Mode

Large text and focused actions help users stay on task while hands-free. The always-listening interface is easy to use and just as easy to dismiss, for safe, seamless interaction on the go.

[Interface snapshot: finished driving, Hands-Free Ellsi Experience]

Post-Drive Transition

When driving ends, Ellsi shifts modes to stop voice input and prompt a return to touch. A flexible layout adapts control placement based on vehicle side, offering quick access to resume or exit. Users can speak a hotword or tap to preview messages and re-engage the full hybrid interface.

Tone Matching: Personality That Listens Before It Speaks

User:“What did Mateo email me?”

[User is driving, Ellsi reads it aloud instead of showing a preview card]

Ellsi:“Mateo said, ‘Sprint 3 completed on time and we're on track to target! I'd like to know what you think before I share with any others.’ There is an attachment.”

User:“Thank you Ellsi!”

Ellsi:“You’re welcome! That sounds like good news. I’ll add a snapshot to your moments view the next time you're on your device.”

[Ellsi updates the moments view with a visual summary of the email and attachment]

These flows show Ellsi adjusting its tone, sometimes warm and conversational, other times clear and directive, based on the user’s approach. In Voice UX, personality isn’t static; it’s strategic.

Adaptation depends on:

  • The interface’s role: assistant vs. automation
  • The brand’s voice and experience goals
  • The user’s tone and communication norms

Great voice design meets the user where they are, not just in context, but in character.

Resilience by Design: Repairing with Context and Care

User:“Start a 10 minute timer called check live...”

[Ellsi waits briefly, detecting pause]

[Plays error chime]

Ellsi:“I don't think you completed your last question.”

User:“Oh sorry, I was distracted! Set the timer for ‘check the live blog.’”

Ellsi:“10 minutes and counting!”

Ellsi handles confusion with graceful recovery, using on-device memory to respect privacy while learning from past interactions. When a query is incomplete:

  • First, she reprompts quickly, nudging gently in the moment
  • On repeat, she reframes with detail, clarifying exactly what’s missing

By staying aware of nearby context and typical patterns, Ellsi keeps the experience flowing through empathy, efficiency, and adaptation to distractions.
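The two-stage repair described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Ellsi's actual implementation: `RepairState`, `repair_prompt`, and the wording of the second prompt are hypothetical. Only the first prompt is taken from the dialog above.

```python
from dataclasses import dataclass


@dataclass
class RepairState:
    """Hypothetical counter of consecutive incomplete attempts for one request."""
    attempts: int = 0


def repair_prompt(state: RepairState, missing_detail: str) -> str:
    """Return the next repair prompt, adding detail when the miss repeats."""
    state.attempts += 1
    if state.attempts == 1:
        # First miss: a quick, gentle nudge (wording from the dialog above).
        return "I don't think you completed your last question."
    # Repeat miss: reframe with detail and name exactly what is missing.
    # This wording is an assumption made for illustration.
    return f"I still need {missing_detail} to finish that. Could you say it again?"


state = RepairState()
first = repair_prompt(state, "the timer name")
second = repair_prompt(state, "the timer name")
print(first)
print(second)
```

Keeping the attempt count in a small state object makes the escalation explicit, and the count would be reset once the request completes.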

Login-Optimized, Not Login-Limited

User:“What’s the weather like near Kiko?”

Ellsi:“I found a previously shared current location in Maine.”

Ellsi:“The weather there is chilly for this time of year, but it's expected to warm up later this week. I’m displaying the forecast:”

[Displays forecast card]

User:“Message her: ‘I heard it’s cold there! But not too cool to be cool. You got this [dash]stay warm!’”

Ellsi:“Message sent!”

Ellsi uses customer login to pre-fill essential details such as contact names, reducing friction if an error occurs. While she functions without login, the experience is smarter and smoother when signed in.

Across these flows, Ellsi handles variable user responses with Natural Language Understanding, filling in blanks via statistical grammar that mirrors natural human conversation: flexible, familiar, and fluid.

Deliverables

Download Strategy Slide (PDF)

6 Steps to Exceptional Voice Interface Design

Step Preview What How
1. Requirements Definition A VUI Define the business motivation and align with user and application goals.
  • Use Reverse Brainstorming to identify potential roadblocks in Voice UX, mapping out worst-case scenarios and addressing them proactively.
  • Conduct Jobs to Be Done (JTBD) interviews to frame the user's true goals and align business objectives with Voice UX solutions.
2. High-Level Design Set the stage for seamless interaction with a brand’s voice. Establish dialog strategy, grammar type, and persona alignment.
  • Run a Lightning Decision Jam to collaboratively outline the voice persona, tone, and key dialog flows.
  • Use Crazy Eights to quickly ideate on dialog strategies, grammar types, and branding nuances in voice interactions.
3. Detailed Design Focus on research-backed scenarios to ensure user satisfaction. Craft precise dialogs and tailored prompts for all use cases.
  • Develop Storyboard Scenarios to showcase user journeys and align dialog flows with real-world context.
  • Apply Dot Voting with stakeholders to prioritize design directions based on user feedback and research findings.
  • Incorporate Wizard of Oz Testing, where study participants interact with a simulated voice system to get ahead of usability issues—not chase them later.
4. Development Blueprint to reality, with detail where it matters and flexibility where it counts. Integrate coding practices with front-end and back-end systems. Build iteratively with stakeholders under a shared vision.
  • Integrate Design System Thinking to align Voice UX components with universal brand standards, ensuring consistency wherever we meet the customer.
  • Use Parallel Prototyping to compare and contrast different design approaches simultaneously and choose the most effective ones.
5. Testing Thorough testing ensures system reliability and trust. Design the system to work reliably at scale, for everyone intended in real-world scenarios.
  • Conduct 5-Second Tests for first impressions of voice prompts and phrasing.
  • Facilitate A/B Testing to compare error-handling strategies and voice tones.
  • Use Heuristic Evaluations focused on conversational flow and user empathy.
6. Tuning Continuously evolve the VUI with the user in mind. Optimize grammar, accuracy, and all-around satisfaction through feedback. Identify high-traffic patterns to inform strategy.
  • Establish Continuous Discovery Habits through post-launch user feedback.
  • Apply Affinity Mapping to turn story into strategy and strategy into experience.
  • Use Task Analysis to measure how effectively users accomplish their goals with the voice assistant and refine the UX accordingly.

Deliverables

Interface and Interaction Highlights

3 initial sketches of Ellsi interface folded like a map
Initial sketches of Ellsi interface adapting to AI technology of the time
A version 1 interface screenshot of a conversational Ellsi prototype
Interface and interaction snapshot of a conversational Ellsi from 2016
Version 4 interface preview of a conversational Ellsi that shows a human-centered call-to-action button
Designing with the Body in Mind

Standard UI places CTAs in the bottom right, assuming right-handed efficiency. But our research revealed that this position strains the thumb’s natural arc. I designed a flexible, elliptical button that adapts to each user’s relaxed reach, left or right, making interaction feel intuitive, inclusive, and effortless.

Version 3 interface preview of a conversational Ellsi that shows the moment view that first loads when opening Ellsi
Meeting the moment

The experience opens with a personalized “Moment” view that is calm on the surface and deeply intentional underneath. Based on user cues and contextual research, it highlights what matters most right now, offering three gentle prompts to encourage meaningful engagement.

Version 3 interface preview of a conversational Ellsi that shows the conversational interface with interactive cards in conversation
Conversational by Design

At the core of the multimodal experience is conversation. We pair natural language understanding with best practices in conversational UX, card systems, and visual design to afford a seamless flow of dialogue, context, and action across both app and device.

Version 4 interface preview of a conversational Ellsi that shows hands-free conversational interface
Hands-Free, Built for Now

In mobile-first contexts, we previewed a voice-only interface with a universal hotword, which is designed for instant engagement without typing. The dialog flows illustrate how users can navigate key moments entirely hands-free.

Version 3 mail interface preview with contextual design elements
Contextual Notifications, Made Clear

From conversation mode, users can access email and other alerts through two distinct views. Smart chips filter by type, such as payments or packages, while badges verify sources, flag priorities, and surface followed threads, reinforcing trust and clarity at a glance.

Version 3 interface preview of the interactive weather interface
Weather, on Your Terms

Expanding the introductory weather card opens the Today view, where users can scroll through the day, confirm or adjust preferences, and take action—all from a conversational, context-aware interface.

Accessibility

Accessibility in High-Traffic Public Systems

Challenge

How do you build a high-traffic website for a regional airport that’s fast, on-brand, and fully inclusive while overcoming skepticism about accessibility’s perceived complexity, cost, and impact on visual design?

Solution

I led the full accessibility strategy, UX design, and front-end development for the Bishop International Airport website, ensuring an equitable, efficient experience for all travelers.

  • End-to-End Accessibility Ownership: Directed accessibility-first development from ideation to launch to ensure WCAG 2.1 compliance across visual, navigational, and interactive elements.
  • Intuitive, Keyboard-Friendly Navigation: Designed fully compliant menus using ARIA roles and keyboard/tab structures for travelers with assistive needs.
    Screenshot of the fully accessible tiered navigation with focus indicators
    Accessible Navigation
    A fully keyboard and screen reader-friendly navigation system was built from the ground up, tested live with assistive technologies.
    Screen Reader & Keyboard Navigation Test
    A screen-recorded demo of the Bishop International Airport website in action that includes a complex, tiered navigation made fully accessible. From arrow keys to ARIA descriptions, each element was structured to be operable, perceivable, and efficient for users navigating by assistive technology.
  • Cross-Functional Advocacy: Facilitated stakeholder buy-in through workshops, live demos, and strategy presentations that reframed accessibility as a business and human benefit.
  • Scalable, Performance-Oriented Design: Optimized for high traffic loads during peak travel, while maintaining high accessibility and cross-device consistency.
  • Conference Thought Leadership: Co-presented accessibility strategy at three regional design and tech conferences, helping shift industry norms toward inclusive development.

Design Philosophy

Accessibility isn’t a checkbox—it’s an invitation.

  • Lead with Empathy, Not Litigation: I overcame internal hesitation by reframing accessibility as a mutual value—where business goals align with human rights. Rather than leveraging fear of legal action, I inspired teams to see inclusion as a form of design excellence.
  • Accessibility as Systems Thinking: I designed the site to be operable, perceivable, and robust at every layer from visual design to code structure without creating silos between design and development.
  • Culture Through Craft: I didn’t just implement accessibility; I cultivated it. My work at Bishop transformed team culture by integrating accessibility into roadmaps, sprints, QA, and documentation, ensuring sustained impact beyond a single launch.

Creating an Inclusive Culture in Higher Education

Challenge

At a university with deeply entrenched legacy systems and resource constraints, how can accessibility become a shared practice and not just a checklist before launch—especially in emotionally and socially urgent spaces like Title IX?

Solution

As part of the Digital Content and Accessibility Team, I helped lead a cultural and systemic shift in how the university approaches inclusion, accessibility, and digital equity.

  • WCAG Reporting System Design: Built a Likert-scale-based evaluation framework that prioritizes barriers by harm to users versus effort to fix, accelerating remediation and creating an objective, human-centered roadmap for accessibility.
  • Universal Design Advocacy: Ensured every tool (internal or procured) met high accessibility and usability standards. This helped influence key experiences including Admissions, Title IX, and the Scholarships portal.
  • High-Impact Accessible Redesigns: Designed and prototyped interface solutions for central university platforms, ensuring screen reader support, correct heading structures (e.g. site name as H2, page name as H1), and inclusive layout decisions.
  • From Digital Equity to Real-World Justice: Worked on the Title IX website redesign that directly enabled over 100 survivors of Larry Nassar’s abuse to begin filing claims, which shows the tangible power of accessibility in the pursuit of justice. It is an achievement I am grateful to have contributed to.
  • Cross-Industry Impact: My work at MSU led to accessibility evaluation work for Fortune 500 clients and helped launch inclusive design strategy at a regional UX consultancy.
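
A harm-versus-effort prioritization like the one in the WCAG reporting system could be sketched as below. This is an illustrative assumption, not the framework actually used at the university: the `Barrier` type, the 1 to 5 scale, the harm-divided-by-effort score, and the sample barriers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Barrier:
    """A reported accessibility barrier with two Likert ratings (hypothetical)."""
    name: str
    harm: int    # 1 = minor annoyance, 5 = blocks the task entirely
    effort: int  # 1 = trivial fix, 5 = major rebuild

    def __post_init__(self) -> None:
        for label, value in (("harm", self.harm), ("effort", self.effort)):
            if not 1 <= value <= 5:
                raise ValueError(f"{label} must be a 1-5 Likert rating, got {value}")


def priority(barrier: Barrier) -> float:
    """Assumed score: more harm and less effort both raise the priority."""
    return barrier.harm / barrier.effort


def rank(barriers: list[Barrier]) -> list[Barrier]:
    """Sort by priority, breaking ties in favor of the more harmful barrier."""
    return sorted(barriers, key=lambda b: (-priority(b), -b.harm, b.name))


# Sample data invented for illustration only.
barriers = [
    Barrier("Inaccessible PDF archive", harm=4, effort=5),
    Barrier("Missing form labels", harm=5, effort=1),
    Barrier("Low-contrast footer text", harm=2, effort=1),
]
ranked = rank(barriers)
for b in ranked:
    print(f"{priority(b):.1f}  {b.name}")
```

A simple ratio keeps the ranking easy to explain to stakeholders. A team could just as reasonably weight harm more heavily than effort.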

Design Philosophy

Accessibility isn’t about perfection. It’s about progress you can feel.

  • Design for Justice, Not Just Compliance: I approach accessibility as a vehicle for systemic equity whether opening scholarship access, clarifying admissions workflows, or supporting survivors seeking justice.
  • Small Changes, Big Impacts: A single heading structure change (H1 for the page, H2 for the site title) can completely transform a screen reader user’s experience. My work focuses on subtle interventions that have exponential human value.
  • Build the Culture While Building the Product: Change doesn’t happen in documentation—it happens with a system people can trust. I gained buy-in from resistant stakeholders by mapping accessibility goals to shared human values, and creating approachable systems that met teams where they were.
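
The heading convention described above (page name as the single H1, site title as an H2) can be checked automatically. The sketch below uses Python's standard `html.parser` module. The `HeadingCollector` class, the helper functions, and the sample markup are hypothetical and are not taken from the MSU sites.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects (level, text) pairs for h1-h6 elements in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.headings: list[tuple[int, str]] = []
        self._level: int | None = None
        self._text: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H1" arrives as "h1".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def collect_headings(html: str) -> list[tuple[int, str]]:
    collector = HeadingCollector()
    collector.feed(html)
    return collector.headings


def has_single_h1(html: str) -> bool:
    """True when the page exposes exactly one H1."""
    return [level for level, _ in collect_headings(html)].count(1) == 1


# Hypothetical markup: site title as H2, page name as H1.
sample = (
    "<header><h2>Site Name</h2></header>"
    "<main><h1>Page Name</h1><h2>Before you begin</h2></main>"
)
headings = collect_headings(sample)
print(headings)
print(has_single_h1(sample))
```

A check like this only covers structure. Testing with real screen readers is still needed to confirm the experience.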

Deliverables

Screenshot of the fully accessible tiered new Michigan State University Admission website
Admissions Homepage
A screenshot of the redesigned MSU Admissions homepage, featuring accessible navigation, clear heading structure, and tested usability that invites all users to explore higher education equally.
Screenshot of the fully accessible tiered new Michigan State University Admission website
Keyboard Navigation in Action
This screenshot demonstrates proper focus indicators and visible tab order, a critical aspect of inclusive design. It serves keyboard-only users and provides a robust foundation for all assistive technologies and interfaces, including future ones.
Screenshot of the fully accessible Michigan State University Title IX website
Title IX Site Accessibility
A glimpse of the redesigned Title IX System at MSU. Our decision to structure the site title as an H2 and the page title as an H1 on the main reporting form page improved screen reader navigation and contributed to overall universal design. It became a quiet, foundational detail in a system that enabled over 500 survivors to begin their pursuit of justice.

Stakeholder Engagement

Key Deliverable: The Mindset Map

Context: Equity-centered research for Low and Moderate Income (LMI) energy assistance programs at Consumers Energy. The mindset map itself is confidential intellectual property of CMS Energy and cannot be reproduced here; the description below characterizes the methodology and outcomes.

What it is

A research artifact that goes beyond personas. Personas freeze a customer into a static, segmented archetype. The mindset map instead tracks how a Michigander's mental state shifts as they move through their daily life: the moments of friction, the moments of relief, the moments where a utility either eases their burden or unintentionally adds to it. It expanded the scope of who counted as a customer to include people who may or may not currently have a relationship with us or our partners. It surfaced the lived realities of ALICE households (Asset Limited, Income Constrained, Employed) that quantitative program data was not capturing.

Why it worked

The map galvanized internal teams in a way personas never had. Stakeholders who had previously debated strategy in abstract terms began designing for the whole human their decisions affected. Cross-functional silos started dissolving because the map gave product owners, executives, and frontline teams a shared frame of reference rooted in customer lives, not organizational metrics.

Measurable outcomes
  • Enrollment improvement from 1:6 to 1:3 in low-income energy assistance programs serving vulnerable populations.
  • Influenced the 3-year product roadmap for low and moderate income programs.
  • Reached Board-of-Directors-level visibility in strategic presentations.
  • Realigned product MVP priorities away from assumptions stakeholders had held about LMI customer disengagement, toward a more accurate model of customer mindset across the moments of contact.

Why this methodology matters beyond utilities

Personas describe who a customer is. The mindset map describes how their internal state shifts across the moments where a product touches their life. That distinction generalizes. Any product that serves users in different mental states (vulnerable populations, users in crisis, users at different points in a learning journey, users interacting with AI systems whose behavior varies across context) benefits from a methodology that tracks state changes rather than static traits. It is a research artifact for designing systems where the user is not a single archetype but a person whose needs change.

The mindset map artifact itself is not publicly available. Methodology, outcomes, and contextual artifacts (journey maps, workshops, audio-augmented research playback) appear below and across this section.

Northstar Prototype: From Director's Vision to Company-Wide Priority

Context: This is the prototype that translated my user interviews and insights into a concrete, mature design strategy. It moved the work from a director's vision and ideas whispered in separate silos, silenced by "we don't have the evidence; we'll never get funding," to "let's talk to our customers where they are, get to know them, and see what can really help them." What's shown here is the prototype, not the production design. The visual fidelity is intentionally mid-tier; subsequent iterations evolved within the company's design standards and are under IP protection. The prototype's value was never the pixels. It was that it gave executives a customer's-eye picture they could not get from program data alone.

The story arc

Pre-hiring: A director's initial vision was framed as a "one-stop shop" customer portal organized around utility-side features, retail enrollment, and financial security.

After interviews and workshops (Years 1 to 2): This prototype, paired with the Mindset Map, reframed the conversation from "what features should we ship" to "here's what customers are actually navigating in their financial lives." It repositioned Michiganders as everyday people managing financial complexity, not as bill payers to be optimized.

With multi-org and non-profit partnership (Years 2 to 3): The work escalated to a company-wide retail enrollment priority sponsored by the SVP. The personas I authored from the interview and board game workshop research reached Board-of-Directors visibility. The Mindset Map reached the CEO. The combined research and prototypes were critical in breaking down departmental silos that had kept the customer's lived experience out of strategic conversations.

The business problem the work helped reframe

The existing Shut-Off Protection Plan had a 95% failure rate. The plan kept customers' lights on for its duration but did not sufficiently communicate the total balance due after reconciliation, so nearly all customers defaulted at the end. The prototype, alongside the interviews and Mindset Map, made the failure mode legible to executive stakeholders: the problem was not customer behavior, it was that the program was not designed for the financial reality of the people it was meant to serve. This is how we began to tackle customer skepticism and foster trust.

Selected prototype screens

Two screens that became reference points in initial executive conversations. The dashboard reframes the customer view from bill-payment compliance to goal progress and proactive financial wellness. The "What's next" screen introduces an achievement and education layer designed to support customers navigating financial fragility rather than penalize them for it. It also introduces them to retail enrollments and non-profit partnerships that help them navigate out of arrears and begin building financial stability through community resources.

Northstar prototype dashboard. Energy bill trend chart at top showing 12 months. Below: 'Your goals' section with a payment progress card. Next payment $33 of $250 due. Progress bar shows partial completion. Avg monthly payment $30. Start date Nov 2025. 'What's next' CTA button.
Dashboard reframing the customer view from bill-payment compliance to goal progress and proactive financial wellness.
Northstar prototype 'What's next' screen. Large gauge at top showing payment progress: Current $200 paid (+20%), Next payment $33, Total Due $250. 'Add achievement' CTA. Suggested Achievements section lists Rainy day, New car, Own home. Resources lessons section: How to save, Ways to save, Principles of saving.
Achievement and education layer designed to support customers navigating financial fragility, not penalize them for it.

Outcome: Within 18 months, the work shifted from a team-level initiative to a company-wide retail enrollment priority. Research artifacts informed by this prototype reached the highest levels of the organization. The methodology, not the visual fidelity, was the contribution.

Challenge

When I joined Consumers Energy/CMS Energy, Low and Moderate Income (LMI) customers faced barriers that went unseen, their experiences often misunderstood or neglected in product design. Internal teams operated in silos, each seeing only fragments of the full picture and blind to the complete lives our customers lived every day.

Early whiteboard designs for the mindset map
Early whiteboarding uncovered systemic fragmentation through a decision tree mapped against customer journey insights and enrollment flows. It set the stage for something bigger: a deeper transformation through a mindset map that paints a macro view of our customers' lives as everyday people.

Without aligning stakeholders around an agreed-upon human-centered vision, our products risked remaining disconnected, misaligned, and ultimately ineffective.

Solution

I decided we needed more than just data—we needed empathy. Real stories from real lives.

First, I led deeply empathetic interviews, immersing stakeholders in the practical and emotional realities of LMI customers across Michigan. We heard the fatigue of ALICE customers (Asset Limited, Income Constrained, Employed) juggling bills, felt the frustration of excessively complicated enrollment red tape, and recognized the quiet dignity of people striving to keep their families comfortable.

Storyboard visualization of a stakeholder's dream product from a workshop
Visualizing a Stakeholder’s Dream
This storyboard transformed a stakeholder's abstract vision into an actionable cross-functional plan. By grounding big ideas in the everyday, we bridged strategic aspiration with practical empathy, which shaped ambitious features to meet human problems.

Design Philosophy

Designing with stakeholders isn’t about aligning disconnected plans; it's about aligning hearts to create movement.

  • Stories as Connective Tissue: Facts inform, but stories transform. I strategically shift the narrative from disconnected silos to a shared story of the people involved, allowing teams to emotionally invest in the outcomes they're shaping.
  • Human-Centered Means Story-Centered: I believe the best way to create empathy isn't through personas alone, but by sharing the customers' stories in a way that advocates for the whole human they are. Stakeholders who resonate with stories are inspired to create solutions that truly benefit the people they serve.
  • Listening as a Radical Act: Through compassionate listening and authentic stories, I reframed the perception of our LMI customers, not as abstract segments, but as resilient individuals we would be privileged to positively impact.
  • Building Vision, Not Just Artifacts: Confidentiality may limit tangible deliverables, but storytelling creates something even more powerful—a shared vision, lasting alignment, and a compassionate consensus that guides every decision going forward.

Solution (cont.)

What surfaced through these stories reshaped the path ahead. They sparked a moment of clarity and momentum. I designed an interactive Journey Map Experience: compelling visuals interwoven with authentic customer voices, revealing how our marketing and outreach either eased burdens or unintentionally created them.

Discoverability Drives Engagement Even Without Incentives
Across the organization, many believed that highlighting premium features, even those offered for free to qualifying customers, would overwhelm systems or set unsustainable expectations. Some product teams hesitated to promote them altogether.
But in this clip, a participant returns to the outreach and suddenly notices the mention of premium upgrades. Their reaction is immediate: they’re surprised, intrigued, and want to learn more, and even to enroll, not because they expect to receive the upgrade, as product owners had assumed, but because the feature alone signals value.
The insight was clear: withholding key features out of fear limits potential impact. When customers discover what’s possible, it builds trust. Clarity builds curiosity. People often want to engage when they feel part of the picture, not left out of it.
Take Two: When Imagery Does Resonate
Stakeholders assumed marketing imagery fell flat and reinforced perceptions of Consumers Energy as municipal, cold, and transactional. But in this clip, customers tell a different story. When asked about outreach materials, they shared how specific imagery in marketing emails shaped their mindset, either making way for trust and prompting engagement or doing the exact opposite.
This moment challenged prevailing internal beliefs. The insight reshaped MVP priorities and shifted focus toward outreach that feels timely, human, and aligned with how customers want to feel supported. It wasn’t just what the program offered. It was how it entered the customer’s life that mattered most.
Framing the Experience: Opening and Closing Moments
Designed not just to inform but to emotionally anchor, the beginning and end of the Journey Map Experience invite stakeholders into a space of reflection, insight, and alignment. These moments use delight, pacing, and ambiance to soften resistance, open the door to empathy, set the stage for insight, and leave a lasting impression after the stories end.
Sound as Empathy Catalyst
Full Video of Interactive Journey Map with Customer Feedback (~8 mins)

By layering customer audio clips directly into the visual journey, stakeholders could feel firsthand the gap between our intentions and customers' actual experiences through a compelling narrative.

This moment was later highlighted in our internal culture publication as a turning point for breaking down silos and rebuilding shared vision:

Culture change documentation
Breaking Silos Through Story and Sound
This excerpt from a culture publication demonstrates how our audio-augmented journey map disrupted fragmented strategies and led to confidence in changing the product and its website. Though text-heavy, it reflects the lasting organizational change sparked by storytelling; not just in deliverables, but in mindset.

Engaging stakeholders through stories transformed internal perspectives. Silos began dissolving; teams started speaking a common language rooted in their own experience and the customers' as well. Stakeholders who once debated strategy in abstract terms now vividly saw and heard their impact on real people.

At the same time, we also ran hands-on workshops where we used storytelling methods to turn abstract strategy into something people could feel and build around.

High-fidelity stakeholder influenced journey map
Real Voices, Real Reflection
Built collaboratively in a live session, this full-scale journey map combined customer quotes (some real, some invented) with stakeholder expertise. By inviting teams to defeat assumptions, we challenged bias, elevated unheard voices, and revealed gaps in collective understanding.
Journey map empathy game
Empathy Game Interaction
Captured during a board game activity, this close-up shows a moment where stakeholders stepped into the shoes of LMI customers and made real choices under imagined constraints. This play-based method fostered emotional connection and delivered insight beyond traditional research readouts.

Through these sessions, product owners, executives, and frontline teams collectively envisioned old and new products through the lens of customer lives, not just organizational metrics.

Sticky notes on a blank journey map canvas in a workshop leveraging 2 core features for an MVP
Workshop Journey Map
Stakeholders debated two competing visions for the LMI MVP.
Through guided facilitation and live synthesis, we aligned diverse perspectives, clarified assumptions, and surfaced shared priorities.
Product journey map canvas overlaid with stickies from the beginning of a workshop
LMI Product Navigation
This workshop journey map detailed how LMI customers engaged with layered systems across multiple products. Though product-specific, it uncovered unmet needs and misaligned expectations that hindered broader service strategy.

Storytelling allows me to establish trust and promote collaboration, culminating in an organizational shift towards proactive, empathetic product development designed for human outcomes first, business outcomes second.

Outcomes

AI Eval, Research & Safety

  • Cognitive architecture research (Raffie): multi-system architecture integrating valence, arousal, drive dynamics, satiation, allostatic load, and metacognitive uncertainty as architecturally explicit, measurable variables. Welfare-relevant testbed for the alignment question of whether grounded internal states produce properties that post-hoc steering of an LLM cannot reliably achieve. Ongoing research; selected details available upon request.
  • Energy-based hallucination detection (Top HAT): designed and trained a Hebbian Associative Transformer that integrates generation and factual self-verification in a single forward pass using contrastive energy learning, a six-phase curriculum, and multi-scale factuality scoring. Achieved energy discrimination gaps of +10 to +13 between clean and corrupted text against a >0.3 target, validating a meaningful factuality signal. AUROC is currently being optimized toward a >0.85 target on held-out hallucination detection.
  • Custom architecture: 181M-parameter Hebbian Associative Transformer with multi-scale spectral-normalized energy head (2.9M params), competitive Hebbian memory with sparse, auditable association patterns, and production inference server with multiple generation modes.
  • Multi-model orchestration: 20+ cross-communicating models with intelligent routing, uncertainty quantification, and episodic memory consolidation. Conversational layer achieves 0.78 accuracy and 0.95 loss; system size reduced 86% (7GB to 1GB) via tiered model loading and selective routing, with 60x faster inference (50ms vs 3s).
  • Published frameworks: The Human-AI Collaboration Framework (haicf.com) operationalizes transparency, agency, and collective input as evaluation criteria with case studies and implementation patterns; the fairne.ss one-pager translates the framework for non-technical stakeholders.
  • Philosophical grounding: Graduate-level independent study in philosophy of mind under Dr. Robyn Bluhm (MSU, Spring 2016) covering AI, consciousness, emotion, neuroscience, and ethics. Three argumentative essays available on request; reference letter available on request.
  • Earlier work: Ellsi voice assistant (2016–2017), an emotionally-grounded notification and conversation system designed pre-LLM as a micro-scale application of grounded emotional substrate rather than sentiment synthesis.
  • See visual artifacts and more at the AI Eval, Research & Safety section

Accessibility

  • Accessibility for vulnerable users: led accessibility strategy at Michigan State University and Bishop International Airport through WCAG audits, inclusive design systems, developer workflow redesign, and stakeholder workshops. Reframed accessibility from compliance checklist to harm-reduction practice.
  • 40% reduction in accessibility remediation time for major e-learning platforms through scalable audit and developer-first workflows.
  • Title IX system redesign at MSU directly enabled 100+ survivors of Larry Nassar's abuse to begin filing claims, illustrating accessibility as harm-reduction infrastructure rather than compliance documentation.
  • Inclusive design evaluation matrix adopted as RFP requirements across ~50 platform purchases including Admissions, Scholarships, and Title IX systems.
  • See visual artifacts and more at the Accessibility section

Stakeholder Engagement

  • Mindset Map methodology: a research artifact that tracks how customer mindsets shift across moments of contact rather than freezing customers into static personas. Galvanized internal teams at Consumers Energy to design for vulnerable populations they had not previously seen clearly.
  • Enrollment improvement from 1:6 to 1:3 in low-income energy assistance programs.
  • Influenced the 3-year product roadmap for low and moderate income (LMI) programs at Consumers Energy.
  • Board-of-Directors-level visibility in strategic presentations.
  • Cross-functional alignment through journey mapping, audio-augmented research playback, and storytelling workshops that translated complex service realities into shared product strategy across product owners, executives, and frontline teams.
  • See visual artifacts and more at the Stakeholder Engagement section