It features voice and text input, runs on low-resource hardware, and handles more than 1,000 queries per day across multiple conversational contexts.
The kiosk queries a Supabase-hosted database whose knowledge entries are stored with vector embeddings and managed through a separate Svelte-based CMS.
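The nearest-neighbor lookup that pgvector performs with its cosine-distance operator (`<=>`) can be sketched in plain TypeScript. The entry shape and function names below are illustrative, not the kiosk's actual schema:

```typescript
// Minimal in-memory sketch of the cosine-distance ranking that
// pgvector's `<=>` operator performs. Entry shape is hypothetical.
interface KnowledgeEntry {
  content: string;
  embedding: number[];
}

// Cosine distance: 1 - (a . b) / (|a| * |b|); lower means more similar.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k entries closest to the query embedding.
function matchEntries(query: number[], entries: KnowledgeEntry[], k: number): KnowledgeEntry[] {
  return [...entries]
    .sort((x, y) => cosineDistance(query, x.embedding) - cosineDistance(query, y.embedding))
    .slice(0, k);
}
```

In the deployed system this ranking runs inside Postgres itself (typically exposed through a Supabase RPC function), not in application code; the sketch only shows the math being applied.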
When a query arrives, the system performs a semantic search using pgvector to find the most relevant entries. That data, alongside recent chat history, is then injected into a high-parameter AI model hosted serverlessly via Groq for response generation.
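The injection step can be sketched as assembling an OpenAI-style chat-completion payload (Groq's API follows that format), with retrieved entries in the system prompt and recent turns appended. The prompt wording and model id below are assumptions, not the kiosk's actual template:

```typescript
// Assemble a chat-completion payload from retrieved context and chat
// history. Prompt wording and model id are illustrative assumptions.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function buildMessages(retrieved: string[], history: ChatMessage[], userQuery: string): ChatMessage[] {
  const system: ChatMessage = {
    role: 'system',
    content: 'Answer using only the context below.\n\nContext:\n' + retrieved.join('\n---\n'),
  };
  return [system, ...history, { role: 'user', content: userQuery }];
}

// The resulting array is then POSTed to Groq's OpenAI-compatible
// endpoint, roughly:
//   fetch('https://api.groq.com/openai/v1/chat/completions', {
//     method: 'POST',
//     headers: { Authorization: `Bearer ${apiKey}`, 'Content-Type': 'application/json' },
//     body: JSON.stringify({ model: 'llama-3.1-8b-instant', messages }),
//   });
```

Keeping the prompt assembly as a pure function like this makes the retrieval-to-generation handoff easy to test without touching the network.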
Embeddings are computed locally with Xenova's Transformers.js library to reduce cloud overhead, and voice input is transcribed to text before processing. The entire retrieval and response logic was written from scratch, with no external AI orchestration tools used.
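When Transformers.js is asked for sentence embeddings with `{ pooling: 'mean', normalize: true }`, it mean-pools per-token vectors and L2-normalizes the result. That post-processing can be sketched as follows; the token vectors here stand in for real model output, and the model id in the comment is an example:

```typescript
// Mean-pool per-token vectors into one sentence vector, then
// L2-normalize it -- the post-processing applied when a Transformers.js
// feature-extraction pipeline runs with { pooling: 'mean', normalize: true }.
function meanPool(tokenVectors: number[][]): number[] {
  const dim = tokenVectors[0].length;
  const out = new Array<number>(dim).fill(0);
  for (const v of tokenVectors) {
    for (let i = 0; i < dim; i++) out[i] += v[i];
  }
  return out.map((x) => x / tokenVectors.length);
}

function l2Normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return v.map((x) => x / norm);
}

// In the real pipeline (example model id):
//   const { pipeline } = await import('@xenova/transformers');
//   const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
//   const vec = await embed(text, { pooling: 'mean', normalize: true });
```

Normalizing to unit length matters downstream: with unit vectors, cosine distance in the pgvector search reduces to a simple dot-product comparison.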
Every part of the system—from UI to AI pipeline—was built and integrated manually, with a strong focus on performance, clarity, and deployment constraints.