Sagittarius
A multimodal AI interface that uses GPT-4 Vision to analyze speech, images, and hand gestures in real-time.
Category: AI Agents

Overview
Sagittarius is a demonstration tool that allows users to interact with AI through voice and visual input. It serves as a functional implementation of multimodal AI capabilities using OpenAI’s GPT-4 Vision model.
How to use it?
Enter your OpenAI API key into the interface to begin. Speak into your microphone and show the camera images or hand gestures to receive real-time AI analysis and responses.
Features
Voice interaction, Real-time image analysis, Hand gesture recognition, Multimodal processing, Browser-based interface
