Transforming Multimodal Search with Mixpeek 0.9.0

At Mixpeek, we're on a mission to make multimodal search (images, videos, audio, and text) accessible and powerful. Our latest release introduces fundamental capabilities that address real-world challenges in building multimodal applications. Let's dive into the motivation behind each feature and what it enables.

Namespaces: Beyond Simple Data Isolation

The Challenge

Organizations struggle with managing multiple environments (dev/prod), different use cases, and evolving machine learning models - all while maintaining consistent APIs and performance.

Our Solution

Namespaces in Mixpeek go beyond simple data isolation. They provide:

  1. Environment Management
    • Create isolated spaces for development, staging, and production
    • Test new features without affecting production data
    • Maintain separate access controls and quotas
  2. Model Flexibility
    • We abstract away model names and versions
    • When we upgrade our models, we automatically re-embed your content
    • Switch between different embedding models for different use cases
    • Zero changes needed in your application code
  3. Use Case Optimization
    • Configure different vector indexes for different content types
    • Optimize search for specific domains (e.g., faces vs. scenes)
    • Mix and match feature extractors per namespace
    • Pair multilingual text embedding models with scene-specific visual models

Example use case:

# Production environment with high-quality models
prod_namespace = {
    "namespace_id": "netflix_prod",
    "vector_indexes": ["image_vector", "text_vector"],
    "payload_indexes": [...]
}

# Development environment for testing
dev_namespace = {
    "namespace_id": "netflix_dev",
    "vector_indexes": ["image_vector"],  # Limited indexes for cost savings
    "payload_indexes": [...]
}
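
Once a configuration is defined, registering it is a single API call. The snippet below is a minimal sketch using a Python requests client; the base URL, endpoint path, and header names are illustrative assumptions rather than the documented API:

import requests

API_KEY = "your_api_key"
BASE_URL = "https://api.mixpeek.com"  # assumed base URL for illustration


def create_namespace(config: dict) -> dict:
    # Hypothetical endpoint path; consult the Mixpeek docs for the exact route
    response = requests.post(
        f"{BASE_URL}/namespaces",
        json=config,
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    response.raise_for_status()
    return response.json()


# Register both environments defined above
create_namespace(prod_namespace)
create_namespace(dev_namespace)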

Hybrid Search: Making Vector Search Practical

The Challenge

Pure vector search is powerful but often impractical. Real applications need to combine semantic understanding with traditional filtering and ranking.

Our Solution

Our hybrid search system provides:

  1. Unified Query Interface
    • Combine vector similarity with metadata filters
    • Support for complex boolean logic
    • Multiple vector queries with automatic result fusion
  2. Smart Ranking
    • Automatic score normalization across different vector spaces
    • Configurable weighting between different signals
    • Integration with interaction data for dynamic ranking
  3. Performance Optimization
    • Efficient filter-first architecture
    • Automatic query planning
    • Caching and result reuse

Example complex query:

{
    "queries": [
        {
            "vector_index": "text_vector",
            "value": "dramatic car chase scene",
            "type": "text"
        },
        {
            "vector_index": "image_vector",
            "value": "base64_encoded_reference_image",
            "type": "base64"
        }
    ],
    "filters": {
        "AND": [
            {"key": "metadata.year", "value": 2023},
            {"key": "metadata.genre", "value": "action"}
        ]
    }
}
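
A query like this is submitted against a specific namespace, and the fused, filtered results come back as a single ranked list. Here is a minimal sketch of what that call might look like in Python; the route and the X-Namespace header are assumptions for illustration:

import requests

search_request = {...}  # the hybrid query shown above

response = requests.post(
    "https://api.mixpeek.com/features/search",  # assumed route
    json=search_request,
    headers={
        "Authorization": "Bearer your_api_key",
        "X-Namespace": "netflix_prod",  # hypothetical namespace header
    },
)
response.raise_for_status()

# Results are already fused across the text and image queries and filtered
for result in response.json().get("results", []):
    print(result.get("feature_id"), result.get("score"))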

Search Interactions: Learning from User Behavior

The Challenge

Search results need to improve over time and adapt to user preferences, but collecting and utilizing interaction data is complex.

Our Solution

Our interaction system enables:

  1. Automated Learning
    • Collect click, view, and feedback data
    • Automatic result re-ranking based on user behavior
    • Support for explicit and implicit feedback
  2. Search Analytics
    • Track search effectiveness
    • Identify content gaps
    • Monitor user engagement
  3. Personalization Pipeline
    • Session-based personalization
    • Long-term learning from interactions
    • Custom ranking models

Example interaction record, where original_request is the search request that returned the clicked result:

# Record user interaction
interaction = {
    "feature_id": "vid_123",
    "interaction_type": "click",
    "search_request": original_request,
    "metadata": {
        "watch_duration": 142,
        "user_segment": "premium"
    }
}
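
Submitting the record is then a single write. The call below is a rough sketch; the endpoint path is an assumption for illustration:

import requests

# Hypothetical endpoint; consult the Mixpeek docs for the exact route
requests.post(
    "https://api.mixpeek.com/features/search/interactions",  # assumed route
    json=interaction,
    headers={"Authorization": "Bearer your_api_key"},
).raise_for_status()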

Feature Extractors: Customizable Understanding

The Challenge

Different applications need different types of understanding from their video content, and processing needs to be efficient and cost-effective.

Our Solution

Our feature extractors provide:

  1. Modular Processing
    • Choose only the features you need
    • Configure processing intervals
    • Balance quality vs. cost
  2. Rich Understanding
    • Scene-level descriptions
    • Object and face detection
    • Text extraction (OCR)
    • Audio transcription
    • Custom JSON extraction
  3. Efficient Processing
    • Smart caching of intermediate results
    • Parallel processing pipelines
    • Automatic optimization of extraction settings

Example specialized configuration:

{
    "interval_sec": 10,
    "describe": {
        "enabled": true,
        "prompt": "Focus on identifying product placements and brand mentions"
    },
    "detect": {
        "logos": {
            "enabled": true,
            "confidence_threshold": 0.7
        }
    }
}
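
A configuration like this is passed along when indexing a video. The sketch below shows one way that might look in Python; the endpoint, field names, and collection identifier are assumptions for illustration:

import requests

index_request = {
    "url": "https://example.com/videos/episode_01.mp4",  # hypothetical source video
    "collection_id": "brand_monitoring",                 # hypothetical collection
    "video_settings": [
        {
            "interval_sec": 10,
            "describe": {
                "enabled": True,
                "prompt": "Focus on identifying product placements and brand mentions",
            },
            "detect": {
                "logos": {"enabled": True, "confidence_threshold": 0.7},
            },
        }
    ],
}

requests.post(
    "https://api.mixpeek.com/index/videos/url",  # assumed route
    json=index_request,
    headers={"Authorization": "Bearer your_api_key"},
).raise_for_status()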

Taxonomies: Structured Understanding

The Challenge

Organizations need consistent ways to classify and organize content, but manual classification is time-consuming and error-prone.

Our Solution

Our taxonomy system enables:

  1. Flexible Classification
    • Create custom classification schemes
    • Hierarchical taxonomies
    • Multiple taxonomies per namespace
  2. Automated Classification
    • ML-powered content categorization
    • Confidence scores for classifications
    • Bulk processing capabilities
  3. Integration with Search
    • Filter by taxonomy terms
    • Faceted search
    • Taxonomy-aware ranking
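
As a rough sketch, a hierarchical taxonomy definition might look like the following; the schema and field names here are assumptions for illustration, not the documented format:

# Illustrative hierarchical taxonomy; field names are assumed
genre_taxonomy = {
    "taxonomy_name": "content_genres",
    "nodes": [
        {
            "name": "action",
            "children": [
                {"name": "car_chase"},
                {"name": "martial_arts"},
            ],
        },
        {
            "name": "drama",
            "children": [
                {"name": "courtroom"},
            ],
        },
    ],
}

Once registered, terms like action or car_chase can be used as filters and facets in search requests.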

Pre-signed URLs: Secure Content Delivery

The Challenge

Serving video content securely while maintaining performance and controlling access is complex.

Our Solution

Our pre-signed URL system provides:

  1. Security
    • Time-limited access tokens
    • Path-restricted URLs
    • IP-based restrictions (optional)
  2. Performance
    • CDN integration
    • Automatic URL generation
    • Preview image generation
  3. Access Control
    • Per-user access tracking
    • Usage quotas
    • Bandwidth controls
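
In practice, a client fetches asset details and receives a time-limited URL to stream or download the content. The sketch below assumes a presigned_url field in the response; the endpoint path, asset ID, and field name are illustrative assumptions:

import requests

# Hypothetical asset lookup; endpoint path and response fields are assumptions
asset = requests.get(
    "https://api.mixpeek.com/assets/asset_abc123",  # assumed route and asset ID
    headers={"Authorization": "Bearer your_api_key"},
).json()

# The pre-signed URL is time-limited, so fetch the content before it expires
video_bytes = requests.get(asset["presigned_url"]).content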

Looking Forward

These features lay the groundwork for our vision of making multimodal understanding accessible to every developer. Coming soon:

  • Advanced personalization
  • More pre-trained models
  • Real-time processing
  • Enhanced analytics dashboards

Want to learn more? Contact our team for a detailed discussion of how these features can help your specific use case.

About the author
Ethan Steininger

Probably outside.
