At Mixpeek, we're on a mission to make multimodal search (across images, videos, audio, and text) accessible and powerful. Our latest release introduces foundational capabilities that address real-world challenges in building multimodal applications. Let's dive into the motivation and design behind each feature.
Namespaces: Beyond Simple Data Isolation
The Challenge
Organizations struggle to manage multiple environments (dev/prod), different use cases, and evolving machine learning models, all while maintaining consistent APIs and performance.
Our Solution
Namespaces in Mixpeek go beyond simple data isolation. They provide:
- Environment Management
  - Create isolated spaces for development, staging, and production
  - Test new features without affecting production data
  - Maintain separate access controls and quotas
- Model Flexibility
  - We abstract away model names and versions
  - When we upgrade our models, we automatically re-embed your content
  - Switch between different embedding models for different use cases
  - Zero changes needed in your application code
- Use Case Optimization
  - Configure different vector indexes for different content types
  - Optimize search for specific domains (e.g., faces vs. scenes)
  - Mix and match feature extractors per namespace
  - Pair multilingual text embedding models with scene-specific visual models
Example use case:
# Production environment with high-quality models
prod_namespace = {
    "namespace_id": "netflix_prod",
    "vector_indexes": ["image_vector", "text_vector"],
    "payload_indexes": [...]
}

# Development environment for testing
dev_namespace = {
    "namespace_id": "netflix_dev",
    "vector_indexes": ["image_vector"],  # Limited indexes for cost savings
    "payload_indexes": [...]
}
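Because model names and versions live behind the namespace rather than in your code, pointing a client at netflix_dev instead of netflix_prod is the only change needed to move between environments.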
Hybrid Search: Making Vector Search Practical
The Challenge
Pure vector search is powerful but often impractical on its own. Real applications need to combine semantic understanding with traditional filtering and ranking.
Our Solution
Our hybrid search system provides:
- Unified Query Interface
  - Combine vector similarity with metadata filters
  - Support for complex boolean logic
  - Multiple vector queries with automatic result fusion
- Smart Ranking
  - Automatic score normalization across different vector spaces (see the fusion sketch after the example query)
  - Configurable weighting between different signals
  - Integration with interaction data for dynamic ranking
- Performance Optimization
  - Efficient filter-first architecture
  - Automatic query planning
  - Caching and result reuse
Example complex query:
{
  "queries": [
    {
      "vector_index": "text_vector",
      "value": "dramatic car chase scene",
      "type": "text"
    },
    {
      "vector_index": "image_vector",
      "value": "base64_encoded_reference_image",
      "type": "base64"
    }
  ],
  "filters": {
    "AND": [
      {"key": "metadata.year", "value": 2023},
      {"key": "metadata.genre", "value": "action"}
    ]
  }
}
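Mixpeek handles the scoring internals server-side, but as a mental model, here is a minimal sketch of how scores from the two queries above could be normalized and fused. The function names, weights, and data shapes are illustrative assumptions, not part of the Mixpeek API.

# Minimal sketch: min-max normalization plus weighted fusion.
# Weights, names, and shapes are illustrative; Mixpeek does this server-side.

def min_max_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Rescale raw similarity scores into [0, 1] so scores from
    different vector spaces become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 1.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

def fuse(text_scores: dict[str, float],
         image_scores: dict[str, float],
         text_weight: float = 0.6,
         image_weight: float = 0.4) -> list[tuple[str, float]]:
    """Combine normalized scores from both queries into one ranking.
    Items missing from one result list contribute 0 for that signal."""
    text_n = min_max_normalize(text_scores)
    image_n = min_max_normalize(image_scores)
    ids = set(text_n) | set(image_n)
    fused = {i: text_weight * text_n.get(i, 0.0)
                + image_weight * image_n.get(i, 0.0) for i in ids}
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Example: "vid_42" scores well on both signals, so it tops the fused list.
print(fuse({"vid_42": 0.91, "vid_7": 0.55, "vid_9": 0.60},
           {"vid_42": 0.83, "vid_9": 0.88, "vid_7": 0.70}))

Normalizing first matters because raw similarity scores from different embedding spaces are not on comparable scales; without it, one vector index would silently dominate the fused ranking.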
Search Interactions: Learning from User Behavior
The Challenge
Search results need to improve over time and adapt to user preferences, but collecting and utilizing interaction data is complex.
Our Solution
Our interaction system enables:
- Automated Learning
  - Collect click, view, and feedback data
  - Automatic result re-ranking based on user behavior (sketched after the example below)
  - Support for explicit and implicit feedback
- Search Analytics
  - Track search effectiveness
  - Identify content gaps
  - Monitor user engagement
- Personalization Pipeline
  - Session-based personalization
  - Long-term learning from interactions
  - Custom ranking models
# Record user interaction
interaction = {
    "feature_id": "vid_123",
    "interaction_type": "click",
    "search_request": original_request,
    "metadata": {
        "watch_duration": 142,
        "user_segment": "premium"
    }
}
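Re-ranking from this data happens inside Mixpeek, but as a simplified illustration of implicit feedback, here is a sketch that boosts results by historical click-through rate. The boost formula, counters, and field names are assumptions for illustration only.

from collections import Counter

# Illustrative only: Mixpeek applies interaction-based re-ranking server-side.
clicks = Counter()       # feature_id -> click count
impressions = Counter()  # feature_id -> times shown in results

def record(interaction: dict, shown_ids: list[str]) -> None:
    """Accumulate implicit feedback from one search session."""
    for fid in shown_ids:
        impressions[fid] += 1
    if interaction["interaction_type"] == "click":
        clicks[interaction["feature_id"]] += 1

def rerank(results: list[tuple[str, float]],
           boost: float = 0.2) -> list[tuple[str, float]]:
    """Blend the base relevance score with observed click-through rate."""
    def adjusted(item):
        fid, score = item
        ctr = clicks[fid] / impressions[fid] if impressions[fid] else 0.0
        return score * (1.0 + boost * ctr)
    return sorted(results, key=adjusted, reverse=True)

# One recorded click on vid_123 lifts it past a slightly higher base score.
record({"feature_id": "vid_123", "interaction_type": "click"},
       shown_ids=["vid_123", "vid_456"])
print(rerank([("vid_123", 0.72), ("vid_456", 0.75)]))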
Feature Extractors: Customizable Understanding
The Challenge
Different applications need different types of understanding from their video content, and processing needs to be efficient and cost-effective.
Our Solution
Our feature extractors provide:
- Modular Processing
  - Choose only the features you need
  - Configure processing intervals
  - Balance quality vs. cost
- Rich Understanding
  - Scene-level descriptions
  - Object and face detection
  - Text extraction (OCR)
  - Audio transcription
  - Custom JSON extraction
- Efficient Processing
  - Smart caching of intermediate results
  - Parallel processing pipelines
  - Automatic optimization of extraction settings
Example specialized configuration:
{
  "interval_sec": 10,
  "describe": {
    "enabled": true,
    "prompt": "Focus on identifying product placements and brand mentions"
  },
  "detect": {
    "logos": {
      "enabled": true,
      "confidence_threshold": 0.7
    }
  }
}
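The interval drives cost directly: at 10-second intervals, a 90-minute video produces 540 segments to describe and scan for logos, while raising the interval to 30 seconds cuts that to 180, trading temporal granularity for cost.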
Taxonomies: Structured Understanding
The Challenge
Organizations need consistent ways to classify and organize content, but manual classification is time-consuming and error-prone.
Our Solution
Our taxonomy system enables:
- Flexible Classification
  - Create custom classification schemes
  - Hierarchical taxonomies
  - Multiple taxonomies per namespace
- Automated Classification
  - ML-powered content categorization
  - Confidence scores for classifications
  - Bulk processing capabilities
- Integration with Search
  - Filter by taxonomy terms
  - Faceted search
  - Taxonomy-aware ranking
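To make the shape concrete, here is a hypothetical hierarchical taxonomy definition. The field names are illustrative, not the exact Mixpeek schema.

# Hypothetical taxonomy definition; field names are illustrative.
sports_taxonomy = {
    "taxonomy_id": "sports",
    "nodes": [
        {"name": "sports", "children": [
            {"name": "basketball", "children": [
                {"name": "nba"},
                {"name": "college"}
            ]},
            {"name": "soccer"}
        ]}
    ],
    # Classifications below this confidence are discarded
    "confidence_threshold": 0.6
}

Once content is classified, node paths such as sports/basketball/nba become filterable attributes, which is what powers the faceting and taxonomy-aware ranking above.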
Pre-signed URLs: Secure Content Delivery
The Challenge
Serving video content securely while maintaining performance and controlling access is complex.
Our Solution
Our pre-signed URL system provides:
- Security
  - Time-limited access tokens
  - Path-restricted URLs
  - IP-based restrictions (optional)
- Performance
  - CDN integration
  - Automatic URL generation
  - Preview image generation
- Access Control
  - Per-user access tracking
  - Usage quotas
  - Bandwidth controls
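Mixpeek generates these URLs for you; to show the underlying idea, here is a generic sketch of HMAC-based URL signing with an expiry. It is not Mixpeek's actual implementation, and the secret, domain, and parameter names are placeholders.

import hashlib
import hmac
import time
from urllib.parse import urlencode

# Generic sketch of time-limited, path-restricted URL signing.
SECRET_KEY = b"server-side-secret"  # placeholder; never ship a hardcoded key

def presign(path: str, ttl_sec: int = 3600) -> str:
    """Return a URL that is only valid for `path` until it expires."""
    expires = int(time.time()) + ttl_sec
    payload = f"{path}:{expires}".encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires, "signature": signature})
    return f"https://cdn.example.com{path}?{query}"

def verify(path: str, expires: int, signature: str) -> bool:
    """Reject expired links and any URL whose path or expiry was tampered with."""
    if time.time() > expires:
        return False
    payload = f"{path}:{expires}".encode()
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

print(presign("/videos/vid_123/preview.jpg"))

Because the path is part of the signed payload, the same check that enforces expiry also enforces path restriction: a URL signed for one object cannot be replayed against another.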
Looking Forward
These features lay the groundwork for our vision of making multimodal understanding accessible to every developer. Coming soon:
- Advanced personalization
- More pre-trained models
- Real-time processing
- Enhanced analytics dashboards
Want to learn more? Contact our team for a detailed discussion of how these features can help your specific use case.