- API Documentation: https://docs.mixpeek.com
- Runnable Notebook with your API key: https://dash.mixpeek.com/notebooks
- Python SDK: https://pypi.org/project/mixpeek/
0.8.1
Deduplication
Sometimes the client may not be aware of duplicate files. For scenarios where you'd prefer not to have duplicate files indexed, you can pass the prevent_duplicate: true parameter on index:
{
"url": "814723487.mp4",
"collection_id": "testing-123",
"prevent_duplicate": true,
"metadata": {
"foo": "bar"
},
"video_settings": [
{
"interval_sec": 10,
"read": {
"model_id": "video-descriptor-v1"
},
"embed": {
"model_id": "multimodal-v1"
},
"transcribe": {
"model_id": "polyglot-v1"
},
"describe": {
"model_id": "video-descriptor-v1",
"prompt": "Create a holistic description of the video, include sounds and screenplay"
}
}
]
}
This creates a hash of the file plus its settings, so whenever that same file/settings combination is submitted again it won't be re-indexed; the original file object is simply returned.
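Conceptually, the deduplication key is a hash over the file contents and the indexing settings. The sketch below illustrates that idea; the hashing scheme is illustrative only, not Mixpeek's actual implementation:
import hashlib
import json

def dedupe_key(file_bytes: bytes, settings: dict) -> str:
    """Illustrative only: combine file contents and settings into one hash."""
    canonical_settings = json.dumps(settings, sort_keys=True).encode()
    return hashlib.sha256(file_bytes + canonical_settings).hexdigest()

# The same file indexed with the same settings yields the same key,
# so a repeat request can be short-circuited to the original file object.
settings = {"interval_sec": 10, "embed": {"model_id": "multimodal-v1"}}
assert dedupe_key(b"...video bytes...", settings) == dedupe_key(b"...video bytes...", settings)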
Metadata filtering
- Complex Boolean Operators: Utilize AND, OR, and NOR operators to create sophisticated search queries.
- Nested Conditions: Combine multiple conditions with different logical operators for granular control.
- Wide Range of Comparison Operators: Including eq, ne, gt, gte, lt, lte, in, nin, regex, and exists.
- Dot Notation for Nested Fields: Access nested metadata fields easily (e.g., "personal.age").
Example Usage:
{
"query": "dog",
"collection_ids": ["tube-test"],
"model_id": "multimodal-v1",
"filters": {
"AND": [
{
"OR": [
{ "key": "metadata.age", "value": 3, "operator": "gte" },
{ "key": "metadata.breed", "value": "labrador", "operator": "eq" }
]
},
{ "key": "metadata.vaccinated", "value": true, "operator": "eq" },
{
"NOR": [
{ "key": "metadata.status", "value": "adopted" },
{ "key": "metadata.location", "value": "shelter" }
]
}
]
}
}
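The same filter tree can be built programmatically before calling search. The sketch below mirrors the JSON body above; the endpoint path and auth header are assumptions for illustration, so consult https://docs.mixpeek.com for the exact route:
import requests

# Hypothetical endpoint and auth header, shown only to frame the request body.
API_URL = "https://api.mixpeek.com/search"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

filters = {
    "AND": [
        {"OR": [
            {"key": "metadata.age", "value": 3, "operator": "gte"},
            {"key": "metadata.breed", "value": "labrador", "operator": "eq"},
        ]},
        {"key": "metadata.vaccinated", "value": True, "operator": "eq"},
        {"NOR": [
            {"key": "metadata.status", "value": "adopted"},
            {"key": "metadata.location", "value": "shelter"},
        ]},
    ]
}

body = {
    "query": "dog",
    "collection_ids": ["tube-test"],
    "model_id": "multimodal-v1",
    "filters": filters,
}
response = requests.post(API_URL, json=body, headers=headers)
print(response.json())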
0.8.0
Async indexing
- Longer videos may take some time to fully process. With the task_status API you can poll for the status, and once processing is complete, retrieve the full video features (outlined in the next bullet).
# "task" is the handle returned by the asynchronous index call
def on_task_update(status):
    print(f"Current task status: {status}")

status = task.wait_for_done(
    sleep_interval=2,
    callback=on_task_update
)

file_id = task.file_id
New visual feature extraction models
- Applicable with any image or video interval_sec in the /index pipeline. All are also accessible standalone via /understand/<method>:
- describe: Provide a description of the video segment
- read: Grab the text from the screen
- embed: Create a multimodal embedding of the interval_sec (if video) or image
- detect.faces: Saves to keys: detected_face_ids (returns face_ids for each face that has been registered via /index/face) and face_details (returns various characteristics about faces that were found in the visual asset)
- json_output: More of a catch-all for structured data generation; supports all standard JSON types
{
"url": "video.mp4",
"collection_id": "name",
"should_save": false,
"video_settings": [
{
"interval_sec": 1,
"read": {
"model_id": "video-descriptor-v1"
},
"embed": {
"model_id": "multimodal-v1"
},
"detect": {
"faces": {
"model_id": "face-detector-v1"
}
},
"json_output": {
"response_shape": {
"emotions": ["label"]
},
"prompt": "This is a list of emotion labels, each one should be a string representing the scene."
}
},
{
"interval_sec": 30,
"transcribe": {
"model_id": "polyglot-v1"
}
},
{
"interval_sec": 120,
"describe": {
"model_id": "video-descriptor-v1",
"prompt": "Create a holistic description of the video, include sounds and screenplay"
}
}
]
}
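Each of these methods can also be called standalone via /understand/<method>. A hedged sketch of such a call follows; the base URL, auth header, and request-body shape are assumptions modeled on the indexing settings above, so check the API reference for the exact schema:
import requests

# Assumed base URL and request shape for a standalone /understand call;
# consult https://docs.mixpeek.com for the actual schema.
API_BASE = "https://api.mixpeek.com"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

body = {
    "url": "video.mp4",
    "describe": {
        "model_id": "video-descriptor-v1",
        "prompt": "Create a holistic description of the video, include sounds and screenplay"
    }
}
response = requests.post(f"{API_BASE}/understand/describe", json=body, headers=headers)
print(response.json())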
Shared embedding space
Image and video embeddings now share the same embedding space (1408 dimensions), making multimodal search more accurate and faster. This uses the multimodal-v1 model_id.
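Since image and video embeddings live in one 1408-dimensional space, their vectors can be compared directly. A minimal sketch of that comparison (the vectors below are random placeholders standing in for multimodal-v1 output):
import numpy as np

# Placeholder 1408-dimensional vectors; in practice these come from the embed step.
image_embedding = np.random.rand(1408)
video_segment_embedding = np.random.rand(1408)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two vectors in the shared embedding space."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(image_embedding, video_segment_embedding))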
Full file payloads
/collections/file/<file_id>/full now returns the full features extracted by the indexing pipeline. For embeddings, we also support a separate /understand/embed endpoint; this allows you to combine the full text and other returned data types to build more custom analytics pipelines.
{
"index_id": "ix-123",
"file_id": "123",
"collection_id": "testing-123",
"status": "DONE",
"metadata": {
"file": {
"source": "url",
"file_name": "123.mp4",
"file_extension": ".mp4",
"mime_type": "video/mp4",
"url": "video.mp4",
"file_size_bytes": 1900065,
"modality": "video"
},
"preview_url": "thumbnail.jpg"
},
"created_at": "2024-10-04T23:27:32.305000",
"video_segments": [
{
"start_time": 0.0,
"end_time": 10.0,
"text": "GoPure\nLift & Tighten\nNeck Cream\n#gopure \n",
"transcription": "Hi I'm back As most of you know I've incorporated Go Pure's lift and tighten neck cream a few weeks ago as part of my routine and I love it I see the difference already in my neck lines in my turkey neck and in the elasticity alone ",
"description": "The video starts with a woman...",
"embedding": [123],
"emotions": [
"happy"
]
},
{
"start_time": 10.0,
"end_time": 12.0,
"text": "GoPure \n#gopure \nP \n",
"transcription": "Go Pure ",
"emotions": [
"happy"
],
"embedding": [123],
"description": "The woman in the video smiles, her eyes light up, and she says, \"Go Pure!\" in a cheerful tone. \n _______________________________________________________ \n"
}
]
}
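To build on the full payload, the sketch below fetches it and stitches the per-segment transcriptions and descriptions together for downstream analytics; the base URL and auth header shown are assumptions, and only the /collections/file/<file_id>/full path comes from the section above:
import requests

API_BASE = "https://api.mixpeek.com"  # assumed base URL; see https://docs.mixpeek.com
headers = {"Authorization": "Bearer YOUR_API_KEY"}

file_id = "123"
payload = requests.get(f"{API_BASE}/collections/file/{file_id}/full", headers=headers).json()

# Stitch per-segment transcriptions into one transcript for analytics.
transcript = " ".join(seg.get("transcription", "") for seg in payload["video_segments"])
descriptions = [seg.get("description", "") for seg in payload["video_segments"]]
print(transcript)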
Video demo: https://www.youtube.com/watch?v=jkIXzfKBvM0&ab_channel=Mixpeek
Detect and index faces
Many of the videos and images we want to analyze contain faces, and since faces are unique to each use case there's a dedicated /register/faces endpoint:
# Path to the local image file containing faces
face_image_path = "/path/to/your/face_image.jpg"

# Start the face registration task (assumes an initialized Mixpeek SDK client)
task = mixpeek.register.faces(
    file_path=face_image_path,
    collection_id="social-media",
    metadata={"name": "John Doe", "age": 30},
    settings={"detection_threshold": 0.8}
)

# Define a callback function (optional)
def on_task_update(status):
    print(f"Current task status: {status}")

# Wait for the task to complete
status = task.wait_for_done(
    sleep_interval=1,
    callback=on_task_update
)