Release Notes


0.8.1

Deduplication

Sometimes the client may not be aware of duplicate files. For scenarios where you'd prefer not to have duplicates indexed, pass the prevent_duplicate: true parameter on index:

{
    "url": "814723487.mp4",
    "collection_id": "testing-123",
    "prevent_duplicate": true,
    "metadata": {
        "foo": "bar"
    },
    "video_settings": [
        {
            "interval_sec": 10,
            "read": {
                "model_id": "video-descriptor-v1"
            },
            "embed": {
                "model_id": "multimodal-v1"
            },
            "transcribe": {
                "model_id": "polyglot-v1"
            },
            "describe": {
                "model_id": "video-descriptor-v1",
                "prompt": "Create a holistic description of the video, include sounds and screenplay"
            }
        }
    ]
}

This creates a hash of the file plus its settings, so whenever the same file/settings combination is presented again it won't be re-indexed; the original file object is simply returned.
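The exact hashing scheme is internal, but the idea can be sketched with the standard library: hash the file bytes together with a canonical serialization of the index settings (the `dedup_key` function below is purely illustrative):

```python
import hashlib
import json

def dedup_key(file_bytes: bytes, settings: dict) -> str:
    """Illustrative only: hash file contents together with a
    canonical (sorted-key) serialization of the index settings."""
    h = hashlib.sha256()
    h.update(file_bytes)
    h.update(json.dumps(settings, sort_keys=True).encode())
    return h.hexdigest()

# The same file + settings combo always yields the same key, so a
# repeat index request can be answered with the original file object.
settings = {"video_settings": [{"interval_sec": 10}]}
assert dedup_key(b"video-bytes", settings) == dedup_key(b"video-bytes", settings)
assert dedup_key(b"video-bytes", settings) != dedup_key(
    b"video-bytes", {"video_settings": [{"interval_sec": 5}]}
)
```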

Index Url - Mixpeek

Metadata filtering

  1. Complex Boolean Operators: Utilize AND, OR, and NOR operators to create sophisticated search queries.
  2. Nested Conditions: Combine multiple conditions with different logical operators for granular control.
  3. Wide Range of Comparison Operators: Including eq, ne, gt, gte, lt, lte, in, nin, regex, and exists.
  4. Dot Notation for Nested Fields: Access nested metadata fields easily (e.g., "personal.age").
Search - Mixpeek

Example Usage:

{
  "query": "dog",
  "collection_ids": ["tube-test"],
  "model_id": "multimodal-v1",
  "filters": {
    "AND": [
      {
        "OR": [
          { "key": "metadata.age", "value": 3, "operator": "gte" },
          { "key": "metadata.breed", "value": "labrador", "operator": "eq" }
        ]
      },
      { "key": "metadata.vaccinated", "value": true, "operator": "eq" },
      {
        "NOR": [
          { "key": "metadata.status", "value": "adopted" },
          { "key": "metadata.location", "value": "shelter" }
        ]
      }
    ]
  }
}
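To build intuition for how these filters combine, here is a minimal, unofficial evaluator for the same filter shape (the server-side implementation may differ; treating a missing `operator` as `eq` and the `regex`/`exists` semantics shown are assumptions):

```python
import re
from operator import eq, ne, gt, ge, lt, le

OPS = {
    "eq": eq, "ne": ne, "gt": gt, "gte": ge, "lt": lt, "lte": le,
    "in": lambda a, b: a in b, "nin": lambda a, b: a not in b,
    "regex": lambda a, b: re.search(b, str(a)) is not None,
    "exists": lambda a, b: (a is not None) == b,
}

def get_field(doc, dotted):
    # Dot notation: "metadata.age" -> doc["metadata"]["age"]
    for part in dotted.split("."):
        doc = doc.get(part) if isinstance(doc, dict) else None
    return doc

def matches(doc, node):
    if "AND" in node:
        return all(matches(doc, n) for n in node["AND"])
    if "OR" in node:
        return any(matches(doc, n) for n in node["OR"])
    if "NOR" in node:
        return not any(matches(doc, n) for n in node["NOR"])
    op = OPS[node.get("operator", "eq")]  # assume "eq" when omitted
    return op(get_field(doc, node["key"]), node["value"])

doc = {"metadata": {"age": 4, "breed": "poodle", "vaccinated": True,
                    "status": "available", "location": "foster"}}
filters = {"AND": [
    {"OR": [{"key": "metadata.age", "value": 3, "operator": "gte"},
            {"key": "metadata.breed", "value": "labrador", "operator": "eq"}]},
    {"key": "metadata.vaccinated", "value": True, "operator": "eq"},
    {"NOR": [{"key": "metadata.status", "value": "adopted"},
             {"key": "metadata.location", "value": "shelter"}]},
]}
assert matches(doc, filters)
```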

0.8.0

Async indexing

  • Longer videos may take some time to fully process. With the task_status API you can poll for the status and, once processing is complete, receive the full video features (outlined in the next bullet):
def on_task_update(status):
    print(f"Current task status: {status}")

status = task.wait_for_done(
    sleep_interval=2,
    callback=on_task_update
)
file_id = task.file_id
Task Status - Mixpeek
Retrieve the status of a specific task by its ID.
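Under the hood, wait_for_done amounts to a poll loop. A rough stdlib-only equivalent (the get_status callable stands in for a call to the Task Status endpoint; the terminal state names are illustrative):

```python
import time

def wait_for_done(get_status, sleep_interval=2, callback=None):
    """Poll get_status() until it reports a terminal state."""
    while True:
        status = get_status()
        if callback:
            callback(status)
        if status in ("DONE", "FAILED"):
            return status
        time.sleep(sleep_interval)

# Simulate a task that finishes on the third poll.
responses = iter(["PROCESSING", "PROCESSING", "DONE"])
assert wait_for_done(lambda: next(responses), sleep_interval=0) == "DONE"
```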

New visual feature extraction models

  • Applicable with any image, or with any video interval_sec, in the /index pipeline. All are also accessible standalone via /understand/<method>:
    • describe: Provide a description of the video segment
    • read: Grab the text from the screen
    • embed: Create a multimodal embedding of the interval_sec segment (for video) or of the image
    • detect.faces: Saves to keys:
      • detected_face_ids: returns face_ids for each face that has been registered via /index/face
      • face_details: returns various characteristics about faces that were found in the visual asset
    • json_output: More of a catch-all for structured data generation, supports all standard json types
{
    "url": "video.mp4",
    "collection_id": "name",
    "should_save": false,
    "video_settings": [
        {
            "interval_sec": 1,
            "read": {
                "model_id": "video-descriptor-v1"
            },
            "embed": {
                "model_id": "multimodal-v1"
            },
            "detect": {
                "faces": {
                    "model_id": "face-detector-v1"
                }
            },
            "json_output": {
                "response_shape": {
                    "emotions": ["label"]
                },
                "prompt": "This is a list of emotion labels, each one should be a string representing the scene."
            }
        },
        {
            "interval_sec": 30,
            "transcribe": {
                "model_id": "polyglot-v1"
            }
        },
        {
            "interval_sec": 120,
            "describe": {
                "model_id": "video-descriptor-v1",
                "prompt": "Create a holistic description of the video, include sounds and screenplay"
            }
        }
    ]
}
URL - Mixpeek
Index content from a specified URL with optional metadata and processing settings.

Shared embedding space

Image and video embeddings now share the same embedding space (1408 dimensions), making multimodal search more accurate and faster. Both use the multimodal-v1 model_id.
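Sharing one space means an image vector and a video-segment vector are directly comparable, e.g. by cosine similarity. A sketch with made-up 1408-dimensional vectors (real vectors would come from the embed method):

```python
import math
import random

DIM = 1408  # dimensionality of the shared multimodal-v1 space

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

random.seed(0)
image_vec = [random.gauss(0, 1) for _ in range(DIM)]
video_vec = [random.gauss(0, 1) for _ in range(DIM)]

# Identical vectors score 1.0; unrelated random vectors score near 0.
assert abs(cosine(image_vec, image_vec) - 1.0) < 1e-9
assert abs(cosine(image_vec, video_vec)) < 0.2
```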

Overview - Mixpeek
ML model options for each method

Full file payloads

  • /collections/file/<file_id>/full now returns the full features extracted by the indexing pipeline. For the embeddings returned, we support a separate /understand/embed endpoint; this lets you combine full text and the other returned data types to build more custom analytics pipelines.
{
    "index_id": "ix-123",
    "file_id": "123",
    "collection_id": "testing-123",
    "status": "DONE",
    "metadata": {
        "file": {
            "source": "url",
            "file_name": "123.mp4",
            "file_extension": ".mp4",
            "mime_type": "video/mp4",
            "url": "video.mp4",
            "file_size_bytes": 1900065,
            "modality": "video"
        },
        "preview_url": "thumbnail.jpg"
    },
    "created_at": "2024-10-04T23:27:32.305000",
    "video_segments": [
        {
            "start_time": 0.0,
            "end_time": 10.0,
            "text": "GoPure\nLift & Tighten\nNeck Cream\n#gopure \n",
            "transcription": "Hi I'm back As most of you know I've incorporated Go Pure's lift and tighten neck cream a few weeks ago as part of my routine and I love it I see the difference already in my neck lines in my turkey neck and in the elasticity alone ",
            "description": "The video starts with a woman...",
            "embedding": [123],
            "emotions": [
                "happy"
            ]
        },
        {
            "start_time": 10.0,
            "end_time": 12.0,
            "text": "GoPure \n#gopure \nP \n",
            "transcription": "Go Pure ",
            "emotions": [
                "happy"
            ],
            "embedding": [123],
            "description": "The woman in the video smiles, her eyes light up, and she says, \"Go Pure!\" in a cheerful tone."
        }
    ]
}
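Since the payload carries on-screen text, transcription, and model outputs per segment, it is straightforward to fold them into flat analytics records; a minimal sketch (the field selection is illustrative, not a fixed schema):

```python
def flatten_segments(payload: dict) -> list:
    """Collapse each video segment of a full-file payload into a flat
    record suitable for a downstream analytics table."""
    return [
        {
            "file_id": payload["file_id"],
            "start": seg["start_time"],
            "end": seg["end_time"],
            "on_screen_text": seg.get("text", ""),
            "transcription": seg.get("transcription", ""),
            "emotions": seg.get("emotions", []),
        }
        for seg in payload.get("video_segments", [])
    ]

payload = {
    "file_id": "123",
    "video_segments": [
        {"start_time": 0.0, "end_time": 10.0, "text": "GoPure",
         "transcription": "Hi I'm back", "emotions": ["happy"]},
        {"start_time": 10.0, "end_time": 12.0, "text": "GoPure",
         "transcription": "Go Pure", "emotions": ["happy"]},
    ],
}
rows = flatten_segments(payload)
assert len(rows) == 2 and rows[0]["transcription"] == "Hi I'm back"
```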

Video demo: https://www.youtube.com/watch?v=jkIXzfKBvM0&ab_channel=Mixpeek

Full File - Mixpeek
Retrieve all the contents of a file by its file_id.

Detect and index faces

Many of the videos and images we want to analyze contain faces, and since faces are unique to each use case there's a dedicated /register/faces endpoint:

# Path to the local image file containing faces
face_image_path = "/path/to/your/face_image.jpg"

# Start the face registration task
task = mixpeek.register.faces(
    file_path=face_image_path,
    collection_id="social-media",
    metadata={"name": "John Doe", "age": 30},
    settings={"detection_threshold": 0.8}
)

# Define a callback function (optional)
def on_task_update(status):
    print(f"Current task status: {status}")

# Wait for the task to complete
status = task.wait_for_done(
    sleep_interval=1,
    callback=on_task_update
)
Face - Mixpeek
Register a face for future detection and search operations.
