GitHub is Your Product's Source of Truth

When developing open source software (OSS), it's common to include product screenshots, demo videos, and even documents in your repositories. As the product evolves, it becomes challenging to keep these assets in sync with the current version.

When a repo's assets show an older version of the product, users who follow them end up confused.

In this tutorial, we'll use the Mixpeek API to index an example GitHub repository, discover stale screenshots, and flag them for updating.

Find Screenshots

First, let's write a script that finds all pictures in a local clone of a repository and builds their raw URLs for a given branch:

import os

def list_of_pictures(username, repository, branch):
  # Base URL for raw files on the given branch of the GitHub repository
  # This example assumes the repository is cloned into a local directory
  # named after the repository (e.g. "use-cases")
  full_repo_path = "https://raw.githubusercontent.com/{}/{}/{}/".format(username, repository, branch)
  pictures = []

  # Loop through all the files and directories at the top level of the clone
  for item in os.listdir(repository):
    # Check if the item is a file
    if os.path.isfile(os.path.join(repository, item)):
      # Check if the file has a .png extension
      if item.endswith(".png"):
        # Append the full raw URL to the list
        pictures.append(full_repo_path + item)

  return pictures
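
The function above only looks at the top level of the local clone and only matches .png files. A minimal sketch of a recursive variant follows, assuming screenshots may live in subdirectories and may use other image extensions. The name list_of_pictures_recursive, the local_dir parameter, and the IMAGE_EXTENSIONS tuple are illustrative and not part of the original script.

```python
import os

# Assumed set of image extensions worth indexing; adjust as needed
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")

def list_of_pictures_recursive(username, repository, branch, local_dir=None):
  # Default to a local clone in a directory named after the repository
  local_dir = local_dir or repository
  base_url = "https://raw.githubusercontent.com/{}/{}/{}/".format(username, repository, branch)
  pictures = []

  for root, dirs, files in os.walk(local_dir):
    # Skip Git's internal metadata directory
    dirs[:] = [d for d in dirs if d != ".git"]
    for name in files:
      if name.lower().endswith(IMAGE_EXTENSIONS):
        # Build the path relative to the clone, using "/" as the URL separator
        rel_path = os.path.relpath(os.path.join(root, name), local_dir)
        pictures.append(base_url + rel_path.replace(os.sep, "/"))

  return sorted(pictures)
```

The returned URLs are not percent-encoded, so file names containing spaces or special characters would need extra handling.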

Next we find all pictures in the main branch and index them:

from mixpeek import Mixpeek

mix = Mixpeek(mixpeek_key="MIXPEEK_API_KEY")

pictures = list_of_pictures(
  username="mixpeek",
  repository="use-cases",
  branch="main"
)

mix.index(pictures)

Now we can search for text from a button that has recently changed in our product.

mix.search("sync now")

The response lists each matching screenshot with an importance score:

[
  {
    "file_url": "https://raw.githubusercontent.com/mixpeek/use-cases/main/screenshot-1.png",
    "file_id": "63738f90829faf6a25053f61",
    "importance": "100%"
  },
  {
    "file_url": "https://raw.githubusercontent.com/mixpeek/use-cases/main/screenshot-2.png",
    "file_id": "63738f90829faf6a25053f62",
    "importance": "98%"
  },
  {
    "file_url": "https://raw.githubusercontent.com/mixpeek/use-cases/main/screenshot-3.png",
    "file_id": "63738f90829faf6a25053f63",
    "importance": "90%"
  }
]
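
To turn results like these into a to-do list, one option is to filter by score and print a checklist. The sketch below assumes the response is a list of dictionaries shaped exactly like the sample above, with importance given as a percentage string. The helper name flag_stale_screenshots and the 95% threshold are hypothetical choices, not part of the Mixpeek API.

```python
def flag_stale_screenshots(results, raw_prefix, min_importance=95.0):
  flagged = []
  for result in results:
    # "98%" -> 98.0 (assumes the percentage-string format shown above)
    score = float(result["importance"].rstrip("%"))
    if score >= min_importance:
      url = result["file_url"]
      # Strip the raw URL prefix to recover the repo-relative path
      path = url[len(raw_prefix):] if url.startswith(raw_prefix) else url
      flagged.append(path)
  return flagged

# Sample data copied from the search response above
prefix = "https://raw.githubusercontent.com/mixpeek/use-cases/main/"
sample_results = [
  {"file_url": prefix + "screenshot-1.png", "file_id": "63738f90829faf6a25053f61", "importance": "100%"},
  {"file_url": prefix + "screenshot-2.png", "file_id": "63738f90829faf6a25053f62", "importance": "98%"},
  {"file_url": prefix + "screenshot-3.png", "file_id": "63738f90829faf6a25053f63", "importance": "90%"},
]

for path in flag_stale_screenshots(sample_results, prefix):
  print("- [ ] Update {}".format(path))
```

With the sample data and a 95% threshold, this flags screenshot-1.png and screenshot-2.png and leaves screenshot-3.png out. The printed lines can be pasted into a GitHub issue as a task list.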
    

Now we can update each of these product screenshots. 😅 We've just saved hours of time and rescued our prized users from confusion.

Other Use Cases

  • Word Count: Analyze how often certain words appear across your repo.
  • Animations & Video: Repos often include GIFs, and they too can be searched and flagged for updating.
  • Security: Ensure nobody includes API keys or secrets in the product screenshots.
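
As a rough illustration of the word count idea, the sketch below counts phrases in the text files of a local clone using only the Python standard library. It does not use the Mixpeek API and does not read text inside images; the function name count_terms and the default file extensions are assumptions made for this example.

```python
import os
import re
from collections import Counter

def count_terms(local_dir, terms, extensions=(".md", ".txt")):
  # Start every term at zero so absent terms still appear in the result
  counts = Counter({term: 0 for term in terms})

  for root, dirs, files in os.walk(local_dir):
    # Skip Git's internal metadata directory
    dirs[:] = [d for d in dirs if d != ".git"]
    for name in files:
      if not name.lower().endswith(extensions):
        continue
      with open(os.path.join(root, name), encoding="utf-8", errors="ignore") as f:
        text = f.read().lower()
      for term in terms:
        # Case-insensitive count of literal occurrences of the term
        counts[term] += len(re.findall(re.escape(term.lower()), text))

  return counts
```

For example, count_terms("use-cases", ["sync now"]) would report how many times the old button label still appears in the documentation of a local clone named use-cases.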

About the author
Ethan Steininger

Former GTM Lead of MongoDB's NLP platform, Atlas Search. Occasionally off the grid in his self-converted camper van.
