External Mask Tracking with SAM 3

This notebook demonstrates how to use external segmentation masks with SAM 3's tracking capability. This is useful when:

SAM 3's text prompts don't segment the exact objects you need
You have masks from another segmentation model (YOLO, Detectron2, GroundingDINO, etc.)
You want to use a specialized model for initial segmentation and SAM 3 for tracking

Workflow

Initialize tracker - Call init_tracker() with any text prompt to initialize SAM 3's tracker
Add external masks - Use add_mask_prompt() to add masks from your external model
Propagate - Use SAM 3's tracking to propagate masks through the video
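
The three steps above can be wrapped in one helper. The following is a minimal sketch, not part of samgeo: `track_external_masks` and its `start_id` parameter are hypothetical names, and `sam` is assumed to be a `SamGeo3Video` instance on which `set_video()` has already been called. It only calls the methods used later in this notebook.

```python
import numpy as np


def track_external_masks(sam, masks, start_id=100):
    """Hypothetical helper: run the three workflow steps on a loaded video.

    `sam` is assumed to be a SamGeo3Video-like object with a video already
    set. Returns the object IDs that were assigned to the masks.
    """
    masks = [np.asarray(m).astype(bool) for m in masks]
    if not masks:
        raise ValueError("at least one mask is required")

    # 1. Initialize the tracker before adding any mask prompt.
    sam.init_tracker()

    # 2. Add each external mask on frame 0 under its own object ID.
    obj_ids = []
    for offset, mask in enumerate(masks):
        obj_id = start_id + offset
        sam.add_mask_prompt(mask, obj_id=obj_id, frame_idx=0)
        obj_ids.append(obj_id)

    # 3. Propagate the masks through the remaining frames.
    sam.propagate()
    return obj_ids
```

Passing the tracker in as an argument keeps the helper independent of how the model and video were loaded.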

Installation

SAM 3 requires a CUDA-capable GPU. Install with:

# %pip install "segment-geospatial[samgeo3]"

Import Libraries

import os
import numpy as np
from samgeo import SamGeo3Video, download_file

Download Sample Video

url = "https://github.com/opengeos/datasets/releases/download/videos/cars.mp4"
video_path = download_file(url)

Method 1: Using SAM 3's Text Prompt to Generate Initial Masks

In this example, we'll use SAM 3's own text prompt to generate masks, then demonstrate how to use those masks with the add_mask_prompt() method. In a real workflow, you would replace this with masks from your preferred external model.

# Initialize SAM 3 video predictor
sam = SamGeo3Video()

# Load the video
sam.set_video(video_path)

Step 1: Generate initial masks using text prompt

This simulates getting masks from an external model. In practice, you would use your own segmentation model here.
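
Many external models return a single integer label map rather than one mask per object. As a rough sketch of how such output could be turned into the list of binary masks used in the following steps, the hypothetical helper `label_map_to_masks` below splits a label map into one boolean mask per instance. It assumes that label 0 is background, which is a common convention but not guaranteed for every model.

```python
import numpy as np


def label_map_to_masks(label_map, background=0):
    """Split an integer instance-label map into one boolean mask per instance.

    Hypothetical helper for illustration; assumes `background` marks pixels
    that belong to no object.
    """
    label_map = np.asarray(label_map)
    if label_map.ndim != 2:
        raise ValueError(f"expected a 2-D label map, got shape {label_map.shape}")
    # np.unique returns the labels in ascending order.
    instance_ids = [int(v) for v in np.unique(label_map) if v != background]
    return [label_map == instance_id for instance_id in instance_ids]


# Toy 4x4 label map with two instances (labels 1 and 2).
toy_labels = np.array([
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
])
toy_masks = label_map_to_masks(toy_labels)
print(len(toy_masks), toy_masks[0].shape)
```

Each returned array has shape (H, W) and dtype bool, which matches the mask requirements listed at the end of this notebook.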

# Generate masks using text prompt (simulating external model output)
sam.generate_masks("car")

# Show the segmentation results
sam.show_frame(0, axis="on")

Step 2: Extract masks from frame 0

Extract the binary masks that we'll use to demonstrate the add_mask_prompt() method.

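# Extract masks from the first frame.
# Note: _format_outputs() is a private method (leading underscore), so it may change between releases.
formatted_outputs = sam._format_outputs()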
frame_0_masks = formatted_outputs.get(0, {})

# Store masks and their object IDs
external_masks = []
original_obj_ids = []
for obj_id, mask in frame_0_masks.items():
    if isinstance(obj_id, int):
        external_masks.append(np.array(mask))
        original_obj_ids.append(obj_id)

print(f"Extracted {len(external_masks)} masks from frame 0")
print(f"Object IDs: {original_obj_ids}")

Step 3: Add external masks for tracking

Now add the extracted masks with add_mask_prompt(). Use new object IDs (starting from 100) so they do not collide with the IDs of the objects detected by the text prompt.

First, let's initialize the tracker.

sam.init_tracker()

# Add masks one by one using add_mask_prompt()
# Use obj_ids starting from 100 to avoid conflicts with detected objects
for i, mask in enumerate(external_masks):
    sam.add_mask_prompt(mask, obj_id=100 + i, frame_idx=0)

# Propagate masks through the video
sam.propagate()

# Show results
sam.show_frame(0, axis="on")
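
Besides inspecting the frame visually, one way to sanity-check the result is to compare each tracked mask on frame 0 with the mask that was passed in, using intersection over union (IoU). `mask_iou` below is a hypothetical helper written for illustration. It assumes both masks already have the same shape and treats two empty masks as identical.

```python
import numpy as np


def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks of the same shape."""
    a = np.asarray(mask_a).astype(bool)
    b = np.asarray(mask_b).astype(bool)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    union = np.logical_or(a, b).sum()
    if union == 0:
        # Both masks are empty; treat them as identical.
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


# Toy example: two 2x2 squares that overlap in one column.
square_a = np.zeros((4, 4), dtype=bool)
square_a[:2, :2] = True
square_b = np.zeros((4, 4), dtype=bool)
square_b[:2, 1:3] = True
print(mask_iou(square_a, square_b))
```

A value close to 1 for every object suggests the tracker reproduced the input mask on the prompt frame; a low value is a hint to look at that object more closely.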

# Show multiple frames
sam.show_frames(frame_stride=20, ncols=3)

Method 2: Using add_masks_prompt() for Multiple Masks

For convenience, you can add multiple masks at once using add_masks_prompt().

Note: Object IDs are auto-assigned starting from 100 to avoid conflicts.

sam.init_tracker()

# Add all masks at once (IDs will be auto-assigned starting from 100)
sam.add_masks_prompt(external_masks)

# Propagate
sam.propagate()

# Show results
sam.show_frames(frame_stride=20, ncols=3)

Save Results

os.makedirs("output", exist_ok=True)

# Save mask images
sam.save_masks("output/external_masks")

# Save video with blended masks
sam.save_video("output/external_tracked.mp4", fps=25)

Clean Up

# Close session to free GPU resources
sam.close()

sam.shutdown()

Summary

The add_mask_prompt() and add_masks_prompt() methods allow you to:

Use any segmentation model - Not limited to SAM 3's text prompts
Get accurate initial segmentation - Use specialized models for your specific objects
Leverage SAM 3's tracking - Powerful temporal tracking through the video
Selective tracking - Choose which objects to track
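
Selective tracking can be as simple as filtering the mask list before it is handed to the tracker. The sketch below keeps only masks above a minimum pixel area. `select_masks_by_area` and `min_area` are hypothetical names chosen for this example, and area is just one possible criterion.

```python
import numpy as np


def select_masks_by_area(masks, min_area):
    """Return (index, mask) pairs whose foreground pixel count is >= min_area.

    Hypothetical helper for illustration; the index refers to the position
    of the mask in the input list.
    """
    selected = []
    for index, mask in enumerate(masks):
        mask = np.asarray(mask).astype(bool)
        if int(mask.sum()) >= min_area:
            selected.append((index, mask))
    return selected


# Toy example: one 1-pixel mask and one 9-pixel mask.
tiny = np.zeros((5, 5), dtype=bool)
tiny[0, 0] = True
large = np.zeros((5, 5), dtype=bool)
large[1:4, 1:4] = True
kept = select_masks_by_area([tiny, large], min_area=4)
print([index for index, _ in kept])
```

Returning the original index alongside each mask makes it possible to map tracked objects back to the external model's detections.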

Important Workflow Notes

Initialize tracker first: Always call init_tracker() before add_mask_prompt()
Don't reset: Do not call reset() before adding external masks
Use unique IDs: Use obj_id >= 100 to avoid conflicts with text-detected objects
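
To illustrate the unique-ID note, the sketch below picks IDs starting at 100 and skips any value that is already in use. `assign_object_ids` is a hypothetical helper, not a samgeo function, and `used_ids` is assumed to be whatever collection of IDs you already know about, for example the IDs extracted in Step 2.

```python
def assign_object_ids(num_masks, used_ids=(), start=100):
    """Return `num_masks` integer IDs >= start that are not in `used_ids`.

    Hypothetical helper for illustration only.
    """
    used = set(used_ids)
    assigned = []
    candidate = start
    while len(assigned) < num_masks:
        if candidate not in used:
            assigned.append(candidate)
        candidate += 1
    return assigned


# Example: 101 is already taken, so it is skipped.
print(assign_object_ids(3, used_ids=[0, 1, 101]))
```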

API Reference

add_mask_prompt(mask, obj_id, frame_idx=0, num_points=5) - Add a single mask
add_masks_prompt(masks, obj_ids=None, frame_idx=0) - Add multiple masks at once

Mask Requirements

Binary mask array of shape (H, W)
Values: True/1 for object, False/0 for background
Automatically resized to match video frame dimensions
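
External models do not always return masks in exactly this form. The sketch below shows one way to coerce common outputs (a probability map, a 0/255 image, or an array with extra singleton axes) into a boolean (H, W) array before passing it on. `to_binary_mask` and its `threshold` default of 0.5 are assumptions made for this example, so adjust them to your model's output.

```python
import numpy as np


def to_binary_mask(mask, threshold=0.5):
    """Coerce a mask-like array to a boolean array of shape (H, W).

    Hypothetical helper for illustration. Non-boolean inputs are
    thresholded: values strictly above `threshold` count as object.
    """
    mask = np.asarray(mask)
    if mask.ndim > 2:
        # Drop singleton axes such as (1, H, W) or (H, W, 1).
        mask = np.squeeze(mask)
    if mask.ndim != 2:
        raise ValueError(f"expected a single (H, W) mask, got shape {mask.shape}")
    if mask.dtype == bool:
        return mask
    return mask > threshold


# Example: a (1, 2, 2) probability map becomes a (2, 2) boolean mask.
probabilities = np.array([[[0.1, 0.9], [0.6, 0.4]]])
print(to_binary_mask(probabilities).shape)
```

Multi-channel inputs such as (H, W, 3) are rejected on purpose, since it is not clear which channel should define the object.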