# External Mask Tracking with SAM 3

[![image](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/opengeos/segment-geospatial/blob/main/docs/examples/sam3_video_masks.ipynb)

This notebook demonstrates how to use external segmentation masks with SAM 3's tracking capability. This is useful when:

- SAM 3's text prompts don't segment the exact objects you need
- You have masks from another segmentation model (YOLO, Detectron2, GroundingDINO, etc.)
- You want to use a specialized model for initial segmentation and SAM 3 for tracking

## Workflow

1. **Initialize tracker** - Call `init_tracker()` with any text prompt to initialize SAM 3's tracker
2. **Add external masks** - Use `add_mask_prompt()` to add masks from your external model
3. **Propagate** - Use SAM 3's tracking to propagate masks through the video

## Installation

SAM 3 requires a CUDA-capable GPU. Install with:


In [None]:
# %pip install "segment-geospatial[samgeo3]"

## Import Libraries


In [None]:
import os
import numpy as np
from samgeo import SamGeo3Video, download_file

## Download Sample Video


In [None]:
url = "https://github.com/opengeos/datasets/releases/download/videos/cars.mp4"
video_path = download_file(url)

## Method 1: Using SAM 3's Text Prompt to Generate Initial Masks

In this example, we'll use SAM 3's own text prompt to generate masks, then demonstrate how to use those masks with the `add_mask_prompt()` method. In a real workflow, you would replace this with masks from your preferred external model.


In [None]:
# Initialize SAM 3 video predictor
sam = SamGeo3Video()

In [None]:
# Load the video
sam.set_video(video_path)

### Step 1: Generate initial masks using text prompt

This simulates getting masks from an external model. In practice, you would use your own segmentation model here.


In [None]:
# Generate masks using text prompt (simulating external model output)
sam.generate_masks("car")

In [None]:
# Show the segmentation results
sam.show_frame(0, axis="on")

![](https://github.com/user-attachments/assets/a66841cb-a1c5-476b-a865-0045c4ae821d)

### Step 2: Extract masks from frame 0

Extract the binary masks that we'll use to demonstrate the `add_mask_prompt()` method.


In [None]:
# Extract masks from the first frame.
# Note: _format_outputs() is a private method, so it may change between releases.
formatted_outputs = sam._format_outputs()
frame_0_masks = formatted_outputs.get(0, {})

# Store masks and their object IDs
external_masks = []
original_obj_ids = []
for obj_id, mask in frame_0_masks.items():
    # Keep only entries keyed by an integer object ID
    if isinstance(obj_id, int):
        external_masks.append(np.array(mask))
        original_obj_ids.append(obj_id)

print(f"Extracted {len(external_masks)} masks from frame 0")
print(f"Object IDs: {original_obj_ids}")
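In a real workflow, `external_masks` would come from another model rather than from SAM 3 itself. Many segmentation models emit a single integer label map instead of one mask per object. The sketch below assumes a label map where 0 is background and each positive integer marks one instance, and splits it into a list of binary `(H, W)` masks like the one built above. The helper name `label_map_to_masks` and the toy array are hypothetical and not part of samgeo.

```python
import numpy as np


def label_map_to_masks(label_map, background=0):
    """Split an integer label map into one boolean mask per instance.

    Hypothetical helper for illustration; not part of samgeo.
    """
    label_map = np.asarray(label_map)
    if label_map.ndim != 2:
        raise ValueError(f"expected a 2D label map, got shape {label_map.shape}")
    # np.unique returns the sorted labels present in the map
    labels = [int(v) for v in np.unique(label_map) if v != background]
    masks = [label_map == label for label in labels]
    return labels, masks


# Toy 4x5 label map with two instances (labels 1 and 2)
toy_label_map = np.array(
    [
        [0, 1, 1, 0, 0],
        [0, 1, 1, 0, 2],
        [0, 0, 0, 2, 2],
        [0, 0, 0, 2, 2],
    ]
)
toy_labels, toy_masks = label_map_to_masks(toy_label_map)
print(toy_labels)
print([int(m.sum()) for m in toy_masks])
```

The resulting list of masks can be used in place of `external_masks` in the steps that follow.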

### Step 3: Add external masks for tracking

Initialize the tracker first, then add the extracted masks with `add_mask_prompt()`. Use new object IDs (starting from 100) to avoid conflicts with the objects detected by the text prompt.

In [None]:
# Initialize the tracker before adding any mask prompts
sam.init_tracker()

In [None]:
# Add masks one by one using add_mask_prompt()
# Use obj_ids starting from 100 to avoid conflicts with detected objects
for i, mask in enumerate(external_masks):
    sam.add_mask_prompt(mask, obj_id=100 + i, frame_idx=0)
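Because the tracked objects receive new IDs, it can help to record which original ID each new ID came from. A minimal sketch, assuming the 100-based offset used above; `build_id_map` is a hypothetical helper, and the example IDs are made up for illustration:

```python
def build_id_map(original_ids, start_id=100):
    """Map each new tracking ID (start_id, start_id + 1, ...) to its original ID.

    Hypothetical helper for illustration; not part of samgeo.
    """
    return {start_id + i: orig for i, orig in enumerate(original_ids)}


# Made-up original IDs, standing in for `original_obj_ids` from Step 2
example_ids = [0, 1, 3]
id_map = build_id_map(example_ids)
print(id_map)
```

With such a mapping, results reported under the new IDs can be traced back to the objects found by the initial segmentation.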

In [None]:
# Propagate masks through the video
sam.propagate()

In [None]:
# Show results
sam.show_frame(0, axis="on")

![](https://github.com/user-attachments/assets/f5ce1a2c-06fd-4615-84d3-06c31f787b2d)

In [None]:
# Show multiple frames
sam.show_frames(frame_stride=20, ncols=3)

![](https://github.com/user-attachments/assets/2b6eee3b-f749-4d71-bc6c-e403659551d4)

## Method 2: Using add_masks_prompt() for Multiple Masks

For convenience, you can add multiple masks at once using `add_masks_prompt()`.

**Note**: When `obj_ids` is not provided, object IDs are auto-assigned starting from 100 to avoid conflicts with text-detected objects.


In [None]:
# Re-initialize the tracker before adding the masks again
sam.init_tracker()

In [None]:
# Add all masks at once (IDs will be auto-assigned starting from 100)
sam.add_masks_prompt(external_masks)

In [None]:
# Propagate
sam.propagate()

In [None]:
# Show results
sam.show_frames(frame_stride=20, ncols=3)

![](https://github.com/user-attachments/assets/7268195a-4405-45c3-81e1-79642954259f)

## Save Results


In [None]:
os.makedirs("output", exist_ok=True)

# Save mask images
sam.save_masks("output/external_masks")

In [None]:
# Save video with blended masks
sam.save_video("output/external_tracked.mp4", fps=25)

## Clean Up


In [None]:
# Close session to free GPU resources
sam.close()

In [None]:
# Shut down the predictor
sam.shutdown()

## Summary

The `add_mask_prompt()` and `add_masks_prompt()` methods allow you to:

1. **Use any segmentation model** - Not limited to SAM 3's text prompts
2. **Get accurate initial segmentation** - Use specialized models for your specific objects
3. **Leverage SAM 3's tracking** - Propagate the initial masks through the rest of the video
4. **Selective tracking** - Track only the objects whose masks you add

### Important Workflow Notes

1. **Initialize tracker first**: Always call `init_tracker()` before `add_mask_prompt()`
2. **Don't reset**: Do not call `reset()` before adding external masks
3. **Use unique IDs**: Use `obj_id >= 100` to avoid conflicts with text-detected objects
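The ordering rules above can be collected into one helper. This is a sketch, not part of samgeo: `track_external_masks` is a hypothetical function that only calls the methods used in this notebook (`init_tracker()`, `add_mask_prompt()`, `propagate()`), and `RecordingTracker` is a stand-in that records calls so the example runs without a GPU. In the notebook you would pass the real `sam` object instead.

```python
def track_external_masks(tracker, masks, start_id=100, frame_idx=0):
    """Initialize the tracker, add each mask under a unique ID, then propagate.

    Hypothetical helper; `tracker` is expected to expose the init_tracker(),
    add_mask_prompt() and propagate() methods used in this notebook.
    """
    tracker.init_tracker()  # must come before any add_mask_prompt() call
    obj_ids = []
    for i, mask in enumerate(masks):
        obj_id = start_id + i  # IDs >= 100 avoid clashes with text-detected objects
        tracker.add_mask_prompt(mask, obj_id=obj_id, frame_idx=frame_idx)
        obj_ids.append(obj_id)
    tracker.propagate()
    return obj_ids


class RecordingTracker:
    """Stand-in for SamGeo3Video that only records the order of calls."""

    def __init__(self):
        self.calls = []

    def init_tracker(self):
        self.calls.append("init_tracker")

    def add_mask_prompt(self, mask, obj_id, frame_idx=0):
        self.calls.append(f"add_mask_prompt(obj_id={obj_id}, frame_idx={frame_idx})")

    def propagate(self):
        self.calls.append("propagate")


demo = RecordingTracker()
demo_ids = track_external_masks(demo, masks=["mask_a", "mask_b"])
print(demo_ids)
print(demo.calls)
```

Keeping the three calls in one function makes it harder to add masks before the tracker has been initialized.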

### API Reference

- `add_mask_prompt(mask, obj_id, frame_idx=0, num_points=5)` - Add a single mask
- `add_masks_prompt(masks, obj_ids=None, frame_idx=0)` - Add multiple masks at once

### Mask Requirements

- Binary mask array of shape `(H, W)`
- Values: `True/1` for object, `False/0` for background
- Automatically resized to match video frame dimensions
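Outputs from other models often need a small cleanup step to meet these requirements, for example when a model returns a `(1, H, W)` probability map rather than a binary `(H, W)` array. A minimal sketch, assuming a 0.5 threshold for probability maps; `to_binary_mask` is a hypothetical helper and not part of samgeo:

```python
import numpy as np


def to_binary_mask(mask, threshold=0.5):
    """Convert a mask-like array to a boolean (H, W) array.

    Hypothetical helper for illustration; not part of samgeo.
    """
    arr = np.squeeze(np.asarray(mask))  # drop singleton axes such as (1, H, W)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2D mask after squeezing, got shape {arr.shape}")
    if arr.dtype == bool:
        return arr
    # Treat values above the threshold as object pixels, so 0/1 integer masks
    # and probability maps in [0, 1] are both handled
    return arr > threshold


prob_map = np.array([[[0.1, 0.9], [0.7, 0.2]]])  # shape (1, 2, 2)
binary = to_binary_mask(prob_map)
print(binary.shape, binary.dtype)
print(binary.tolist())
```

The threshold is a choice that depends on the external model; 0.5 is only a common default.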
