Commit

Add SAM2 video notebook

giswqs committed Sep 16, 2024
1 parent 0353f00 commit 8ae9208
Showing 2 changed files with 252 additions and 0 deletions.
5 changes: 5 additions & 0 deletions .gitignore
@@ -14,6 +14,10 @@ private/
# **/*.png
**/*.csv
**/*.pt
*.mp4
docs/examples/segments/
docs/examples/blended/

docs/examples/*.geojson

# C extensions
@@ -119,3 +123,4 @@ ENV/
docs/examples/segment.cpg
docs/examples/segment.prj
docs/changelog_update.md
docs/examples/landsat_ts.zip
247 changes: 247 additions & 0 deletions docs/examples/sam2_video.ipynb
@@ -0,0 +1,247 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Segmenting objects from timeseries images with SAM 2\n",
"\n",
"[![image](https://studiolab.sagemaker.aws/studiolab.svg)](https://studiolab.sagemaker.aws/import/github/opengeos/segment-geospatial/blob/main/docs/examples/sam2_video.ipynb)\n",
"[![image](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/opengeos/segment-geospatial/blob/main/docs/examples/sam2_video.ipynb)\n",
"\n",
"This notebook shows how to segment objects from timeseries with the Segment Anything Model 2 (SAM 2). \n",
"\n",
"Make sure you use GPU runtime for this notebook. For Google Colab, go to `Runtime` -> `Change runtime type` and select `GPU` as the hardware accelerator. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Install dependencies\n",
"\n",
"Uncomment and run the following cell to install the required dependencies."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# %pip install -U segment-geospatial"
]
},
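{
"cell_type": "markdown",
"metadata": {},
"source": [
"Optionally, verify that a GPU is available before proceeding (a minimal check using PyTorch, which SAM 2 is built on):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Optional sanity check: SAM 2 runs much faster on a GPU.\n",
"import torch\n",
"\n",
"print(\"CUDA available:\", torch.cuda.is_available())"
]
},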
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import libraries"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import leafmap\n",
"from samgeo import SamGeo2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download sample data\n",
"\n",
"For now, SamGeo2 supports remote sensing data in the form of RGB images, 8-bit integer. Make sure all images are in the same width and height."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"url = \"https://github.com/opengeos/datasets/releases/download/raster/landsat_ts.zip\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"leafmap.download_file(url)"
]
},
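{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick sanity check, confirm that all images share the same dimensions (a minimal sketch, assuming the zip extracts to a `landsat_ts` folder of GeoTIFFs, using rasterio):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sanity check (assumes the zip extracted to a \"landsat_ts\" folder of GeoTIFFs).\n",
"import glob\n",
"\n",
"import rasterio\n",
"\n",
"for path in sorted(glob.glob(\"landsat_ts/*.tif\")):\n",
"    with rasterio.open(path) as src:\n",
"        print(path, src.width, src.height, src.count, src.dtypes[0])"
]
},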
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialize the model"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictor = SamGeo2(\n",
" model_id=\"sam2-hiera-large\",\n",
" video=True,\n",
")"
]
},
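{
"cell_type": "markdown",
"metadata": {},
"source": [
"If GPU memory is limited, a smaller checkpoint may be used instead (a sketch, assuming samgeo accepts the standard SAM 2 model IDs such as `sam2-hiera-small`):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical alternative: trade some accuracy for speed and memory.\n",
"# predictor = SamGeo2(model_id=\"sam2-hiera-small\", video=True)"
]
},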
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Specify the input data\n",
"\n",
"Point to the directory containing the images or the video file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"video_path = \"landsat_ts\"\n",
"predictor.set_video(video_path)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Specify the input prompts\n",
"\n",
"The prompts can be points and boxes. The points are represented as a list of tuples, where each tuple contains the x and y coordinates of the point. The boxes are represented as a list of tuples, where each tuple contains the x, y, width, and height of the box."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompts = {\n",
" 1: {\n",
" \"points\": [[1582, 933], [1287, 905], [1473, 998]],\n",
" \"labels\": [1, 1, 1],\n",
" \"frame_idx\": 0,\n",
" },\n",
"}"
]
},
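{
"cell_type": "markdown",
"metadata": {},
"source": [
"The prompts dictionary is keyed by object ID, so several objects can be tracked in one pass. Below is a sketch with a hypothetical second object; its coordinates are placeholders for illustration, not real features in this scene."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical example: prompts for two objects, keyed by object ID.\n",
"# The coordinates for object 2 are placeholders, not real features.\n",
"multi_prompts = {\n",
"    1: {\n",
"        \"points\": [[1582, 933], [1287, 905]],\n",
"        \"labels\": [1, 1],\n",
"        \"frame_idx\": 0,\n",
"    },\n",
"    2: {\n",
"        \"points\": [[800, 600]],\n",
"        \"labels\": [1],\n",
"        \"frame_idx\": 0,\n",
"    },\n",
"}\n",
"# predictor.show_prompts(multi_prompts, frame_idx=0)"
]
},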
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictor.show_prompts(prompts, frame_idx=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Althernatively, prompts can be provided in lon/lat coordinates. The model will automatically convert the lon/lat coordinates to pixel coordinates when the `point_crs` parameter is set to the coordinate reference system of the lon/lat coordinates."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"prompts = {\n",
" 1: {\n",
" \"points\": [[-74.3713, -8.5218], [-74.2973, -8.5306], [-74.3230, -8.5495]],\n",
" \"labels\": [1, 1, 1],\n",
" \"frame_idx\": 0,\n",
" },\n",
"}\n",
"predictor.show_prompts(prompts, frame_idx=0, point_crs=\"EPSG:4326\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Segment the objects"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictor.predict_video()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Save results\n",
"\n",
"To save the results as gray-scale GeoTIFFs with the same georeference as the input images:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictor.save_video_segments(\"segments\")"
]
},
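{
"cell_type": "markdown",
"metadata": {},
"source": [
"To inspect one of the saved masks, open it with rasterio (a minimal sketch, assuming the outputs are GeoTIFFs written to the `segments` folder):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Inspect the first saved mask (assumes GeoTIFF outputs in \"segments\").\n",
"import glob\n",
"\n",
"import rasterio\n",
"\n",
"mask_path = sorted(glob.glob(\"segments/*.tif\"))[0]\n",
"with rasterio.open(mask_path) as src:\n",
"    mask = src.read(1)\n",
"print(mask_path, mask.shape, mask.dtype, mask.max())"
]
},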
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To save the results as blended images and MP4 video:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"predictor.save_video_segments_blended(\n",
" \"blended\", fps=5, output_video=\"segments_blended.mp4\"\n",
")"
]
},
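{
"cell_type": "markdown",
"metadata": {},
"source": [
"To preview the resulting video inline (assuming the MP4 is written to the notebook's working directory):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import Video\n",
"\n",
"Video(\"segments_blended.mp4\")"
]
}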
],
"metadata": {
"kernelspec": {
"display_name": "sam",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.8"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
