openalex / 2023

Segment Anything

Alexander M. Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan‐Yen Lo, Piotr Dollár, Ross Girshick

Computer Vision, Foundation Models, Popular and Landmark Papers

We introduce the Segment Anything (SA) project: a new task, model, and dataset for image segmentation. Using our efficient model in a data collection loop, we built the largest segmentation dataset to date (by far), with over 1 billion masks on 11M licensed and privacy respecting images. The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks. We evaluate its capabilities on numerous tasks and find that its zero-shot performance is impressive – often competitive with or even superior to prior fully supervised results. We are releasing the Segment Anything Model (SAM) and corresponding dataset (SA-1B) of 1B masks and 11M images at segment-anything.com to foster research into foundation models for computer vision. We recommend reading the full paper at: arxiv.org/abs/2304.02643.

8,614 citations, 0 influential

### Paper attached as context
Title: Segment Anything
Authors: Alexander M. Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan‐Yen Lo, Piotr Dollár, Ross Girshick
Year: 2023
Venue: ICCV 2023
Categories: Computer Vision, Foundation Models, Popular and Landmark Papers
Citations: 8,614
Paper URL: https://arxiv.org/abs/2304.02643v1
Open PDF: https://arxiv.org/pdf/2304.02643v1

Learning resources:
- PDF: arXiv PDF (https://arxiv.org/pdf/2304.02643v1)
- arXiv: arXiv abstract page (https://arxiv.org/abs/2304.02643v1)
- Google Scholar: Google Scholar references (https://scholar.google.com/scholar?q=Segment%20Anything)
- OpenAlex: Paper page (https://doi.org/10.1109/iccv51070.2023.00371)
- Papers with Code: Papers with Code search (https://paperswithcode.com/search?q=Segment%20Anything)
- YouTube: YouTube explanations (https://www.youtube.com/results?search_query=Segment%20Anything+paper+explained)