bgless.video

Transparent video elements for apps, editing, and APIs.

Add a video, image, or prompt. Get alpha-ready assets, job history, and API access.

WebM alpha
Browser overlays
MOV alpha
Editing workflows
REST API
Batch pipelines

Remove video background

Recent generation jobs

Large previews of this account's latest alpha jobs.

SAM 3 video background remover

SAM 3 Background Remover — Open-Vocabulary Video Matting

Drive Meta SAM 3 with natural-language concept prompts to remove backgrounds from any video. MatAnyone refines edges to alpha-grade output suitable for transparent WebM, ProRes 4444, and PNG sequence delivery.

Start with a free preview
Open-vocabulary text prompts ("the person", "the red car") — no point-and-click
Native video mask propagation across frames — no per-frame drift
MatAnyone alpha refinement for hair, fur, and transparent edges
WebM VP9-alpha, MOV ProRes 4444, PNG sequence, GIF, WebP outputs
RunPod Serverless GPU dispatch with adaptive polling

Why SAM 3 changes video background removal

SAM 3 (Segment Anything Model 3) is Meta's promptable concept segmentation model: a single model that detects, segments, and tracks every instance of a prompted concept across video frames. SAM 2 already propagates masks through video, but it accepts only visual prompts (points, boxes, masks), so text-driven pipelines have to pair it with a separate detector such as GroundingDINO. SAM 3 understands phrases like “every red baseball cap” directly and emits temporally consistent masks for every matching instance. Combined with MatAnyone for alpha refinement, the result is broadcast-quality transparent video without a green screen.

Open-vocabulary text prompts replace bounding boxes and clicks
Concept-level instance segmentation in a single model
Native video tracking eliminates flicker between frames
Released November 2025 under the SAM 3 license (research + commercial with restrictions)

How our SAM 3 background remover works

Upload a video and, optionally, type a concept prompt; the pipeline handles the rest. The source video is streamed to a Cloudflare R2 staging bucket and dispatched to a RunPod Serverless GPU endpoint that runs SAM 3 in video-predict mode. MatAnyone then refines the masks frame by frame for temporally stable alpha edges, and FFmpeg composites and encodes the result into the format you chose. Most 30-second 1080p clips finish in under 90 seconds on an RTX A5000.

Source upload → Cloudflare R2 (presigned PUT)
SAM 3 inference on RunPod Serverless (A5000 / L40S)
MatAnyone trimap-based alpha refinement
FFmpeg encode (WebM VP9-alpha or ProRes 4444 with audio passthrough)
Output URL served from cdn.bgless.video
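The dispatch step is described as using adaptive polling, but the page does not say how. One common approach is to lengthen the wait between status checks as a job runs. The sketch below assumes that approach. The function names, the interval values, and the injected `get_status` callable are all hypothetical, and the status strings are assumed to follow RunPod's job states.

```python
import time
from typing import Callable, Iterator

# Assumed terminal job states; adjust to whatever the endpoint actually reports.
TERMINAL_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMED_OUT"}


def backoff_delays(initial: float = 1.0, factor: float = 1.5,
                   cap: float = 10.0) -> Iterator[float]:
    """Yield poll intervals that grow geometrically until they reach a cap."""
    delay = initial
    while True:
        yield min(delay, cap)
        delay *= factor


def wait_for_job(get_status: Callable[[], str], timeout: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status until it returns a terminal state or the timeout would be exceeded."""
    waited = 0.0
    for delay in backoff_delays():
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited + delay > timeout:
            raise TimeoutError(f"job still {status} after {waited:.1f}s")
        sleep(delay)
        waited += delay
    raise AssertionError("unreachable: backoff_delays is infinite")
```

Short jobs get checked quickly, while long jobs settle at one request every ten seconds. Passing `sleep` as a parameter keeps the loop testable without real waiting.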

Model tiers and pricing

Choose the SAM 3 variant that fits your speed and quality budget. All tiers include MatAnyone refinement, audio passthrough, and unlimited preview generation.

sam3-tiny — fastest preview path, 45 credits/min, ideal for thumbnails
sam3-base — balanced quality, 60 credits/min, default consumer tier
sam3-pro — SAM 3 Large + MatAnyone refinement, 180 credits/min, the production tier with text prompt support
sam3-human — human-class prior, 45 credits/min, optimized for face cam and portrait video
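To compare tiers for a given clip, the per-minute rates above can be turned into a rough estimate. This sketch assumes credits are billed in proportion to clip duration and rounded up to a whole credit. The page does not state the rounding rule, so treat the function as illustrative only.

```python
import math

# Rates copied from the tier list above (credits per minute of video).
CREDITS_PER_MINUTE = {
    "sam3-tiny": 45,
    "sam3-base": 60,
    "sam3-pro": 180,
    "sam3-human": 45,
}


def estimate_credits(tier: str, duration_seconds: float) -> int:
    """Estimate credits for a clip, assuming pro-rata billing rounded up."""
    if tier not in CREDITS_PER_MINUTE:
        raise ValueError(f"unknown tier: {tier}")
    if duration_seconds < 0:
        raise ValueError("duration must be non-negative")
    return math.ceil(CREDITS_PER_MINUTE[tier] * duration_seconds / 60)


print(estimate_credits("sam3-pro", 30))   # 30 s at 180 credits/min
print(estimate_credits("sam3-base", 90))  # 90 s at 60 credits/min
```

Under these assumptions a 30-second clip costs 90 credits on sam3-pro, and a 90-second clip costs 90 credits on sam3-base.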

Prompt modes

The service exposes four prompt modes. Pick whichever matches your UX.

auto — no prompt, segments the dominant foreground (simplest UX)
text — natural language concept ("the dog", "the surfboard")
box — normalized bounding box on the first frame (precise control)
point — positive/negative click hints on the first frame (corrects edge cases)
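The four modes could be validated on the client before a job is submitted. The payload shape here is hypothetical, because the page does not document field names. The sketch assumes a `mode` key, box coordinates as `[x0, y0, x1, y1]` normalized to the 0..1 range, and points as `(x, y, label)` with label 1 for positive and 0 for negative clicks.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

Point = Tuple[float, float, int]  # (x, y, label); label 1 = positive, 0 = negative


def _in_unit_range(*values: float) -> bool:
    """Return True when every value lies in the normalized 0..1 range."""
    return all(0.0 <= v <= 1.0 for v in values)


def build_prompt(mode: str, text: Optional[str] = None,
                 box: Optional[Sequence[float]] = None,
                 points: Optional[Sequence[Point]] = None) -> Dict[str, Any]:
    """Validate the inputs for one prompt mode and return a hypothetical payload."""
    if mode == "auto":
        return {"mode": "auto"}
    if mode == "text":
        if not text or not text.strip():
            raise ValueError("text mode needs a non-empty concept phrase")
        return {"mode": "text", "text": text.strip()}
    if mode == "box":
        if box is None or len(box) != 4 or not _in_unit_range(*box):
            raise ValueError("box mode needs four values in the 0..1 range")
        x0, y0, x1, y1 = box
        if x0 >= x1 or y0 >= y1:
            raise ValueError("box must satisfy x0 < x1 and y0 < y1")
        return {"mode": "box", "box": [x0, y0, x1, y1]}
    if mode == "point":
        if not points:
            raise ValueError("point mode needs at least one click")
        for x, y, label in points:
            if not _in_unit_range(x, y) or label not in (0, 1):
                raise ValueError("points need 0..1 coordinates and a 0/1 label")
        return {"mode": "point", "points": [list(p) for p in points]}
    raise ValueError(f"unknown prompt mode: {mode}")
```

For example, `build_prompt("text", text="the dog")` returns `{"mode": "text", "text": "the dog"}`, while a box with swapped corners raises `ValueError` before any credits are spent.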

Output formats and delivery

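Video outputs preserve the source frame rate, dimensions (subject to your max-dimension cap), and audio track; PNG sequences, GIF, and WebP carry no audio. WebM, MOV, PNG, and WebP carry a full alpha channel, while GIF supports only 1-bit transparency. MP4 (H.264) has no alpha channel, so the subject is composited over a color, image, or video background instead.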

WebM (VP9 + alpha) — browser-native transparent video
MOV (ProRes 4444) — Adobe Premiere / DaVinci Resolve workflows
PNG sequence (ZIP) — frame-perfect compositing in After Effects
GIF / WebP — social and lightweight web embeds
MP4 (H.264) — universal playback when alpha is not required
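For readers who want to reproduce the two transparent video formats locally, the sketch below builds FFmpeg argument lists from RGBA PNG frames plus the original clip's audio. The encoder options (`libvpx-vp9` with `yuva420p`, `prores_ks` with profile `4444` and `yuva444p10le`) are standard FFmpeg settings for alpha output. The helper name, the file layout, and the choice to re-encode WebM audio to Opus are assumptions and not necessarily what the service runs. Opus is chosen because WebM accepts only Vorbis or Opus audio.

```python
from typing import List


def ffmpeg_alpha_args(frames_pattern: str, audio_source: str,
                      fps: float, output_path: str) -> List[str]:
    """Build an ffmpeg argument list for a transparent WebM or MOV encode."""
    args = [
        "ffmpeg", "-y",
        "-framerate", str(fps), "-i", frames_pattern,  # RGBA PNG frames
        "-i", audio_source,                            # original clip, used for audio only
        "-map", "0:v", "-map", "1:a?",                 # "?" tolerates a clip with no audio
    ]
    if output_path.endswith(".webm"):
        # VP9 stores alpha when the pixel format includes an alpha plane.
        args += ["-c:v", "libvpx-vp9", "-pix_fmt", "yuva420p", "-c:a", "libopus"]
    elif output_path.endswith(".mov"):
        # ProRes 4444 with a 10-bit alpha-capable pixel format; audio is stream-copied.
        args += ["-c:v", "prores_ks", "-profile:v", "4444",
                 "-pix_fmt", "yuva444p10le", "-c:a", "copy"]
    else:
        raise ValueError("only .webm and .mov are handled in this sketch")
    return args + [output_path]


print(" ".join(ffmpeg_alpha_args("frames/%05d.png", "source.mp4", 30, "out.webm")))
```

The list can be passed to `subprocess.run(args, check=True)` on a machine with FFmpeg installed.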

Questions before you create

Do I need a green screen to use the SAM 3 background remover?

No. SAM 3 detects and segments subjects directly from any footage. Green screen is supported but never required.

How does SAM 3 differ from SAM 2 or Grounded-SAM 2?

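SAM 3 is a single model that handles both concept-level open-vocabulary detection and video mask propagation. SAM 2 propagates masks across video on its own but accepts only visual prompts (points, boxes, masks), so text prompts require a separate detector such as GroundingDINO. Folding both steps into one model removes the detector hand-off, keeps masks more consistent, and lets SAM 3 accept richer prompts.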

What does the text prompt support look like?

Type a noun phrase such as "the person", "the basketball", or "every red car". SAM 3 returns masks for every instance matching the concept. Negative phrases like "not the shadow" further constrain the result.

Can I get transparent video output?

Yes. Choose WebM (VP9 with alpha) for browser playback, MOV (ProRes 4444) for editing software, or a PNG sequence ZIP for frame-perfect compositing.

How long does processing take?

On an RTX A5000, a 30-second 1080p clip typically completes in 60–90 seconds end-to-end. Longer or 4K clips scale roughly linearly.
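The quoted figure works out to roughly 2 to 3 seconds of processing per second of 1080p footage. As a back-of-the-envelope sketch, assuming the scaling is linear in clip duration only, an estimate for longer clips looks like this. The page gives no factor for resolution, so 4K is not modeled.

```python
from typing import Tuple

# Derived from the quoted range: 60 to 90 s of processing for a 30 s 1080p clip.
LOW_RATIO = 60 / 30
HIGH_RATIO = 90 / 30


def estimate_processing_seconds(clip_seconds: float) -> Tuple[float, float]:
    """Return a (low, high) end-to-end estimate for a 1080p clip on an RTX A5000."""
    return clip_seconds * LOW_RATIO, clip_seconds * HIGH_RATIO


low, high = estimate_processing_seconds(120)
print(f"2-minute clip: about {low:.0f} to {high:.0f} seconds")
```

For a 2-minute clip this gives about 240 to 360 seconds, or 4 to 6 minutes.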

Is the API publicly available?

Yes. Send a POST request to /v1/jobs with your key in the X-Api-Key header. See the API page for the full SDK and webhook documentation.
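A job submission might be assembled as follows. Only the `/v1/jobs` path and the `X-Api-Key` header come from this page. The base URL is a placeholder, and the JSON field names (`source_url`, `model`, `output_format`, `prompt`) are hypothetical stand-ins until you check the API page.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

API_BASE = "https://api.example.com"  # placeholder host; replace with the documented one


def build_job_request(api_key: str, source_url: str,
                      model: str = "sam3-base", output_format: str = "webm",
                      prompt: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/jobs request with a hypothetical JSON body."""
    body: Dict[str, Any] = {
        "source_url": source_url,        # hypothetical field name
        "model": model,                  # tier name from the pricing list
        "output_format": output_format,  # hypothetical field name
    }
    if prompt is not None:
        body["prompt"] = prompt          # hypothetical field name
    return urllib.request.Request(
        f"{API_BASE}/v1/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers={"X-Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_job_request("YOUR_API_KEY", "https://example.com/clip.mp4")
print(request.get_method(), request.full_url)
```

Calling `urllib.request.urlopen(request)` would send it. Building the request separately makes it easy to inspect the headers and body first.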

Try SAM 3 video background removal now

Upload a clip, type an optional concept prompt, and preview a 2-second alpha result before any credits are spent.
