lingbot-map
Library: Robbyant/lingbot-map
A feed-forward 3D foundation model for streaming scene reconstruction.
Overview
LingBot-Map is a feed-forward 3D foundation model that reconstructs scenes from streaming data. It uses a Geometric Context Transformer to unify coordinate grounding, dense geometric cues, and long-range drift correction in a single streaming framework. With paged KV cache attention, it runs at ~20 FPS at 518×378 resolution on long sequences exceeding 10,000 frames.
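The overview does not describe how LingBot-Map's cache is implemented, so the following is only a generic sketch of the paged KV cache idea: keys and values live in fixed-size pages drawn from a preallocated pool, and a page table records their logical order. That lets a streaming cache grow, or drop its oldest entries, without reallocating or copying. The class name `PagedKVCache`, its parameters `page_size` and `max_pages`, and the method `evict_oldest_page` are hypothetical and are not taken from the LingBot-Map code.

```python
class PagedKVCache:
    """Toy paged key/value cache (illustrative only, not LingBot-Map's API)."""

    def __init__(self, page_size, max_pages):
        self.page_size = page_size                      # tokens stored per page
        self.pages = [[] for _ in range(max_pages)]     # preallocated page pool
        # Free list ordered so that pop() hands out page 0 first.
        self.free_pages = list(range(max_pages - 1, -1, -1))
        self.page_table = []                            # page ids in logical order
        self.length = 0                                 # number of cached tokens

    def append(self, key, value):
        """Cache one token, claiming a new page only when the last one is full."""
        if self.length % self.page_size == 0:
            if not self.free_pages:
                raise MemoryError("page pool exhausted")
            self.page_table.append(self.free_pages.pop())
        self.pages[self.page_table[-1]].append((key, value))
        self.length += 1

    def get(self, index):
        """Look up the (key, value) pair at a logical token index."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        page_id = self.page_table[index // self.page_size]
        return self.pages[page_id][index % self.page_size]

    def evict_oldest_page(self):
        """Drop the oldest page and return it to the pool for reuse."""
        page_id = self.page_table.pop(0)
        self.length -= len(self.pages[page_id])
        self.pages[page_id].clear()
        self.free_pages.append(page_id)
```

In this sketch, eviction only edits the page table and the free list, so the remaining entries never move in memory. A real attention kernel would store tensors in each page and gather them through the page table, which this toy version leaves out.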
README Preview
# LingBot-Map: Geometric Context Transformer for Streaming 3D Reconstruction

Robbyant Team

[arXiv](https://arxiv.org/abs/2604.14141) | [Paper PDF](lingbot-map_paper.pdf) | [Project Page](https://technology.robbyant.com/lingbot-map) | [Hugging Face](https://huggingface.co/robbyant/lingbot-map) | [ModelScope](https://www.modelscope.cn/models/Robbyant/lingbot-map) | [License](LICENSE.txt)

https://github.com/user-attachments/assets/fe39e095-af2c-4ec9-b68d-a8ba97e505ab

-----

### 🗺️ Meet LingBot-Map! We've built a feed-forward 3D foundation model for streaming 3D reconstruction! 🏗️🌍

LingBot-Map focuses on:

- **Geometric Context Transformer**: Architecturally unifies coordinate grounding, dense geometric cues, and long-range drift correction within a single streaming framework through an anchor context, a pose-reference window, and a trajectory memory.
- **High-Efficiency Streaming Inference**: A feed-forward architecture with paged KV cache attention, enabling stable inference at ~20 FPS at 518×378 resolution over long sequences exceeding 10,000 frames.
- **State-of-the-Art Reconstruction**: Superior performance on diverse benchmarks compared to both existing streaming and iterative optimization-based approaches.

---

## 📑 Table of Contents

- [📰 News](#-news)
- [📋 TODO](#-todo)
- [⚙️ Installation](#️-installation)
- [📦 Model Download](#-model-download)
- [🚀 Quick Start](#-quick-start)
- [🎬 Interactive Demo (`demo.py`)](#-interactive-demo-demopy)
  - [Try the Example Scenes](#try-the-example-scenes)
  - [Streaming with Keyframe Interval](#streaming-with-keyframe-interval)
  - [Windowed Inference (for long sequences, >3000 frames)](#windowed-inference-for-long-sequences-3000-frames)
  - [Sky Masking](#sky-masking)
  - [Visualization Options](#visualization-options)
  - [Performance & Memory](#performance--memory)
- [🎥 Offline Rendering Pipeline (`demo_render/batch_demo.py`)](#-offline-rendering-pipeline-demo_renderbatch_demopy)
- [📜 License](#