MuSEAgent: A Multimodal Reasoning Agent with Stateful Experiences
Abstracts interaction data into atomic decision experiences via hindsight reasoning, then retrieves them at inference through policy-driven wide- and deep-search strategies, improving fine-grained visual perception and multimodal reasoning over trajectory-level retrieval baselines.