CASA Collection CASA: Cross-Attention as Self-Attention for Efficient Vision-Language Fusion on long context streaming inputs • 6 items • Updated Dec 23, 2025 • 7