smolagents and tools gallery: Browse tools and agents to use in smolagents
NVIDIA just dropped a gigantic multimodal model called NVLM 72B: nvidia/NVLM-D-72B. Paper page: NVLM: Open Frontier-Class Multimodal LLMs (2409.11402). The paper contains many ablation studies on various ways to use the LLM backbone: Flamingo-like cross-attention (NVLM-X), Llava-like concatenation of image and text embeddings to a decoder-only model (NVLM-D), and a hybrid architecture (NVLM-H). Checking evaluations, NVLM-D and NVLM-H are best or second best compared to other models. The released model is NVLM-D based on Qwen-2 Instruct, aligned with InternViT-6B using a huge mixture of different datasets. You can easily use this model by loading it through transformers' AutoModel.
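As a rough illustration of the AutoModel route mentioned in the post, here is a minimal sketch. The helper name `load_nvlm` and the keyword arguments (`torch_dtype`, `low_cpu_mem_usage`, `trust_remote_code`) are assumptions, not something the post specifies; in particular, it assumes the repository ships custom modeling code, which is why `trust_remote_code=True` is passed. Check the model card before relying on it.

```python
# Minimal sketch, not the author's exact recipe.
# MODEL_ID comes from the post; everything else is an assumption.
MODEL_ID = "nvidia/NVLM-D-72B"


def load_nvlm(model_id: str = MODEL_ID):
    """Load the model and tokenizer via transformers' Auto classes.

    Imports are done lazily so that defining this helper does not
    require torch or transformers to be installed.
    """
    import torch
    from transformers import AutoModel, AutoTokenizer

    model = AutoModel.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # assumed: half precision to cut memory use
        low_cpu_mem_usage=True,      # assumed: avoid a full extra copy in RAM
        trust_remote_code=True,      # assumed: the repo provides custom model code
    ).eval()
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    return model, tokenizer


# Usage (downloads a 72B-parameter checkpoint, so expect it to need
# several large GPUs or offloading):
# model, tokenizer = load_nvlm()
```

Nothing is downloaded until `load_nvlm()` is actually called, so the snippet can be pasted into a script and only triggered on a machine with enough memory.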
If you feel like you missed out on ECCV 2024, there's an app to browse the papers, rank them by popularity, and filter for open models, datasets, and demos. Get started at https://huggingface.co/spaces/ECCV/ECCV2024-papers
Open NotebookLM: Personalised Podcasts For All - Available in 13 Languages