# Florence-2: Vision foundation model (Microsoft)

🔥 [Microsoft](https://www.linkedin.com/company/microsoft/) drops Florence-2: a vision foundation model that slays! 🚀 All models are released on the [Hugging Face](https://www.linkedin.com/company/huggingface/) Hub. Learn more👉\
\
\- 230M & 770M param models crush specialists in captioning, detection & more 💪\
\- 230M model beats Flamingo 80B (~350x bigger!) in zero-shot 🤯\
\- Trained on FLD-5B: 5.4B annotations, 126M images 📊\
\- Fine-tuned: SOTA in captioning, VQA, referring expressions 🏆\
\- Excel in captioning, object detection, segmentation, VQA & more 🎨🔍❓\
\- Leverage multi-task learning on massive FLD-5B dataset 💡\
\- Beat larger models like PaLI, PaLI-X in specialist tasks 🥊\
\- Available in 230M & 770M param versions for all 🤗\
\
🌟 Florence-2 is clearly a unified vision representation powerhouse! 🦾\
🙌 Kudos to [Microsoft](https://www.linkedin.com/company/microsoft/) for advancing vision foundation models and 👏 for open-sourcing them!\
All models are on [Hugging Face](https://www.linkedin.com/company/huggingface/) Hub.
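
Want to try it? 🧪 Since the checkpoints live on the Hub, here is a minimal Python sketch of running one task on one image. It is not from the original post: it assumes the `microsoft/Florence-2-base` checkpoint (the ~230M model) and follows the call pattern shown on the public model card, which may need adjusting for your `transformers` version. The `task_prompt` and `run_florence2` helpers and the friendly task names are hypothetical names introduced here for illustration.

```python
# Sketch only: follows the call pattern on the public Florence-2 model card;
# details may differ across transformers versions.

MODEL_ID = "microsoft/Florence-2-base"  # assumed checkpoint (~230M params)

# Subset of the task prompt tokens listed on the model card.
# The dictionary keys are our own friendly names, not part of the model's API.
TASK_PROMPTS = {
    "caption": "<CAPTION>",
    "detailed_caption": "<DETAILED_CAPTION>",
    "object_detection": "<OD>",
}


def task_prompt(task: str) -> str:
    """Map a friendly task name to the Florence-2 prompt token."""
    if task not in TASK_PROMPTS:
        raise ValueError(f"unknown task {task!r}; choose from {sorted(TASK_PROMPTS)}")
    return TASK_PROMPTS[task]


def run_florence2(image_path: str, task: str = "caption"):
    """Run one task on one image. Downloads the weights on first use."""
    # Imported lazily so the helpers above work without these packages installed.
    from PIL import Image
    from transformers import AutoModelForCausalLM, AutoProcessor

    prompt = task_prompt(task)
    # The microsoft/* checkpoints ship their own model code, hence trust_remote_code.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)
    processor = AutoProcessor.from_pretrained(MODEL_ID, trust_remote_code=True)

    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=256,
        do_sample=False,
        num_beams=3,
    )
    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # Parse the raw token string into a structured result for this task.
    return processor.post_process_generation(
        raw, task=prompt, image_size=(image.width, image.height)
    )
```

Calling `run_florence2("photo.jpg", "object_detection")` (with `photo.jpg` standing in for any local image) should return the parsed result for the `<OD>` task. Swapping `MODEL_ID` for `microsoft/Florence-2-large` selects the ~770M model instead.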