วันเสาร์ที่ 9 พฤษภาคม พ.ศ. 2569

Large Multimodal Model (LMM)

A Large Multimodal Model (LMM) represents the next evolution of AI beyond text-only Large Language Models (LLMs). While a traditional LLM is like a brilliant scholar who has only ever read books, an LMM is like that same scholar who can now also see, hear, and create.

At its core, an LMM is a single AI system capable of processing and generating information across multiple "modalities"—such as text, images, audio, and video—all within a unified framework.