Alibaba unveils new open-source multimodal AI model: All the details

Part of the Qwen series, this 7-billion parameter model is optimized for deployment on edge devices like smartphones and laptops, said the China-based tech giant.

March 27, 2025 / 13:08 IST
Alibaba has introduced Qwen2.5-Omni-7B, a unified multimodal AI model designed to process and generate text, images, audio, and video. Part of the Qwen series, this 7-billion parameter model is optimized for deployment on edge devices like smartphones and laptops, said the China-based tech giant.

Despite its compact size, Qwen2.5-Omni-7B delivers strong multimodal capabilities, making it suitable for various applications, including real-time voice assistance and intelligent customer service interactions. It can assist visually impaired users by providing real-time audio descriptions, analyse cooking videos for step-by-step guidance, and enhance interactive AI conversations. “This unique combination makes it the perfect foundation for developing agile, cost-effective AI agents that deliver tangible value, especially intelligent voice applications,” said Alibaba.

The model is now open-sourced on Hugging Face and GitHub, with additional access via Qwen Chat and Alibaba Cloud’s ModelScope, the company said. It added that Alibaba Cloud has previously open-sourced over 200 generative AI models.

Key features of the model

Qwen2.5-Omni-7B introduces an innovative architecture that improves multimodal performance. Its key features include a Thinker-Talker architecture, which separates text generation from speech synthesis to enhance output quality, and block-wise streaming processing, which reduces latency for faster voice interactions.
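
To make the two ideas concrete, here is a toy Python sketch. It is not Alibaba's implementation and uses no Qwen code: the function names (split_into_blocks, thinker, talker, run_pipeline) and the "mean of a block" stand-in for model inference are all hypothetical, chosen only for illustration. It shows input being cut into fixed-size blocks and passed through two separate stages, one producing text and one producing speech, so that output can begin before the whole input has arrived.

```python
from typing import Iterable, Iterator, List


def split_into_blocks(samples: List[float], block_size: int) -> Iterator[List[float]]:
    """Yield consecutive fixed-size blocks; the last block may be shorter."""
    for start in range(0, len(samples), block_size):
        yield samples[start:start + block_size]


def thinker(blocks: Iterable[List[float]]) -> Iterator[str]:
    """Hypothetical 'thinker' stage: emits one text token per input block."""
    for index, block in enumerate(blocks):
        # Stand-in for real model inference: summarise the block by its mean.
        mean = sum(block) / len(block)
        yield f"token{index}({mean:.2f})"


def talker(tokens: Iterable[str]) -> Iterator[str]:
    """Hypothetical 'talker' stage: turns each text token into a speech placeholder."""
    for token in tokens:
        yield f"<speech for {token}>"


def run_pipeline(samples: List[float], block_size: int) -> List[str]:
    """Chain the stages lazily, so each block flows through both before the next is read."""
    return list(talker(thinker(split_into_blocks(samples, block_size))))


if __name__ == "__main__":
    for chunk in run_pipeline([0.0, 1.0, 2.0, 3.0, 4.0], block_size=2):
        print(chunk)
```

Because every stage is a generator, the first speech placeholder is produced as soon as the first block has been processed, which is the intuition behind the lower latency claimed for block-wise streaming.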

Pre-trained on a diverse dataset covering text, images, video, and audio, the model delivered impressive performance in OmniBench tests, which measure AI’s ability to process and reason across multiple data types.

As per Alibaba, reinforcement learning (RL) optimisation has further improved generation stability, reducing attention misalignment, pronunciation errors, and unnatural pauses in speech responses.

MC Tech Desk
first published: Mar 27, 2025 01:08 pm
