Press Release

Alibaba Cloud Launches Open-Source Large Vision Language Model with Image Comprehension Capability

By Lion's Den, September 2, 2023

Alibaba Cloud, the digital technology and intelligence backbone of Alibaba Group, has launched two open-source large vision language models (LVLMs): Qwen-VL and its conversationally fine-tuned variant, Qwen-VL-Chat. The models can comprehend images, text, and bounding boxes in prompts and support multi-round question answering in both English and Chinese.

Qwen-VL is the multimodal version of Qwen-7B, the 7-billion-parameter model in Alibaba Cloud’s Tongyi Qianwen large language model family (also available as open source on ModelScope). Capable of understanding both image inputs and text prompts in English and Chinese, Qwen-VL can perform tasks such as answering open-ended questions about images and generating image captions.

Qwen-VL-Chat supports more complex interactions, such as comparing multiple image inputs and engaging in multi-round question answering. Leveraging alignment techniques, this AI assistant exhibits a range of creative capabilities, including writing poetry and stories based on input images, summarizing the content of multiple pictures, and solving mathematical questions displayed in images.

Contribution to open source and inclusivity

In a bid to democratize AI technologies, Alibaba Cloud has shared the models’ code, weights, and documentation with academics, researchers, and commercial institutions worldwide. This contribution to the open-source community is accessible via Alibaba’s AI model community ModelScope and the collaborative AI platform Hugging Face. For commercial uses, companies with over 100 million monthly active users can request a license from Alibaba Cloud.

The introduction of these models, with their ability to extract meaning and information from images, has the potential to change how people interact with visual content. For instance, by drawing on their image comprehension and question-answering capabilities, the models could in the future provide information assistance to visually impaired individuals during online shopping.

The Qwen-VL model was pre-trained on image and text datasets. Whereas other open-source large vision language models process and understand images at a resolution of 224×224, Qwen-VL can handle image input at a resolution of 448×448, resulting in better image recognition and comprehension.
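To put the resolution comparison above into numbers, the short Python sketch below works out how much more visual input a 448×448 image carries than a 224×224 one. The pixel ratio follows directly from the two resolutions. The patch counts rest on an assumption that is not stated in this release: a vision encoder that splits the image into non-overlapping 14×14-pixel patches. The function names are illustrative only and are not part of any Qwen-VL API.

```python
def pixel_count(side: int) -> int:
    """Number of pixels in a square image with the given side length."""
    return side * side


def patch_count(side: int, patch: int = 14) -> int:
    """Number of non-overlapping square patches in a square image.

    The default patch size of 14 is an assumption made for illustration;
    it is not a figure taken from the press release.
    """
    if side % patch != 0:
        raise ValueError("side must be a multiple of the patch size")
    return (side // patch) ** 2


low, high = 224, 448

# Doubling the side length quadruples the pixel count.
print(pixel_count(high) / pixel_count(low))  # 4.0

# Under the 14-pixel patch assumption, the encoder sees four times as many patches.
print(patch_count(low), patch_count(high))   # 256 1024
```

Doubling each side quadruples the amount of image detail available to the model, which is consistent with the claim that higher-resolution input helps with fine-grained recognition such as reading text in images.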

Based on various benchmarks, Qwen-VL recorded outstanding performance on several visual language tasks, including zero-shot captioning, general visual question answering, text-oriented visual question answering, and object detection.

Qwen-VL-Chat has also achieved leading results in both Chinese and English for text-image dialogue and for alignment with humans, according to Alibaba Cloud’s benchmark test. This test involved over 300 images, 800 questions, and 27 categories.

Earlier this month, Alibaba Cloud open-sourced its 7-billion-parameter LLMs, Qwen-7B and Qwen-7B-Chat, as part of its ongoing contribution to the open-source community. The two models have had over 400,000 downloads within a month of their launch.




For more information, see the Alizila story and the Qwen-VL and Qwen-VL-Chat pages on ModelScope, Hugging Face, and GitHub. The paper describing the model is also available: https://arxiv.org/abs/2308.12966
