Microsoft brings out a small language model that can look at pictures

Illustration of the Microsoft wordmark on a green background
Illustration: The Verge

Microsoft announced a new version of its small language model, Phi-3, which can look at images and tell you what’s in them.

Phi-3-vision is a multimodal model, meaning it can read both text and images, and is best used on mobile devices. Microsoft says Phi-3-vision, now available in preview, is a 4.2 billion parameter model (parameters are the learned values inside a model, and their count is a rough measure of its size and complexity) that can handle general visual reasoning tasks, such as answering questions about charts or images.
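
To give a rough sense of why a 4.2 billion parameter model is a plausible fit for mobile hardware, here is a back-of-envelope sketch of how much storage the weights alone would need. Only the parameter count comes from Microsoft's announcement; the numeric formats listed are illustrative assumptions, not precisions Microsoft has confirmed for Phi-3-vision, and the estimate ignores the extra memory needed while the model runs.

```python
# Back-of-envelope estimate of weight storage for a model of a given size.
# Only the 4.2 billion parameter count comes from the article; the precisions
# below are illustrative assumptions, not formats confirmed by Microsoft.

PARAMS = 4_200_000_000  # Phi-3-vision parameter count, per the article

# Assumed bits per parameter for some common numeric formats (hypothetical choices).
BITS_PER_PARAM = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_gigabytes(params: int, bits_per_param: int) -> float:
    """Return the size of the weights alone in gigabytes (1 GB = 10**9 bytes)."""
    return params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    # Print one line per assumed format, e.g. the fp16 and int4 estimates.
    for name, bits in BITS_PER_PARAM.items():
        print(f"{name}: {weight_gigabytes(PARAMS, bits):.1f} GB")
```

Under these assumptions, 16-bit weights come to about 8.4 GB and 4-bit weights to about 2.1 GB, which is the kind of range where running on a phone starts to look feasible.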

But Phi-3-vision is far smaller than other image-focused AI models like OpenAI’s DALL-E or Stability AI’s Stable Diffusion. Unlike those models, Phi-3-vision doesn’t generate images, but it can understand what’s in an image and analyze it for a…
