Microsoft brings out a small language model that can look at pictures

Illustration of the Microsoft wordmark on a green background
Illustration: The Verge

Microsoft announced a new version of its small language model, Phi-3, which can look at images and tell you what’s in them.

Phi-3-vision is a multimodal model, meaning it can read both text and images, and is best suited to mobile devices. Microsoft says Phi-3-vision, now available in preview, is a 4.2 billion parameter model (parameter count is a rough measure of a model’s size and complexity) that can handle general visual reasoning tasks, such as answering questions about charts or images.

But Phi-3-vision is far smaller than other image-focused AI models like OpenAI’s DALL-E or Stability AI’s Stable Diffusion. Unlike those models, Phi-3-vision doesn’t generate images, but it can understand what’s in an image and analyze it for a…
