DeepSeek Debuts ‘Sparse Attention’ Method in Next-Gen AI Model
The secretive Chinese startup outlined the DeepSeek-V3.2-Exp platform, explaining it uses a new technique it calls DeepSeek Sparse Attention, or DSA

DeepSeek updated an experimental AI model Monday in what it called a step toward next-generation artificial intelligence.
The secretive Chinese startup outlined the DeepSeek-V3.2-Exp platform, explaining it uses a new technique it calls DeepSeek Sparse Attention, or DSA, according to a post on its Hugging Face page. The latest version marked "an intermediate step toward our next-generation architecture," the Hangzhou-based startup said, also indicating it was working with Chinese chipmakers on the model.
DeepSeek, whose seminal R1 model stunned Silicon Valley with its sophistication this year, is working on new products to shore up its lead in the Chinese AI sector. The latest version builds on the older V3.1 by introducing a mechanism designed to make AI training and operation more efficient. It is intended to showcase the startup's research into improving efficiency when processing long text sequences, the company said.
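DeepSeek has not published the inner workings of DSA in this announcement, so the snippet below is only a minimal sketch of what "sparse attention" generally means: each query attends to a small subset of the highest-scoring keys rather than the full sequence, which is what cuts cost on long inputs. The function name topk_sparse_attention and the top_k parameter are illustrative assumptions, not DeepSeek's API, and a real kernel would avoid materializing the full score matrix that this toy version computes.

import torch
import torch.nn.functional as F

def topk_sparse_attention(q, k, v, top_k=128):
    # Generic top-k sparse attention sketch (not DeepSeek's DSA):
    # q, k, v have shape (batch, heads, seq_len, head_dim).
    d = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d ** 0.5        # full (B, H, L, L) scores; a toy shortcut
    k_eff = min(top_k, scores.size(-1))                # guard against sequences shorter than top_k
    topk_scores, topk_idx = scores.topk(k_eff, dim=-1) # keep only the strongest links per query
    masked = torch.full_like(scores, float("-inf"))
    masked.scatter_(-1, topk_idx, topk_scores)         # every other position stays masked out
    weights = F.softmax(masked, dim=-1)                # softmax runs over the sparse score set only
    return weights @ v

# Toy usage: 2 sequences, 4 heads, 1,024 tokens, 64-dimensional heads.
q = k = v = torch.randn(2, 4, 1024, 64)
out = topk_sparse_attention(q, k, v, top_k=128)
print(out.shape)  # torch.Size([2, 4, 1024, 64])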
DeepSeek also said it was halving prices on its software tools, joining other Chinese startups in slashing costs to attract users.
On Monday, Huawei, the leader in Chinese AI chips, announced its products would support DeepSeek's latest model update.
DeepSeek has indicated that the newest versions of its models support FP8, or 8-bit floating point, while it works on adding support for BF16. Both are formats for storing numbers on computers for AI and machine learning. FP8 in theory saves memory and speeds up calculations.
AI models handle enormous quantities of numbers, and using smaller formats like FP8 and BF16 balances speed with accuracy and makes it easier to run big models on limited hardware. Though it isn't very precise, FP8 is considered useful for many AI tasks. BF16, or Brain Floating Point 16, is regarded as more accurate for training AI models.
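As a rough illustration of that tradeoff (not code from DeepSeek), the snippet below casts the same tensor to BF16 and to one FP8 variant in PyTorch and compares storage and round-trip error. FP8 tensor types require a recent PyTorch release, and the choice of the E4M3 variant here is an assumption for the example.

import torch

x32 = torch.randn(1024, 1024)             # FP32 baseline: 4 bytes per value
x16 = x32.to(torch.bfloat16)              # BF16: 2 bytes, keeps FP32's dynamic range
x8 = x32.to(torch.float8_e4m3fn)          # FP8 (E4M3 variant): 1 byte, coarser precision

for name, t in [("fp32", x32), ("bf16", x16), ("fp8", x8)]:
    # element_size() reports bytes per element; memory scales directly with it
    print(name, t.element_size() * t.numel() / 1024, "KiB")

# Round-trip error grows as precision shrinks: BF16 stays close to the original,
# FP8 drifts further, which is why it is favored where speed and memory matter
# more than exactness.
print("bf16 max error:", (x32 - x16.float()).abs().max().item())
print("fp8  max error:", (x32 - x8.float()).abs().max().item())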
(Source: Bloomberg)

