News

To build the TerraMesh dataset that underpins TerraMind, IBM’s researchers compiled data on everything from biomes to land ...
Microsoft’s model BitNet b1.58 2B4T is available on Hugging Face but doesn’t run on GPU and requires a proprietary framework.
Compared to DeepSeek R1, Llama-3.1-Nemotron-Ultra-253B shows competitive results despite having fewer than half as many parameters.
The BitNet b1.58 2B4T model was developed by Microsoft's General Artificial Intelligence group and contains two billion parameters – internal values that enable the model to ...
Microsoft Research has introduced BitNet b1.58 2B4T, a new 2-billion parameter language model that uses only 1.58 bits per ...
These 10 AI companies are at the forefront of machine learning. Find out how they’re driving innovation and jostling to be ...
These models are commonly used through the Hugging Face transformers library, which provides simple access via pipelines or more customizable options by manually loading tokenizers and models.
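The two access patterns mentioned above can be sketched as follows. This is a minimal illustration, not an excerpt from any of the articles; the model ID `gpt2` is a hypothetical stand-in for whichever Hub model you actually want to load.

```python
# Two common ways to use a model via the Hugging Face transformers library.
# The model ID below is a placeholder for illustration only.
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2"  # hypothetical choice; substitute any causal-LM Hub ID

# Option 1: the high-level pipeline API, which bundles tokenization,
# inference, and decoding into a single call.
generator = pipeline("text-generation", model=model_name)
result = generator("Hello, world", max_new_tokens=10)
print(result[0]["generated_text"])

# Option 2: manually loading the tokenizer and model for finer control
# over tokenization, generation parameters, and decoding.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The pipeline route is convenient for quick experiments; the manual route is preferable when you need custom generation settings or batched inputs.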