-
Getting Started with Gemma 3
Gemma 3 is a lightweight, efficient language model developed by Google, part of the Gemma family of models optimized for instruction-following tasks. Designed for resource-constrained environments, Gemma 3 retains strong performance in reasoning and instruction-based applications while remaining computationally efficient. Its compact size makes it ideal for edge deployment and scenarios requiring rapid inference. The model achieves competitive results across benchmarks, excelling particularly in tasks that demand logical reasoning and structured responses. We have quantized Gemma 3 in GGUF format for broader compatibility with edge AI stacks.…
-
Getting Started with QwQ-32B
Qwen/QwQ-32B is the latest release in the Qwen series. It is a medium-sized reasoning model designed to excel at complex tasks through deep thinking and advanced problem-solving. Unlike traditional instruction-tuned models, QwQ, with 32.5 billion total parameters, harnesses both extensive pretraining and a reinforcement-learning stage during post-training to deliver significantly enhanced performance, especially on challenging problems. In this article, we will cover how to run and interact with QwQ-32B-GGUF on your own edge device.…
-
Getting Started with DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen is a series of distilled large language models derived from Qwen 2.5, utilizing outputs from the larger DeepSeek-R1 model. These models are designed to be more efficient and compact while retaining strong performance, especially in reasoning tasks. The distillation process allows them to inherit the knowledge and capabilities of the larger model, making them suitable for resource-constrained environments and easier deployment. These distilled models have shown impressive results across various benchmarks, often outperforming other models of similar size.…
-
Getting Started with Mistral Small
Mistral Small 3 is a groundbreaking 24-billion parameter model designed to deliver high-performance AI with low latency. Released under the Apache 2.0 license, it stands out in the AI landscape for its ability to compete with much larger models like Llama 3.3 70B and Qwen 2.5 32B, while being more than three times faster on the same hardware. The model is particularly tailored for agentic tasks — those requiring robust language understanding, tool use, and instruction-following capabilities.…
-
Tutorial: Integrating Locally-Run DeepSeek R1 Distilled Llama Model with Cursor
In this article, we'll explore how to integrate the DeepSeek R1 Distilled Llama-8B model with the highly rated intelligent code editor Cursor to create a private coding assistant. DeepSeek R1 is a powerful open-source language model whose efficient inference and cost-effectiveness make it particularly suitable for developers and researchers. Cursor is a popular AI code editor that can rely on different LLMs for code-assistance tasks. While large language models trained specifically for coding have shown excellent results, we've found that DeepSeek's coding capabilities are impressive and rival those of many programmers.…
-
DeepSeek Topples OpenAI's O1: The Chinese Model Rewriting AI Economics at 1/70th the Cost
This article is adapted from an interview given after DeepSeek V2 impressed many at its debut in July 2024. With DeepSeek R1 now out, we are seeing more heated discussions on X: "So the Chinese have open sourced a model that can out think any PhD I've ever met." The original interview was conducted in Chinese by Sina Finance. For the full English translation of the…
-
Run DeepSeek R1 on your own Devices in 5 mins
In our previous article, we got a peek into the latest interview with the founder of DeepSeek. DeepSeek R1 is a powerful and versatile open-source LLM that challenges established players like OpenAI with its advanced reasoning capabilities, cost-effectiveness, and open availability. While it has some limitations, its innovative approach and strong performance make it a valuable tool for developers, researchers, and businesses alike. For those interested in exploring its capabilities, the model and its distilled versions are readily accessible on platforms like Hugging Face and GitHub.…
-
RustCoder: AI-Assisted Rust Learning
Rust has been voted the most beloved programming language by StackOverflow users eight years in a row. It is already widely used in mission-critical software, including the Linux kernel. In February 2024, the U.S. government released an official report urging developers to adopt Rust over C/C++ in all government software because of its memory safety and performance. Rust is also the hottest language among innovative startups of all sizes. For example, Elon Musk's xAI uses Rust for all its AI infrastructure, a clear signal of where the industry is headed.…
-
WasmEdge at KubeCon NA 2024: AI-Driven Video Translation
Second State is thrilled to bring its open-source work around LLMs, WasmEdge, and Gaia to KubeCon + CloudNativeCon North America 2024, November 12-15, 2024, in Salt Lake City, Utah. This year, Second State unveils VideoLangua.com, a groundbreaking platform that uses the open-source WasmEdge and Gaia tech stack to deliver high-quality video translation, dubbing, and subtitling. This innovation highlights Second State's commitment to democratizing access to global communication through advanced open-source solutions.…
-
WebAssembly Devroom at FOSDEM 2025 – Call for Speakers Open
We're excited to announce that the WebAssembly Devroom will be held on 2 February 2025 at FOSDEM 2025 in Brussels, Belgium. WebAssembly is expanding its use cases from the browser to the cloud, and this devroom is a fantastic opportunity for the community to meet and discuss the latest developments in the WebAssembly ecosystem. It is co-hosted by WasmEdge and NTT Software Innovation Lab engineers. About FOSDEM: FOSDEM is a free event for software developers to meet, share ideas, and collaborate.…