-
Getting Started with Llama 4
Meta AI has once again pushed the boundaries of open-source large language models with the unveiling of Llama 4. This latest iteration builds on the successes of its predecessors, introducing a new era of natively multimodal AI innovation. Llama 4 arrives as a suite of models, with Llama 4 Scout and Llama 4 Maverick launched first and two more to come, each engineered for leading intelligence and efficiency. The series boasts native multimodality, mixture-of-experts architectures, and remarkably long context windows of up to 10 million tokens, promising significant leaps in performance and broader accessibility for developers and enterprises alike.…
-
Open Source Adventure: Apply to Google Summer of Code 2025 with WasmEdge!
Have you ever dreamed of contributing to real-world tech projects, collaborating with seasoned developers, and getting paid to write code that matters, all while building your resume? Google Summer of Code (GSoC) 2025 is your golden ticket, and WasmEdge wants YOU to join the journey! What’s Google Summer of Code? Google Summer of Code (GSoC) is a global, online program that pays you to work on open-source projects during your summer break.…
-
Getting Started with Gemma 3
Gemma 3 is a lightweight, efficient language model developed by Google, part of the Gemma family of models optimized for instruction-following tasks. Designed for resource-constrained environments, Gemma 3 retains strong performance in reasoning and instruction-based applications while maintaining computational efficiency. Its compact size makes it ideal for edge deployment and scenarios requiring rapid inference. The model achieves competitive results across benchmarks, particularly excelling in tasks requiring logical reasoning and structured responses. We have quantized Gemma 3 in GGUF format for broader compatibility with edge AI stacks.…
-
Getting Started with QwQ-32B
Qwen/QwQ-32B is the latest release in the Qwen series. It is a medium-sized reasoning model, designed to excel at complex tasks through deep thinking and advanced problem-solving abilities. Unlike traditional instruction-tuned models, QwQ, with 32.5 billion total parameters, harnesses both extensive pretraining and a reinforcement-learning stage during post-training to deliver significantly enhanced performance, especially on challenging problems. In this article, we will cover how to run and interact with QwQ-32B-GGUF on your own edge device.…
-
Getting Started with DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen is a series of distilled large language models derived from Qwen 2.5, trained on outputs from the larger DeepSeek-R1 model. These models are designed to be more efficient and compact while retaining strong performance, especially on reasoning tasks. The distillation process allows them to inherit the knowledge and capabilities of the larger model, making them suitable for resource-constrained environments and easier deployment. These distilled models have shown impressive results across various benchmarks, often outperforming other models of similar size.…