Articles and tutorials

Getting Started with Llama 4

Meta AI has once again pushed the boundaries of open-source large language models with the unveiling of Llama 4. This latest iteration builds upon the successes of its predecessors, introducing a new era of natively multimodal AI innovation. Llama 4 arrives as a suite of models: Llama 4 Scout and Llama 4 Maverick launched first, with two more to come, each engineered for leading intelligence and efficiency. The series features native multimodality, mixture-of-experts architectures, and context windows of up to 10 million tokens, promising significant leaps in performance and broader accessibility for developers and enterprises alike.…
LLM AI inference Rust WebAssembly DeepSeek
Open Source Adventure: Apply to Google Summer of Code 2025 with WasmEdge!

Have you ever dreamed of contributing to real-world tech projects, collaborating with seasoned developers, and getting paid to write code that matters—all while building your resume? Google Summer of Code (GSoC) 2025 is your golden ticket, and WasmEdge wants YOU to join the journey! What’s Google Summer of Code? Google Summer of Code (GSoC) is a global, online program that pays you to work on open source projects during your summer break.…
LLM AI inference Rust WebAssembly DeepSeek
Getting Started with Gemma 3

Gemma-3 is a lightweight, efficient language model developed by Google, part of the Gemma family of models optimized for instruction-following tasks. Designed for resource-constrained environments, Gemma-3 retains strong performance in reasoning and instruction-based applications while maintaining computational efficiency. Its compact size makes it ideal for edge deployment and scenarios requiring rapid inference. This model achieves competitive results across benchmarks, particularly excelling in tasks requiring logical reasoning and structured responses. We have quantized Gemma-3 in GGUF format for broader compatibility with edge AI stacks.…
LLM AI inference Rust WebAssembly DeepSeek
Getting Started with QwQ-32B

Qwen/QwQ-32B is the latest model in the Qwen series. It is a medium-sized reasoning model with 32.5 billion total parameters, designed to excel at complex tasks with deep thinking and advanced problem-solving abilities. Unlike traditional instruction-tuned models, QwQ combines extensive pretraining with a reinforcement learning stage during post-training to deliver significantly enhanced performance, especially on challenging problems. In this article, we will cover how to run and interact with QwQ-32B-GGUF on your own edge device.…
LLM AI inference Rust WebAssembly DeepSeek
Getting Started with DeepSeek-R1-Distill-Qwen-1.5B

DeepSeek-R1-Distill-Qwen is a series of distilled large language models derived from Qwen 2.5, utilizing outputs from the larger DeepSeek-R1 model. These models are designed to be more efficient and compact while retaining strong performance, especially in reasoning tasks. The distillation process allows them to inherit the knowledge and capabilities of the larger model, making them suitable for resource-constrained environments and easier deployment. These distilled models have shown impressive results across various benchmarks, often outperforming other models of similar size.…
LLM AI inference Rust WebAssembly DeepSeek
Getting Started with Mistral Small

Mistral Small 3 is a 24-billion-parameter model designed to deliver high-performance AI with low latency. Released under the Apache 2.0 license, it stands out in the AI landscape for its ability to compete with much larger models like Llama 3.3 70B and Qwen 2.5 32B, while being more than three times faster on the same hardware. The model is particularly tailored for agentic tasks: those requiring robust language understanding, tool use, and instruction-following capabilities.…
LLM AI inference Rust WebAssembly Mistral
Tutorial: Integrating Locally-Run DeepSeek R1 Distilled Llama Model with Cursor

In this article, we'll explore how to integrate the DeepSeek R1 Distilled Llama-8B model with the highly rated intelligent code editor Cursor to create a private coding assistant. DeepSeek R1 is a powerful, cost-effective open-source language model with efficient inference, making it particularly suitable for developers and researchers. Cursor is a popular AI code editor that can rely on different LLMs to complete code assistance tasks. While large language models specifically trained for coding tasks have shown excellent results, we've found that DeepSeek's coding capabilities are quite impressive, comparable to those of many programmers.…
LLM AI inference Rust WebAssembly
DeepSeek Topples OpenAI's O1: The Chinese Model Rewriting AI Economics at 1/70th the Cost

This is adapted from an interview conducted when DeepSeek V2 impressed many with its debut in July 2024. With DeepSeek R1 now out, we are seeing more heated discussions on X: “So the Chinese have open sourced a model that can out think any PhD I've ever met.” The original interview was conducted in Chinese by Sina Finance. For the full English translation of the…
LLM AI inference Rust WebAssembly
Run DeepSeek R1 on your own Devices in 5 mins

In our previous article, we got a peek into the latest interview with the founder of DeepSeek. DeepSeek R1 is a powerful and versatile open-source LLM that challenges established players like OpenAI with its advanced reasoning capabilities and cost-effectiveness. While it has some limitations, its innovative approach and strong performance make it a valuable tool for developers, researchers, and businesses alike. For those interested in exploring its capabilities, the model and its distilled versions are readily accessible on platforms like Hugging Face and GitHub.…
LLM AI inference Rust WebAssembly
TikTok Refugees' Guide to RedNote

Getting Started: The Basics. What is RedNote? Think of RedNote as TikTok meets Instagram meets Pinterest. While TikTok focuses on short videos, RedNote (or “Little Red Book”) is more about lifestyle content with both photos and videos. It's like having a digital lifestyle magazine where you're both the reader and creator. Download & Registration Tips: Download from the App Store or Google Play (search for “…
LLM Translation TikTok RedNote
