-
Getting Started with Gemma 3
Gemma 3 is a lightweight, efficient language model developed by Google, part of the Gemma family of models optimized for instruction-following tasks. Designed for resource-constrained environments, Gemma 3 delivers strong reasoning and instruction-following performance while remaining computationally efficient. Its compact size makes it well suited for edge deployment and scenarios requiring rapid inference, and it achieves competitive benchmark results, particularly on tasks involving logical reasoning and structured responses. We have quantized Gemma 3 in GGUF format for broader compatibility with edge AI stacks.…
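To make the GGUF point concrete, here is a minimal sketch of loading a quantized Gemma 3 file locally with the llama-cpp-python bindings, one of several GGUF-compatible runtimes and not necessarily the stack used in the full article. The file name, quantization level, and prompt are placeholders.

```python
# Minimal sketch: chat with a local Gemma 3 GGUF file via llama-cpp-python.
# The file name below is a placeholder; point model_path at the GGUF you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./gemma-3-4b-it-Q5_K_M.gguf",  # hypothetical quantized file
    n_ctx=4096,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to GPU if available; use 0 for CPU-only
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize why GGUF quantization helps edge deployment."}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```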
-
Getting Started with QwQ-32B
Qwen/QwQ-32B is the latest reasoning model in the Qwen series. It is a medium-sized model with 32.5 billion total parameters, designed to excel at complex tasks that demand deep thinking and advanced problem-solving. Unlike traditional instruction-tuned models, QwQ combines extensive pretraining with a reinforcement learning stage during post-training to deliver significantly enhanced performance, especially on challenging problems. In this article, we will cover how to run and interact with QwQ-32B-GGUF on your own edge device.…
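As a rough preview of the "run and interact" step, the sketch below assumes QwQ-32B-GGUF is already being served behind a local OpenAI-compatible endpoint (for example, by llama.cpp's server or LlamaEdge). The base URL, port, and model name are assumptions, not values from the article; adjust them to match your setup. Streaming is used because reasoning models tend to produce long outputs.

```python
# Minimal sketch: talk to a locally served QwQ-32B-GGUF model through an
# OpenAI-compatible endpoint. The base_url, port, and model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed-locally")

stream = client.chat.completions.create(
    model="QwQ-32B",  # whatever name your local server registers the model under
    messages=[{"role": "user", "content": "How many r's are in the word 'strawberry'? Think it through."}],
    stream=True,  # print tokens as they arrive instead of waiting for the full reply
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```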
-
Getting Started with DeepSeek-R1-Distill-Qwen-1.5B
DeepSeek-R1-Distill-Qwen is a series of distilled large language models derived from Qwen 2.5 and trained on outputs from the larger DeepSeek-R1 model. These models are designed to be compact and efficient while retaining strong performance, especially on reasoning tasks. The distillation process lets them inherit much of the larger model's knowledge and capability, making them well suited to resource-constrained environments and easier to deploy. These distilled models have shown impressive results across various benchmarks, often outperforming other models of similar size.…
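For a sense of how such a distilled reasoning model behaves locally, here is a minimal sketch, again using llama-cpp-python as one possible GGUF runtime rather than the article's own setup; the file name is a placeholder. R1-style models typically wrap their chain of thought in <think> tags, which the snippet separates from the final answer.

```python
# Minimal sketch: run a quantized DeepSeek-R1-Distill-Qwen-1.5B GGUF file locally
# and split the model's <think> reasoning trace from its final answer.
# The file name is a placeholder for whichever quantization you download.
from llama_cpp import Llama

llm = Llama(model_path="./DeepSeek-R1-Distill-Qwen-1.5B-Q5_K_M.gguf", n_ctx=4096)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"}],
    max_tokens=1024,  # leave room for the reasoning trace before the answer
)

text = resp["choices"][0]["message"]["content"]
if "</think>" in text:  # R1-style output usually emits <think>...</think> before the answer
    reasoning, answer = text.split("</think>", 1)
    print("Reasoning trace:\n", reasoning.replace("<think>", "").strip())
    print("\nFinal answer:\n", answer.strip())
else:
    print(text)
```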