WebAssembly

Getting Started with DeepSeek-LLM-7B-Chat

To quick start, you can run DeepSeek-LLM-7B-Chat with just one single command on your own device. The command tool automatically downloads and installs the WasmEdge runtime, the model files, and the portable Wasm apps for inference. DeepSeek-LLM-7B-Chat is an advanced language model trained by DeepSeek, a subsidiary company of High-flyer quant, comprising 7 billion parameters. It is trained on a dataset of 2 trillion tokens in English and Chinese.…
LLM AI inference Rust WebAssembly
Getting Started with Mistral-7B-Instruct-v0.2

Or you can run this newest model with just one single command on your mac/ across devices. The Mistral-7B-Instruct-v0.2 model is a new model released by the Mistral AI team. It’s built upon the successful foundation of its predecessor, the Mistral-7B-v0.1. This model stands out for its improved abilities in understanding and following complex instructions, making it an even more powerful tool for a wide range of applications.This combination of advanced technology and user-friendly design makes Mistral-7B-Instruct-v0.…
LLM AI inference Rust WebAssembly
Getting Started with Neural-Chat-7B-v3-1

Neural-Chat-7B-v3-1 is a fine-tuned model based on Mistral-7B-v0.1 and trained on the Open-Orca/SlimOrca open-source dataset. The model underwent training between September and October 2023. It incorporates a Direct Preference Optimization (DPO) algorithm, highlighting its advanced fine-tuning and optimization capabilities. In this article, we will cover How to run Neural-Chat-7B-v3-1 on your own device How to create an OpenAI-compatible API service for Neural-Chat-7B-v3-1 We will use the Rust + Wasm stack to develop and deploy applications for this model.…
LLM AI inference Rust WebAssembly
Introducing the run-llm.sh, an all-in-one CLI app to run LLMs locally

The run-llm.sh script, developed by Second State, is a command-line tool designed to run a chat interface, and an OpenAI-compatible API server using open-source Large Language Models (LLMs) on your device. This CLI app automatically downloads and installs the WasmEdge runtime, the model files, and the portable Wasm apps for inference. Users simply need to follow the CLI prompts to select their desired options. You can access run-llm.sh here. Get started with the run-llm.…
LLM AI inference Rust WebAssembly
Getting Started with DeepSeek-Coder-6.7B

DeepSeek-Coder-6.7B is among DeepSeek Coder series of large code language models, pre-trained on 2 trillion tokens of 87% code and 13% natural language text. DeepSeek Coder models are trained with a 16,000 token window size and an extra fill-in-the-blank task to enable project-level code completion and infilling. DeepSeek Coder achieves state-of-the-art performance on various code generation benchmarks compared to other open-source code models. In this article, we will cover How to run DeepSeek-Coder-6.…
LLM AI inference Rust WebAssembly
Getting Started with Starling-LM-7B-alpha

Starling-LM-7B-alpha is a large language model (LLM) trained by Reinforcement Learning from AI Feedback (RLAIF). In other words, it is trained by GPT-4 generated synthetic conversations. It is developed by the Berkeley-Nest team. According to commonly accepted benchmarks, the model excels in education, STEM, humanities, writing and role play. In this article, we will cover How to run Starling-LM-7B-alpha on your own device How to create an OpenAI-compatible API service for Starling-LM-7B-alpha We will use the Rust + Wasm stack to develop and deploy applications for this model.…
LLM AI inference Rust WebAssembly
Getting Started with Dolphin-2.2.1-Mistral-7B

The Dolphin-2.2.1-Mistral-7B, developed by Eric Hartford, is an iteration of the Dolphin family of models, building upon the previous Dolphin 2.1 Mistral version. This model is distinguished by its enhanced conversation and empathy skills, based on the Mistral-7B-v0.1 and is designed to offer a more empathetic AI experience, aiming to provide highly engaging and personal chat interactions. In this article, we will cover How to run Dolphin-2.2.1-Mistral-7B on your own device How to create an OpenAI-compatible API service for Dolphin-2.…
LLM AI inference Rust WebAssembly
Getting Started with Samantha-1.11-CodeLlama-34b

The Samantha-1.11-CodeLlama-34bmodel is trained on the CodeLlama-34b. This version of Samantha stands out for its coding capabilities and the ability to assist with homework, in addition to acting as a personal companion. The model has undergone training in areas such as philosophy, psychology, and personal relationships, distinguishing it from typical assistant models by also aspiring to be a friend and companion to users. In this article, we will cover How to run Samantha-1.…
LLM AI inference Rust WebAssembly
Getting Started with Orca-2-13B

To quick start, you can run Orca-2-13B with just one single command on your own device. The command tool automatically downloads and installs the WasmEdge runtime, the model files, and the portable Wasm apps for inference. The Orca-2-13B, part of Microsoft's Orca 2 series, comes in 7B and 13B parameter versions, fine-tuned from the LLAMA 2 base models. This model excels in reasoning, text summarization, math problem-solving and comprehension tasks, building upon the original 13B Orca model.…
LLM AI inference Rust WebAssembly
Getting Started with Samantha-1.2-Mistral-7b

The Samantha-1.2-Mistral-7b isretrained version of the Samantha Mistral-7b model, now using the ChatML prompt format instead of Vicuna-1.1. This version, trained on the Mistral-7b as a base model, underwent training for 4 hours on 4x A100 80gb GPUs across 6 epochs using the Samantha-1.1 dataset. The model focuses on philosophy, psychology, and personal relationships, positioning itself as not just an assistant but also as a friend and companion. In this article, we will cover…
LLM AI inference Rust WebAssembly

6
7
8
9
10