WebAssembly

Getting Started with WizardLM-1.0-Uncensored-CodeLlama-34b

WizardLM-1.0-Uncensored-CodeLlama-34b is a language model based on the CodeLlama-34b architecture, which is known for its strong coding abilities. It is a retraining of WizardLM-13B-V1.0 on a filtered dataset aimed at reducing refusals, avoidance, and bias in its responses. In this article, we will cover how to run WizardLM-1.0-Uncensored-CodeLlama-34b on your own device and how to create an OpenAI-compatible API service for it. We will use the Rust + Wasm stack to develop and deploy applications for this model.…
LLM AI inference Rust WebAssembly
Getting Started with Llama 2 Models

Llama 2 is a series of LLMs released by Meta, ranging from 7B to 70B parameters. Llama 2 serves as a foundational framework for numerous other LLMs. In this article, we will cover how to run Llama 2-13B on your own device and how to create an OpenAI-compatible API service for Llama 2-13B. We will use the Rust + Wasm stack to develop and deploy applications for this model. There are no complex Python packages or C++ toolchains to install!…
LLM AI inference Rust WebAssembly
Getting Started with WizardCoder-Python-7B-V1.0

WizardCoder is a specialized Large Language Model (LLM) tailored for coding tasks. In this article, we will cover how to run WizardCoder-Python-7B on your own device and how to create an OpenAI-compatible API service for WizardCoder-Python-7B. We will use the Rust + Wasm stack to develop and deploy applications for this model. There are no complex Python packages or C++ toolchains to install! See why we choose this tech stack.…
LLM AI inference Rust WebAssembly
Getting Started with Yi-34B-Chat

Yi-34B-Chat is a large language model trained from scratch by developers at 01.AI. In this article, we will cover how to run Yi-34B-Chat on your own device and how to create an OpenAI-compatible API service for Yi-34B-Chat. We will use the Rust + Wasm stack to develop and deploy applications for this model. There are no complex Python packages or C++ toolchains to install! See why we choose this tech stack.…
LLM AI inference Rust WebAssembly
Getting Started with Zephyr-7B

Zephyr-7B is a fine-tuned Mistral-7B-v0.1 language model, released by the HuggingFace team. Removing the in-built alignment of its training datasets boosted performance on MT Bench. In this article, we will cover how to run Zephyr-7B on your own device and how to create an OpenAI-compatible API service for Zephyr-7B. We will use the Rust + Wasm stack to develop and deploy applications for this model. There are no complex Python packages or C++ toolchains to install!…
LLM AI inference Rust WebAssembly
Getting Started with Baichuan2-13B-Chat

The Baichuan2-13B-Chat model is a 13B Large Language Model (LLM) developed by Baichuan Intelligent, with an approach inspired by offline reinforcement learning. According to the team, this approach allows the model to learn from mixed-quality data without preference labels, enabling it to deliver performance that rivals even the sophisticated ChatGPT models. In this article, we will cover how to run Baichuan2-13B-Chat on your own device and how to create an OpenAI-compatible…
LLM AI inference Rust WebAssembly
Getting Started with Code Llama

Code Llama is an LLM for code based on Llama 2, providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction-following ability for programming tasks. In this article, we will cover how to run CodeLlama-13b-hf on your own device and how to create an OpenAI-compatible API service for CodeLlama-13b-hf. We will use the Rust + Wasm stack to develop and deploy applications for this model.…
LLM AI inference Rust WebAssembly
Getting Started with MistralLite

MistralLite is a fine-tuned Mistral-7B-v0.1 language model, released by AWS, with enhanced capabilities for processing long contexts (up to 32K tokens). In this article, we will cover how to run MistralLite on your own device and how to create an OpenAI-compatible API service for MistralLite. We will use the Rust + Wasm stack to develop and deploy applications for this model. There are no complex Python packages or C++ toolchains to install!…
LLM AI inference Rust WebAssembly
Getting Started with TinyLlama-1.1B-Chat-v0.3

TinyLlama is an open-source effort to train a “small” LLM with only 1.1B parameters on a large corpus of data (3T tokens). It is meant to push the scaling-law envelope by compressing as much knowledge as possible into a small model file. The small size also translates to fast inference. If it is successful, it will be a great fit for edge devices and real-time applications. It is right at the sweet spot of WasmEdge!…
LLM AI inference Rust WebAssembly
Getting Started with Wizard-Vicuna-13B

Wizard-Vicuna-13B is a model based on the Llama 2 platform and developed by MelodysDreamj. It combines the principles of WizardLM and VicunaLM: the dataset from WizardLM and the conversation extension from ChatGPT, along with Vicuna's tuning method. This combination results in a robust model capable of a wide range of applications.…
LLM AI inference Rust WebAssembly
