-
Getting Started with CausalLM
The CausalLM 14B model is based on the popular llama2 architecture but uses Qwen 14B model weights. The Qwen models are developed by Alibaba as English / Chinese bilingual LLMs, and they perform very well in benchmarks compared with other models of similar size. The CausalLM model is further SFT fine-tuned on an uncensored dataset of 1.3B tokens. As a result, it follows conversations well and provides a solid basis for further fine-tuning with domain-specific knowledge and styles.…
-
Getting Started with OpenChat 3.5
For a quick start, you can run OpenChat or any of a long list of other models with a single command on your own device. The command-line tool automatically downloads and installs the WasmEdge runtime, the model files, and the portable Wasm apps for inference. The OpenChat 3.5 model is fine-tuned from the Mistral 7B base model for conversation / chat applications. It uses a novel fine-tuning method that is more effective than SFT but less expensive than RLHF.…
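As a concrete sketch, that one-command quick start looks like the line below; the exact URL and script name are assumptions based on the LlamaEdge project and may change between releases:

```bash
# Download and run the interactive quick-start script (URL assumed from the
# LlamaEdge project). It installs the WasmEdge runtime, fetches a GGUF model
# file, and downloads the portable Wasm inference app before starting a chat.
bash <(curl -sSfL 'https://raw.githubusercontent.com/LlamaEdge/LlamaEdge/main/run-llm.sh')
```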
-
Getting Started with Mistral-7b-Instruct-v0.1
The mistral-7b-instruct-v0.1 model is a 7B instruction-tuned LLM released by Mistral AI. It is a true open-source model licensed under Apache 2.0. It has a context length of 8,000 tokens and performs on par with 13B llama2 models. It is great for generating prose, summarizing documents, and writing code. In this article, we will cover how to run mistral-7b-instruct-v0.1 on your own device, and how to create an OpenAI-compatible API service for mistral-7b-instruct-v0.1.…
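To illustrate what an OpenAI-compatible service buys you, here is a minimal Rust client sketch against a locally running server; the localhost:8080 address and the /v1/chat/completions path are assumptions based on common LlamaEdge api-server defaults:

```rust
// Minimal client for a local OpenAI-compatible chat endpoint.
// Cargo.toml: reqwest = { version = "0.11", features = ["blocking", "json"] },
//             serde_json = "1"
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();
    // Request body follows the OpenAI chat completions schema.
    let body = json!({
        "model": "mistral-7b-instruct-v0.1",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Summarize what WasmEdge is in one sentence."}
        ]
    });
    let resp = client
        .post("http://localhost:8080/v1/chat/completions")
        .json(&body)
        .send()?;
    println!("{}", resp.text()?);
    Ok(())
}
```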
-
Getting Started with Dolphin-2.2-yi-34b
The dolphin-2.2-yi-34b model is based on the 34B LLM Yi, released by the 01.AI team. Yi was converted to the llama2 format by Charles Goddard and then further fine-tuned by Eric Hartford. In this article, we will cover how to run dolphin-2.2-yi-34b on your own device, and how to create an OpenAI-compatible API service for dolphin-2.2-yi-34b. We will use the Rust + Wasm stack to develop and deploy applications for this model.…
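As a rough sketch, launching such an API service with WasmEdge typically looks like the command below; the llama-api-server.wasm app name, the --nn-preload syntax, the prompt template, and the model file name are illustrative assumptions based on the LlamaEdge tooling:

```bash
# Serve the local GGUF model behind an OpenAI-compatible HTTP API.
# File names and flags are illustrative; check the docs for your version.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:dolphin-2.2-yi-34b-ggml-model-q4_0.gguf \
  llama-api-server.wasm --prompt-template chatml
```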
-
Fast and Portable Llama2 Inference on the Heterogeneous Edge
The Rust+Wasm stack provides a strong alternative to Python for AI inference. Compared with Python, Rust+Wasm apps can be 1/100 of the size and 100x the speed, and, most importantly, they run securely everywhere with full hardware acceleration without any change to the binary code. Rust is the language of AGI. We created a very simple Rust program to run inference on llama2 models at native speed. When compiled to Wasm, the binary application (only 2MB) is completely portable across devices with heterogeneous hardware accelerators.…
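A minimal sketch of such a program is below, written against the wasmedge-wasi-nn crate's GGML backend; the exact crate API shown (GraphBuilder, build_from_cache, the "default" alias) is an assumption against one published version and may differ in yours:

```rust
// Minimal llama2-family inference sketch using the wasmedge-wasi-nn crate
// (GGML backend). Build for wasm32-wasi, then run under WasmEdge with a
// model preloaded under the alias "default", e.g.:
//   wasmedge --dir .:. --nn-preload default:GGML:AUTO:model.gguf app.wasm
use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

fn main() {
    // Load the preloaded GGUF model by its cache alias.
    let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
        .build_from_cache("default")
        .expect("failed to load the preloaded model");
    let mut ctx = graph
        .init_execution_context()
        .expect("failed to create an execution context");

    // Pass the prompt in as a UTF-8 byte tensor at input index 0.
    let prompt = "[INST] What is WebAssembly? [/INST]";
    ctx.set_input(0, TensorType::U8, &[1], prompt.as_bytes())
        .expect("failed to set the prompt");

    // Run inference on whatever accelerator the host makes available.
    ctx.compute().expect("inference failed");

    // Copy the generated text out of output index 0.
    let mut out = vec![0u8; 4096];
    let n = ctx.get_output(0, &mut out).expect("failed to read output");
    println!("{}", String::from_utf8_lossy(&out[..n]));
}
```

The same 2MB .wasm binary can then be moved, unmodified, between a Mac, a Linux server, or an edge device; the runtime picks the available accelerator at load time.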
-
Wasm as the runtime for LLMs and AGI
Large Language Model (LLM) AI is the hottest thing in tech today. With the advancement of open-source LLMs, new fine-tuned and domain-specific LLMs are emerging every day in areas as diverse as coding, education, medical QA, content summarization, writing assistance, and role playing. Don't you want to try out and chat with those LLMs on your own computers and even IoT devices? But the Python / PyTorch stack, which is traditionally required to run those models, consists of 3+GB of fragile, inter-dependent packages.…
-
WasmEdge @ KubeCon + CloudNativeCon NA 2023
The stage is set. The Cloud Native Computing Foundation (CNCF) is preparing to kick off its flagship event, KubeCon + CloudNativeCon NA 2023, in the vibrant city of Chicago, Illinois from November 6 to 9, 2023. WasmEdge maintainers and members of the community will be in attendance, speaking, exhibiting, and networking with other attendees. Come join us at KubeCon this year! Talks: Cloud Native Wasm Day (Nov 6), a pre-event to KubeCon NA, is a gathering where Wasm enthusiasts share the latest technical updates and user stories about server-side WebAssembly.…
-
WasmEdge @ KubeCon + CloudNativeCon NA 2022
The most exciting cloud-native event of 2022 is just around the corner! KubeCon + CloudNativeCon NA 2022 will take place between Oct 24th and 28th in Detroit, USA! WasmEdge and our community members will attend the conference, host a booth, go to the parties, and give 5 official talks! So come say hi virtually, or even better, in person! Through conference presentations and demonstrations, the WasmEdge team will focus on the WebAssembly (Wasm) developer experience and tooling for cloud-native applications.…
-
🚀 WasmEdge 0.10.0 is released!
In version 0.10.0, WasmEdge provides a brand new plug-in mechanism to make native extensions easier to develop and install, improves compatibility with LLVM 14, and supports new WebAssembly specs, proposals, and features. Highlights include: a new plug-in system for native host functions; many enhancements to the WasmEdge socket API (e.g., microservices and web service clients in WasmEdge); support for new WebAssembly proposals and specs; WasmEdge C API enhancements; and other features and bug fixes. New plug-in system for native host functions: the host function is the bridge that allows WebAssembly programs to access functionalities and features provided by native libraries.…
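To make the host-function idea concrete, here is a small Rust sketch using the wasmedge-sdk embedder crate; the #[host_function] macro and builder names follow one historical release of that SDK and are assumptions that may not match the current API:

```rust
// Sketch: exposing a native Rust function to Wasm guests as a host function,
// using the wasmedge-sdk crate (names follow an older release of the SDK).
use wasmedge_sdk::{
    error::HostFuncError, host_function, Caller, ImportObjectBuilder, WasmValue,
};

// A native "add" that Wasm modules can import and call as env.add(i32, i32) -> i32.
#[host_function]
fn add(_caller: Caller, args: Vec<WasmValue>) -> Result<Vec<WasmValue>, HostFuncError> {
    let a = args[0].to_i32();
    let b = args[1].to_i32();
    Ok(vec![WasmValue::from_i32(a + b)])
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Bundle the host function into an import module named "env". A real
    // embedder would register this module with a Vm and then run a Wasm
    // binary that imports env.add.
    let _import = ImportObjectBuilder::new()
        .with_func::<(i32, i32), i32>("add", add)?
        .build("env")?;
    println!("host function env.add is ready to be registered with a Vm");
    Ok(())
}
```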
-
WasmEdge @ KubeCon + CloudNativeCon Europe 2022
KubeCon + CloudNativeCon Europe 2022 is around the corner. WasmEdge community members will be there giving talks! It's time to say hi, virtually or in person if you are around. There are two WasmEdge-related talks at this year's KubeCon + CloudNativeCon, covering how to develop and manage Wasm apps. Cloud Native Wasm Day. Time: May 16, 2022, 11:05 - 11:35 CEST. Topic: Run JavaScript, Python, and Ruby in WebAssembly…