-
Getting Started with Mistral-7b-Instruct-v0.1
The mistral-7b-instruct-v0.1 model is a 7B instruction-tuned LLM released by Mistral AI. It is a true open source model licensed under Apache 2.0. It has a context length of 8,000 tokens and performs on par with 13B llama2 models. It is great for generating prose, summarizing documents, and writing code. In this article, we will cover how to run mistral-7b-instruct-v0.1 on your own device and how to create an OpenAI-compatible API service for mistral-7b-instruct-v0.…
-
Getting Started with Dolphin-2.2-yi-34b
The dolphin-2.2-yi-34b model is based on Yi, the 34B LLM released by the 01.AI team. Yi was converted to the llama2 format by Charles Goddard and then further fine-tuned by Eric Hartford. In this article, we will cover how to run dolphin-2.2-yi-34b on your own device and how to create an OpenAI-compatible API service for dolphin-2.2-yi-34b. We will use the Rust + Wasm stack to develop and deploy applications for this model.…
-
Fast and Portable Llama2 Inference on the Heterogeneous Edge
The Rust+Wasm stack provides a strong alternative to Python in AI inference. Compared with Python, Rust+Wasm apps could be 1/100 of the size, run at 100x the speed, and, most importantly, run securely everywhere at full hardware acceleration without any change to the binary code. Rust is the language of AGI. We created a very simple Rust program to run inference on llama2 models at native speed. When compiled to Wasm, the binary application (only 2MB) is completely portable across devices with heterogeneous hardware accelerators.…
-
Wasm as the runtime for LLMs and AGI
Large Language Model (LLM) AI is the hottest thing in tech today. With the advancement of open source LLMs, new fine-tuned and domain-specific LLMs are emerging every day in areas as diverse as coding, education, medical QA, content summarization, writing assistance, and role playing. Don't you want to try and chat with those LLMs on your computers and even IoT devices? But Python / PyTorch, which are traditionally required to run those models, consist of 3+GB of fragile inter-dependent packages.…
-
Rust + WebAssembly: Building Infrastructure for Large Language Model Ecosystems
This is a talk given in the track “The Programming Languages Shaping the Future of Software Development” at QCon 2023 Beijing on Sept 6th, 2023. The session addresses the challenges that the current mainstream Python and Docker approach faces in building infrastructure for large language model (LLM) applications. It introduces the audience to the advantages of the Rust + WebAssembly approach, emphasizing its potential in addressing the performance, security, and efficiency concerns associated with the traditional approach.…
-
WasmEdge at WasmCon: Dive into Keynotes, Tech Talks, and a Hands-on Workshop
WasmCon, the highly anticipated tech conference dedicated to WebAssembly (Wasm), is just a week away, running from September 6th to 8th! The WasmEdge community is thrilled to be a part of it! Our maintainers and community contributors have several talks and a hands-on workshop covering high-performance networking, AI inference, LLMs, serverless functions, and microservices. If you are interested in Wasm technology or business use cases, do not miss these talks, either in person next week in Seattle or online later!…
-
🤖The Last Chance to Get in LFX Mentorship in 2023
The LFX Mentorship Term 3 (from September to November) is open for applications! Contributing to WasmEdge through a paid LFX / CNCF internship program is a sure way to future-proof your resume and skill sets! If you’re interested in making contributions to CNCF-hosted projects like WasmEdge, grab the last opportunities in 2023. You will gain the following benefits when you contribute to CNCF-hosted projects like WasmEdge through the LFX Mentorship program.…
-
Prompt AI for Different Everyday Tasks: Your Customizable Personal Assistants
Introduction: In today's rapidly advancing large language model landscape, people are anxious about losing jobs to AI. We are increasingly integrating AI into our daily lives. From assisting with reading, writing, drawing, and traveling to learning new things, we want AI to help us rather than replace us. We proactively embrace LLMs like ChatGPT for different needs. A common challenge when using ChatGPT is that we are logged out every 15 minutes, and our answer histories can become disorganized and jumbled.…
-
Build the future of cloud computing with WasmEdge via GSoC
Google Summer of Code 2023 is accepting applications, and WasmEdge is a featured open source project with two available projects. By participating, you'll have the opportunity to work on WasmEdge and contribute to the evolution of cloud computing. GSoC is a global, online mentoring program that introduces new contributors to open source software development. Through GSoC, you will gain valuable experience in real-world software development while being compensated for your efforts and time!…
-
🤖 LFX Internship opportunities: building the future foundation of cloud computing
Happy New Year! According to the recent CNCF Annual Survey 2022 of over 2000 IT professionals, WebAssembly is going to be a key part of the cloud native technology stack. In fact, the survey’s key finding states: Containers are the new normal, and WebAssembly is the future. The WasmEdge project is an open source WebAssembly runtime that is optimized for cloud native use cases. It is already bundled and distributed with Docker Desktop and Fedora / Red Hat Linux.…