-
Fast and Portable Llama2 Inference on the Heterogeneous Edge
The Rust+Wasm stack provides a strong alternative to Python for AI inference. Compared with Python, Rust+Wasm apps can be 1/100 of the size and 100x the speed, and, most importantly, they run securely everywhere with full hardware acceleration without any change to the binary. Rust is the language of AGI. We created a very simple Rust program to run inference on llama2 models at native speed. When compiled to Wasm, the binary application (only 2MB) is completely portable across devices with heterogeneous hardware accelerators.…
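The portability claim above rests on a simple workflow: compile the Rust program once to a `wasm32-wasi` binary, then run that same binary with WasmEdge on any device. A sketch of the commands (the app and model filenames here are illustrative, not the exact artifacts from the post):

```shell
# Compile the Rust inference app once to a portable Wasm binary
# (assumes the wasm32-wasi Rust target is installed).
cargo build --target wasm32-wasi --release

# Run the ~2MB binary with WasmEdge, preloading a GGUF model through the
# WASI-NN GGML plugin; the same .wasm file runs unchanged on CPUs or GPUs.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-chat.wasm
```

The hardware-specific work happens in the WasmEdge runtime and its GGML plugin on each device, which is why the Wasm binary itself never needs to be recompiled.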
-
Wasm as the runtime for LLMs and AGI
Large Language Model (LLM) AI is the hottest thing in tech today. With the advancement of open-source LLMs, new fine-tuned and domain-specific LLMs emerge every day in areas as diverse as coding, education, medical QA, content summarization, writing assistance, and role playing. Don't you want to try and chat with those LLMs on your own computers and even IoT devices? But Python / PyTorch, which is traditionally required to run those models, consists of 3+GB of fragile, inter-dependent packages.…
-
WasmEdge @ KubeCon + CloudNativeCon NA 2023
The stage is set. The Cloud Native Computing Foundation (CNCF) is preparing to kick off its flagship event, KubeCon + CloudNativeCon NA 2023, in the vibrant city of Chicago, Illinois from November 6 to 9, 2023. WasmEdge maintainers and members of the community will be in attendance, speaking, exhibiting, and networking with other attendees. Come join us at KubeCon this year! Talks: Cloud Native Wasm Day (Nov 6) Cloud Native Wasm Day, a pre-event to KubeCon NA, is a gathering where Wasm enthusiasts will share the latest technical updates and user stories about server-side WebAssembly.…
-
Rust + WebAssembly: Building Infrastructure for Large Language Model Ecosystems
This is a talk at the track “The Programming Languages Shaping the Future of Software Development” at QCon 2023 Beijing on September 6th, 2023. The session addresses the challenges faced by the current mainstream Python-and-Docker approach to building infrastructure for large language model (LLM) applications. It introduces the audience to the advantages of the Rust + WebAssembly approach, emphasizing its potential to address the performance, security, and efficiency concerns associated with the traditional approach.…
-
Join the Security Slam with WasmEdge and Boost Open Source Security!
The Cloud Native Computing Foundation (CNCF) is hosting the annual Security Slam, a virtual event aimed at enhancing the security posture of CNCF projects. WasmEdge is thrilled to be a part of this initiative, and contributors, we need your help! Why Participate? For the Open Source Community: Elevate awareness of modern security tools and practices, ensuring a safer ecosystem for all. For Contributors:
- Awards and recognition for new and outstanding contributions to projects
- Linux Foundation Cybersecurity Training and Certification
- Find new projects and meet new communities
- Build on existing skills or learn new ones
How Can You Help WasmEdge?…
-
How do I create a GGUF model file?
The llama2 family of LLMs is typically trained and fine-tuned in PyTorch. Hence, they are typically distributed as PyTorch projects on Huggingface. However, when it comes to inference, we are much more interested in the GGUF model format, for three reasons:
1. Python is not a great stack for AI inference.
2. We would like to get rid of the PyTorch and Python dependencies in production systems.
3. GGUF can support very efficient zero-Python inference using tools like llama.…
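Part of what makes GGUF friendly to zero-Python tooling is that it is a single self-describing binary file: it begins with a small fixed header (the 4-byte magic `GGUF`, a `u32` format version, then `u64` tensor and metadata key-value counts, all little-endian in current versions), followed by the metadata and tensor data. A minimal, illustrative Rust sketch of parsing just that fixed header (not part of any official library):

```rust
// Hypothetical minimal parser for the fixed-size GGUF header.
// Layout (GGUF v2/v3): 4-byte magic "GGUF", u32 version,
// u64 tensor count, u64 metadata key-value count, little-endian.
fn read_gguf_header(data: &[u8]) -> Option<(u32, u64, u64)> {
    if data.len() < 24 || data[..4] != *b"GGUF" {
        return None; // not a GGUF file (or truncated)
    }
    let version = u32::from_le_bytes(data[4..8].try_into().ok()?);
    let n_tensors = u64::from_le_bytes(data[8..16].try_into().ok()?);
    let n_kv = u64::from_le_bytes(data[16..24].try_into().ok()?);
    Some((version, n_tensors, n_kv))
}

fn main() {
    // Build a tiny synthetic header to illustrate the layout;
    // the counts here are made up for the example.
    let mut hdr = b"GGUF".to_vec();
    hdr.extend_from_slice(&3u32.to_le_bytes());   // version 3
    hdr.extend_from_slice(&291u64.to_le_bytes()); // tensor count
    hdr.extend_from_slice(&19u64.to_le_bytes());  // metadata kv count
    println!("{:?}", read_gguf_header(&hdr)); // Some((3, 291, 19))
}
```

Because everything an inference engine needs (hyperparameters, tokenizer, quantized weights) sits behind this one header, a runtime can load the model with plain file reads and no Python at all.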
-
WasmEdge at WasmCon: Dive into Keynotes, Tech Talks, and a Hands-on Workshop
WasmCon, the highly anticipated tech conference dedicated to WebAssembly (Wasm), is just a week away, running September 6th to 8th! The WasmEdge community is thrilled to be a part of it! Our maintainers and community contributors have several talks and a hands-on workshop covering high-performance networking, AI inference, LLMs, serverless functions, and microservices. If you are interested in Wasm technology or business use cases, do not miss these talks, either in person next week in Seattle or online later!…
-
🤖 The Last Chance to Get into LFX Mentorship in 2023
The LFX Mentorship Term 3 (from September to November) is open for applications! Contributing to WasmEdge through a paid LFX / CNCF internship program is a sure way to future-proof your resume and skill set! If you’re interested in making contributions to CNCF-hosted projects like WasmEdge, grab the last opportunity in 2023. You will gain the following benefits when you contribute to CNCF-hosted projects like WasmEdge through the LFX Mentorship program.…
-
Prompt AI for Different Everyday Tasks: Your Customizable Personal Assistants
Introduction: In today's rapidly advancing large language model landscape, people are anxious about losing jobs to AI. Yet we are increasingly integrating AI into our daily lives. From assisting with reading, writing, drawing, and traveling to learning new things, we want to make AI help us better instead of replacing us. We proactively embrace LLMs like ChatGPT for different needs. A common challenge is that when using ChatGPT, we are logged out every 15 minutes, and the answer histories we have can become very unorganized and jumbled.…
-
WasmEdge 0.12.1 released with new plugin system, new Wasm APIs for AI, advanced socket networking, etc
WasmEdge 0.12.0 and 0.12.1 are released. The two releases bring a host of new features, optimizations, and bug fixes that further enhance the performance, security, and versatility of WasmEdge. Key features:
- A new plugin system makes it easy for the community to add features to WasmEdge
- New Wasm APIs for AI, observability, and networking through plugins
- Advanced socket networking
- Better embedding through improved host SDKs
- Performance and compatibility enhancements
New plugin system: A WasmEdge plugin C API is introduced in WasmEdge 0.…