-
Getting Started with Gemma-1.1-7b-it
Gemma-1.1-7b-it, along with Gemma-1.1-2b-it, is a freshly released update to Gemma 1.0. Gemma 1.1 appears to be an improvement over Gemma 1.0, particularly in its ability to understand ambiguous language. BBQ Ambig (the ambiguous-context portion of the Bias Benchmark for QA): This metric measures how well a model avoids biased answers when the context of a question is ambiguous. The higher the score, the better. Gemma 1.1 2B shows a significant improvement over Gemma 1.0 2B, going from 62.58 to 86.…
-
WebAssembly on Kubernetes: from containers to Wasm (part 01)
Community blog by Seven Cheng. WebAssembly (Wasm) was originally created for the browser, but it has become increasingly popular on the server side as well. In my view, WebAssembly is gaining popularity in the Cloud Native ecosystem due to its advantages over containers, including smaller size, faster speed, enhanced security, and greater portability. In this article, I will provide a brief introduction to WebAssembly and explain its advantages. Then, in the next article, I will discuss how Wasm modules can be executed using container tooling, including low-level container runtimes, high-level container runtimes, and Kubernetes.…
-
Talk to WasmEdge at WasmIO 2024 in Barcelona and KubeCon EU 2024 in Paris
WasmEdge is set to make a splash at two of the most awaited tech events of the year, WasmIO 2024 in Barcelona and KubeCon EU 2024 in Paris. With a series of engaging talks, workshops, and presentations lined up, these appearances highlight the increasing importance of efficient, portable AI/LLM inference and cloud-native technologies in today's fast-evolving digital landscape. From deep dives into cloud-native WebAssembly to innovative strategies for log processing and building business models around open-source projects, WasmEdge's contributions are poised to offer invaluable insights and practical solutions for developers, entrepreneurs, and tech enthusiasts looking to leverage the full potential of Wasm and AI technologies on the edge cloud and beyond.…
-
Getting Started with Llava-v1.6-Vicuna-7B
Llava-v1.6-Vicuna-7B is the open-source community's answer to OpenAI's multimodal GPT-4-V. It is also known as a Visual Language Model for its ability to handle images and language in a conversation. The model is based on lmsys/vicuna-7b-v1.5. In this article, we will cover how to create an OpenAI-compatible API service for Llava-v1.6-Vicuna-7B. We will use LlamaEdge (the Rust + Wasm stack) to develop and deploy applications for this model. There are no complex Python packages or C++ toolchains to install!…
-
LlamaEdge released v0.4.0, adding RAG and Llava support
LlamaEdge v0.4.0 is out! Key enhancements: support for the Llava series of VLMs (Visual Language Models), including Llava 1.5 and Llava 1.6; support for RAG services (i.e., OpenAI Assistants API) in the LlamaEdge API server; simplified run-llm.sh script interactions to improve the onboarding experience for new users. Support for the Llava series of multimodal models: Llava is an open-source Visual Language Model (VLM). It supports multi-modal conversations, where the user can insert an image into a conversation and have the model answer questions based on the image.…
-
Getting Started with Qwen1.5-72B-Chat
Qwen1.5-72B-Chat, developed by Alibaba Cloud, has the following improvements over the previously released Qwen model, according to its Hugging Face page: significant performance improvement in human preference for chat models; multilingual support in both base and chat models; stable support of 32K context length for models of all sizes. It surpasses GPT-4 in 4 out of 10 benchmarks, based on a figure on Qwen's GitHub page. In this article, taking Qwen1.5-72B-Chat as an example, we will cover…
-
Getting Started with Gemma-2b-it
Google open-sourced its Gemma model family yesterday, finally joining the open-source movement in large language models. Gemma-2b-it, like the Gemma-7b-it model we discussed earlier, is designed for a range of text generation tasks like question answering, summarization, and reasoning. These lightweight, state-of-the-art models are built on the same technology as the Gemini models, offering text-to-text, decoder-only capabilities. They are available in English, with open weights, pre-trained variants, and instruction-tuned versions, making them suitable for deployment in resource-constrained environments.…
-
Win a Free Linux Foundation Certification Exam or Course Voucher by Contributing to WasmEdge
As a proud CNCF silver member, Second State is excited to offer 10 free vouchers (valued from $300 to $600 each) for a Linux Foundation exam or training course to new WasmEdge contributors. By gifting these vouchers, we hope to encourage further open-source contributions and support developers in gaining new skills and knowledge. All you need to do is make a contribution to the WasmEdge project! Rules: 10 new contributors to any of the repos under the WasmEdge org between Feb 20th and August 30th, 2024 will win a voucher code to claim a free training course or certification exam from the Linux Foundation Training & Certification Catalog. WasmEdge's LFX Mentorship, GSoC, OSPP, and GSoD mentees during this period, as well as Second State's paid interns and employees, are excluded.…
-
Getting Started with Gemma-7b-it
*The Gemma 7b model is currently experiencing some issues. Please come back and try again later. Google announced the Gemma models Gemma-2b-it and Gemma-7b-it yesterday. Google's Gemma model family is designed for a range of text generation tasks like question answering, summarization, and reasoning. These lightweight, state-of-the-art models are built on the same technology as the Gemini models, offering text-to-text, decoder-only capabilities. They are available in English, with open weights, pre-trained variants, and instruction-tuned versions, making them suitable for deployment in resource-constrained environments.…
-
Getting Started with Qwen1.5-0.5B-Chat
Qwen1.5-0.5B-Chat, developed by Alibaba Cloud, is a beta version of Qwen2, a transformer-based language model pretrained on a large amount of data. It offers improved chat performance, multilingual support, and stable support for 32K context length for models of all sizes. The model is designed for text generation and can be used for tasks like post-training and continued pretraining. In this article, taking Qwen1.5-0.5B-Chat as an example, we will cover…