AI inference is a computationally intensive task that could benefit greatly from the speed of Rust and WebAssembly. However, the standard WebAssembly sandbox provides very limited access to the native OS and hardware, such as multi-core CPUs, GPUs, and specialized AI inference chips, making it a poor fit for AI workloads on its own.
The popular WebAssembly System Interface (WASI) provides a design pattern for sandboxed WebAssembly programs to securely access native host functions. The WasmEdge Runtime extends the WASI model to give WebAssembly programs access to native TensorFlow libraries. It combines the security, portability, and ease of use of WebAssembly with native TensorFlow speed.
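To illustrate the host-function pattern this builds on, here is a minimal sketch: the WebAssembly program declares an external function, and the host runtime supplies the native implementation at instantiation time. The import module and function names below are hypothetical, not the actual WasmEdge TensorFlow import symbols.

// Minimal sketch of the host-function pattern. The "host_math" module
// and "host_add" symbol are illustrative, not real WasmEdge imports.
#[link(wasm_import_module = "host_math")]
extern "C" {
    // Implemented natively by the host runtime, outside the sandbox.
    fn host_add(a: i32, b: i32) -> i32;
}

fn main() {
    // The call crosses the sandbox boundary through a declared,
    // auditable interface -- the same pattern the TensorFlow
    // extension uses to reach native inference libraries.
    let sum = unsafe { host_add(2, 3) };
    println!("2 + 3 = {}", sum);
}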
A Rust example
Prerequisite
You need to install WasmEdge and Rust.
Build
$ rustup target add wasm32-wasi
$ cargo build --target wasm32-wasi --release
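The example depends on the wasmedge_tensorflow_interface crate used in the walkthrough below. A sketch of the relevant Cargo.toml section (the version number is illustrative; check crates.io for the current release):

[dependencies]
wasmedge_tensorflow_interface = "0.2"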
Run
The wasmedge-tensorflow-lite utility is the WasmEdge build that includes the TensorFlow and TensorFlow Lite extensions.
$ wasmedge-tensorflow-lite target/wasm32-wasi/release/classify.wasm < grace_hopper.jpg
It is very likely a <a href='https://www.google.com/search?q=military uniform'>military uniform</a> in the picture
Make it run faster
To make TensorFlow inference run much faster, you can ahead-of-time (AOT) compile the WebAssembly program down to native machine code, and then run the native code inside the WasmEdge sandbox.
$ wasmedgec-tensorflow target/wasm32-wasi/release/classify.wasm classify.so
$ wasmedge-tensorflow-lite classify.so < grace_hopper.jpg
It is very likely a <a href='https://www.google.com/search?q=military uniform'>military uniform</a> in the picture
Code walkthrough
It is fairly straightforward to use the WasmEdge TensorFlow API. You can see the entire source code in main.rs.
First, it reads the trained TensorFlow Lite (TFLite) model file (a MobileNet model trained on ImageNet) and its label file. The label file maps the model's numeric outputs to English names for the classified objects.
let model_data: &[u8] = include_bytes!("models/mobilenet_v1_1.0_224/mobilenet_v1_1.0_224_quant.tflite");
let labels = include_str!("models/mobilenet_v1_1.0_224/labels_mobilenet_quant_v1_224.txt");
Next, it reads the image from STDIN and converts it to the size and RGB pixel arrangement required by the TensorFlow Lite model.
use std::io::{self, Read};

let mut buf = Vec::new();
io::stdin().read_to_end(&mut buf).unwrap();
// Decode the JPEG and resize it to 224x224 RGB pixels, the input shape the MobileNet model expects.
let flat_img = wasmedge_tensorflow_interface::load_jpg_image_to_rgb8(&buf, 224, 224);
Then, the program runs the TFLite model with its required input tensor (i.e., the flat image in this case) and receives the model output. Here, the output is an array of numbers, one for each object name in the label text file. Because the model is quantized, each number is a u8 score out of 255 representing the probability of that object.
let mut session = wasmedge_tensorflow_interface::Session::new(&model_data, wasmedge_tensorflow_interface::ModelType::TensorFlowLite);
session.add_input("input", &flat_img, &[1, 224, 224, 3])
.run();
let res_vec: Vec<u8> = session.get_output("MobilenetV1/Predictions/Reshape_1");
Let's find the object with the highest probability, and then look up the name in the labels file.
let mut i = 0;
let mut max_index: i32 = -1;
let mut max_value: u8 = 0;
while i < res_vec.len() {
let cur = res_vec[i];
if cur > max_value {
max_value = cur;
max_index = i as i32;
}
i += 1;
}
let mut label_lines = labels.lines();
for _i in 0..max_index {
label_lines.next();
}
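As an aside, the linear search above can be expressed more compactly with Rust's iterator adapters. This is an equivalent sketch, not the code in main.rs (note that on ties it keeps the last maximum rather than the first):

// Find the index and value of the highest-scoring label in one pass.
let (max_index, max_value) = res_vec
    .iter()
    .enumerate()
    .max_by_key(|&(_idx, &val)| val)
    .map(|(idx, &val)| (idx, val))
    .unwrap_or((0, 0));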
Finally, it prints the result to STDOUT.
let class_name = label_lines.next().unwrap().to_string();
// Map the quantized u8 score to a confidence phrase (the thresholds here are illustrative).
let confidence = if max_value > 200 { "is very likely" } else { "could be" };
if max_value > 50 {
    println!("It {} a <a href='https://www.google.com/search?q={}'>{}</a> in the picture", confidence, class_name, class_name);
} else {
    println!("It does not appear to be any object we know in the picture.");
}
A JavaScript example
Prerequisite
You need to install WasmEdge. You also need the QuickJS interpreter for WasmEdge, which is the qjs_tf.wasm file in the WasmEdge repo. You can also build your own qjs_tf.wasm from the wasmedge-quickjs project.
Run
The wasmedge-tensorflow-lite utility is the WasmEdge build that includes the TensorFlow and TensorFlow Lite extensions.
$ cd <WasmEdge>/tools/wasmedge/examples/js
# Download the TensorFlow example files
$ wget https://raw.githubusercontent.com/second-state/wasmedge-quickjs/main/example_js/tensorflow_lite_demo/aiy_food_V1_labelmap.txt
$ wget https://raw.githubusercontent.com/second-state/wasmedge-quickjs/main/example_js/tensorflow_lite_demo/food.jpg
$ wget https://raw.githubusercontent.com/second-state/wasmedge-quickjs/main/example_js/tensorflow_lite_demo/lite-model_aiy_vision_classifier_food_V1_1.tflite
$ wget https://raw.githubusercontent.com/second-state/wasmedge-quickjs/main/example_js/tensorflow_lite_demo/main.js
$ wasmedge-tensorflow-lite --dir .:. qjs_tf.wasm main.js
label: Hot dog
confidence: 0.8941176470588236
Code walkthrough
It is fairly straightforward to use the WasmEdge JavaScript TensorFlow API. You can see the entire source code in main.js.
First, it reads the image from a file and converts it to the size and RGB pixel arrangement required by the TensorFlow Lite model.
let img = new Image('food.jpg')
let img_rgb = img.to_rgb().resize(192,192)
let rgb_pix = img_rgb.pixels()
Then, the program runs the TFLite model with its required input tensor (i.e., the image's pixel data in this case) and receives the model output. Here, the output is an array of numbers, one for each label in the label text file. Because the model is quantized, each number is a u8 score out of 255 representing the probability of that label.
let session = new TensorflowLiteSession('lite-model_aiy_vision_classifier_food_V1_1.tflite')
session.add_input('input',rgb_pix)
session.run()
let output = session.get_output('MobilenetV1/Predictions/Softmax');
let output_view = new Uint8Array(output)
Let's find the object with the highest probability, and then look up the name in the labels file.
let max = 0;
let max_idx = 0;
for (let i = 0; i < output_view.length; i++) {
    let v = output_view[i];
    if (v > max) {
        max = v;
        max_idx = i;
    }
}
let label_file = std.open('aiy_food_V1_labelmap.txt','r')
let label = ''
for(var i = 0; i <= max_idx; i++){
label = label_file.getline()
}
label_file.close()
Finally, it prints the result to the console. The confidence is computed as max/255 because the quantized model reports each probability as a u8 score.
print('label:')
print(label)
print('confidence:')
print(max/255)
Deployment options
All the tutorials below use the WasmEdge Rust SDK for TensorFlow to create AI inference functions. Those Rust functions are then compiled to WebAssembly and deployed, together with WasmEdge, on the cloud. If you are not familiar with Rust, you can try our experimental AI inference DSL.
Serverless functions
The following tutorials showcase how to deploy WebAssembly programs (written in Rust) on public cloud serverless platforms. The WasmEdge Runtime runs inside a Docker container on those platforms. Each serverless platform provides APIs to get data into and out of the WasmEdge runtime through STDIN and STDOUT.
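In other words, a function deployed this way is just a WebAssembly program with the following shape. This is a minimal sketch, with the inference logic and response bytes left as placeholders:

use std::io::{self, Read, Write};

fn main() {
    // The serverless platform passes the request body in via STDIN ...
    let mut input = Vec::new();
    io::stdin().read_to_end(&mut input).unwrap();
    // ... the program runs its inference logic on `input` here ...
    // ... and the platform captures whatever is written to STDOUT as the response.
    io::stdout().write_all(b"result").unwrap();
}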
Second State FaaS and Node.js
The following tutorials showcase how to deploy WebAssembly functions (written in Rust) on the Second State FaaS. Since the FaaS service runs on Node.js, you can follow the same tutorials to run those functions in your own Node.js server.
- TensorFlow: Image classification using the MobileNet models | Live demo
- TensorFlow: Face detection using the MTCNN models | Live demo
Service mesh
The following tutorials showcase how to deploy WebAssembly functions and programs (written in Rust) as sidecar microservices.
- The Dapr template shows how to build and deploy Dapr sidecars in Go and Rust. The sidecars then use the WasmEdge SDK to start WebAssembly programs to process workloads for the microservices.
Data streaming framework
The following tutorials showcase how to deploy WebAssembly functions (written in Rust) as embedded handler functions in data streaming frameworks for AIoT.
- The YoMo template starts the WasmEdge Runtime to process image data as the data streams in from a camera in a smart factory.