State of Machine Learning in Rust

Every once in a while this topic comes up on social media or in a Rust user channel. I’d like to briefly describe where I see things going, with a little bit of history as well as some information about the current flux of Machine Learning/Deep Learning frameworks and major recent trends.

Brief history and where we are now

Existing ML/DL ecosystems are huge because they combine High Performance Computing, Mathematical Optimization, Systems and Compiler Engineering, and more. For the sake of simplicity, if we go by the common breakdown of ML into traditional ML vs. DL (overlap included), then rusty-machine and rustlearn on one side and leaf on the other come to mind. They were very interesting and bold efforts, leaf in particular for its time, but eventually they were mostly abandoned because creating a complete open-source ML/DL framework is a huge task, one that requires

  • Support from several language features (more on this in a bit)
  • Mature base linear algebra and statistics crates
  • A community of ML specialists who happen to know Rust and are willing to contribute

The dominant ML libraries (mostly in Python/Cython or C++) were built on all of these foundations, and Rust will need them too.

Language support and crates

A while ago Gonzalo put up a list of HPC requirements. As of now, we can say Rust supports most of the items either as language features (stable or unstable) or in crates, and hopefully by the end of this year we will see even more support. Const generics (for good array support), a stable std::simd, native GPU support, async and so on are still works in progress. Some workarounds and existing solutions are generic-array (using typenum), packed_simd and RustaCUDA. For MPI there’s an MPI binding, and for OpenMP there’s rayon.
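To make the OpenMP comparison a bit more concrete, here is a minimal sketch of the fork-join pattern that a parallel-for style reduction boils down to: split the data, reduce each piece on its own thread, then combine the partial results. It deliberately uses only the standard library (`std::thread::scope`, stable since Rust 1.63) instead of rayon, so it says nothing about rayon’s actual API or its work-stealing scheduler. The names `parallel_sum_of_squares` and `n_chunks` are made up for this illustration.

```rust
use std::thread;

/// Sum of squares computed fork-join style: split `data` into at most
/// `n_chunks` pieces, reduce each piece on its own scoped thread, then
/// combine the partial sums on the calling thread.
///
/// Hypothetical helper for illustration only; it is not taken from rayon
/// or any other crate.
fn parallel_sum_of_squares(data: &[f64], n_chunks: usize) -> f64 {
    if data.is_empty() {
        return 0.0;
    }
    // Guard against a zero chunk count, then use ceiling division so that
    // every element lands in some chunk and the chunk length is never zero.
    let n_chunks = n_chunks.max(1);
    let chunk_len = (data.len() + n_chunks - 1) / n_chunks;

    thread::scope(|s| {
        // Fork: one scoped thread per chunk. Scoped threads may borrow
        // `data` directly because they are joined before the scope ends.
        let handles: Vec<_> = data
            .chunks(chunk_len)
            .map(|chunk| s.spawn(move || chunk.iter().map(|x| x * x).sum::<f64>()))
            .collect();

        // Join: combine the partial sums.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .sum::<f64>()
    })
}
```

Calling `parallel_sum_of_squares(&[1.0, 2.0, 3.0, 4.0], 2)` splits the slice into two chunks of two elements each. Spawning one OS thread per chunk is fine for a sketch, but a real workload would want a thread pool, which is exactly the part a crate like rayon takes care of.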

Linear algebra base

Are we learning yet? tracks most of the signals in this area, and a simple search over crates.io will tell you that we still have a lot of ground to cover. So when it comes to production, Rust is not there yet!

Thanks to bluss, who initiated ndarray, and to its various contributors, ndarray has become the numpy of Rust, i.e. the base linear algebra crate (though there is still a lot to be done). Note that this layer is very fundamental, and simply wrapping BLAS/BLIS, LAPACK etc. is not enough!

ndarray has become the base of the Rust ML ecosystem, and others are building upon it, for example ndarray-linalg and ndarray-stats.
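One way to see why wrapping BLAS alone is not enough: a base array crate also has to own the data model, meaning shapes, strides and views that reinterpret a buffer without copying it. The following is a minimal sketch of that idea in plain Rust. It is not ndarray’s implementation or API; `View2`, `row_major`, `t` and `matvec` are hypothetical names chosen for this example.

```rust
/// A borrowed 2-D view over a flat buffer, described by a shape and by
/// strides (how many elements to skip per step along each axis).
/// Hypothetical type for illustration; not ndarray's actual data structure.
struct View2<'a> {
    data: &'a [f64],
    shape: (usize, usize),
    strides: (usize, usize),
}

impl<'a> View2<'a> {
    /// Interpret `data` as a `rows` x `cols` matrix in row-major order.
    fn row_major(data: &'a [f64], rows: usize, cols: usize) -> Self {
        assert_eq!(data.len(), rows * cols, "buffer length must match shape");
        View2 { data, shape: (rows, cols), strides: (cols, 1) }
    }

    /// Element at row `i`, column `j`, located through the strides.
    fn get(&self, i: usize, j: usize) -> f64 {
        assert!(i < self.shape.0 && j < self.shape.1, "index out of bounds");
        self.data[i * self.strides.0 + j * self.strides.1]
    }

    /// Transpose without copying: only the shape and strides are swapped,
    /// the underlying buffer is shared.
    fn t(&self) -> View2<'a> {
        View2 {
            data: self.data,
            shape: (self.shape.1, self.shape.0),
            strides: (self.strides.1, self.strides.0),
        }
    }

    /// Naive matrix-vector product that works for any strides, so it can
    /// be applied to a transposed view as well.
    fn matvec(&self, x: &[f64]) -> Vec<f64> {
        assert_eq!(x.len(), self.shape.1, "vector length must match columns");
        (0..self.shape.0)
            .map(|i| (0..self.shape.1).map(|j| self.get(i, j) * x[j]).sum::<f64>())
            .collect()
    }
}
```

With `let a = View2::row_major(&buf, 2, 3);`, the expression `a.t()` yields a 3 x 2 view over the same buffer, and `a.t().matvec(..)` works without any data being moved. A real crate additionally has to handle arbitrary dimensions, slicing, broadcasting, ownership and dispatch to optimized kernels, which is where most of the remaining work lies.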

Community

Looking back, it is fair to say people have been, more or less, experimenting with Rust for ML. I think the experimental phase will reach its final stage once Rust delivers the immediate requirements such as const generics, GATs, std::simd and GPU support. I think the community is getting bigger, and considering the collective efforts of the authors and contributors of the aforementioned crates, the number of ML specialists and enthusiasts is approaching the point where we can all get together and do interesting things: learning from and assessing the existing ecosystems (in particular Python’s) to create our own curated Rust ecosystem. I think it is time to create an ML Working Group, or at least for now, if you’re interested, you can join the rust-ml group to see how things turn out.

What about Deep Learning?

This is the area I’m most passionate about. The DL frontiers are pushing more and more into systems and compilers, so that heavier computations, graph-level optimizations, differentiation (aka differentiable programming), efficient codegen, kernel generation etc. happen at compile time. The major frontiers are TVM, tensorflow/swift and pytorch/glow (also pytorch with a TVM backend). So when it comes to Rust, none of these efforts can be ignored.

Therefore, a (short-term) solution is creating bindings. That’s what I did for TVM. Basically, we can train (mostly vision tasks for now) using any DL framework (TensorFlow, PyTorch, MXNet) or bridge some of them with ONNX, then compile using TVM for a variety of supported hardware, and for inference we can use our beloved Rust. I should also mention the existing bindings such as tensorflow/rust and tch-rs. The major problem with these bindings is that they’re limited. For example, tensorflow/rust does not have the higher-level abstractions that Python has now, and tch-rs is far from being safe.

Inference, in particular on edge devices, is one of the hottest areas. Another very interesting project that uses Rust for inference is tract, which has good support for TF and ONNX ops. I should mention that Google’s TFLite, Tencent’s NCNN and FeatherCNN, Xiaomi’s MACE and Microsoft’s ELL are all trying to push their own solutions, but frankly, they’re still limited to certain well-known tasks and are painful to use for a variety of other tasks.

You might ask, how about creating a DL framework in Rust from scratch? I’d say yes, but first read the source code of any major DL framework and try to catch up on the compiler developments. Then you’ll see that the pieces are moving fast and haven’t even converged to a relatively complete solution. Though it could work out as a very long-term solution, personally I’m not interested in it right now.

Final words

I love Rust for two main reasons

  • It is very community driven and offers solutions rarely or never seen before, while keeping the community healthy, where no ego rules and any input is welcome
  • The community, and in particular its leaders, have high EQ, which in my opinion is one of the most neglected cohesive forces in fruitful, long-lasting open-source communities

I would love to see Rust flourish in the ML/DL domains. There are still areas where it lacks a decent crate, such as visualization for ML types of workloads, but my bet is on Rust. I hope this post has cleared up where Rust is when it comes to ML/DL. For input from other people, please see the rust-ml discussion.

2 thoughts on “State of Machine Learning in Rust”

  1. Hey 🙂

    Thank you for the great post!
    What do you think would be a good place to help the rust-ML community?
    I am a Machine Learning Master’s student looking for a cool project
    for the summer, but I am new to Rust and can hardly estimate which
    project would be fine for a beginner.

    Best regards
    Chips

    1. That’s great 🙂

      Depending on your background, if you’re interested in tackling some fundamental challenges, it’d be helpful to work on ndarray (or similar) issues, read the source code and help with the docs. Or see if you can port a cool ML project written in another language to Rust, keeping the existing Rust limitations in mind. Searching crates.io could also give you some inspiration about what has been done before or needs improvement.

      Good luck
