If you’ve been following the Rust ecosystem throughout 2025, you know that the “Rust for Machine Learning” narrative has shifted from “Is it possible?” to “Which tool should I use for production?”
While Python remains the undisputed king of training and research, Rust has carved out a massive niche in inference and deployment. The promise is simple: type safety, fearless concurrency, and massive performance gains without the overhead of the Python Global Interpreter Lock (GIL).
But here is the dilemma every mid-to-senior Rust developer faces today: Do you stick with the battle-tested bindings of tch-rs (Libtorch), or do you embrace the pure-Rust approach of Hugging Face’s Candle?
In this article, we aren’t just reading docs. We are going to build equivalent models in both, analyze the compilation/runtime differences, and look at the architectural trade-offs. By the end, you’ll know exactly which crate belongs in your Cargo.toml.
The Landscape in Late 2025 #
Before we write code, let’s understand the architectural differences. This is crucial because it dictates your build pipeline, Docker image sizes, and deployment strategy.
1. tch-rs (The Wrapper) #
tch-rs provides Rust bindings for the C++ API of PyTorch (Libtorch).
- Pros: Access to virtually every operation PyTorch supports. If it works in Python, it likely works here.
- Cons: Heavy reliance on external C++ shared libraries. The build process can be painful (`LNK2001` linker errors, anyone?).
2. Candle (The Native) #
Candle is a minimalist ML framework written entirely in Rust by Hugging Face.
- Pros: Lightweight, compiles to WASM, zero C++ dependencies (unless you enable CUDA), “rusty” API design.
- Cons: Fewer implemented operators compared to PyTorch (though the gap is closing fast).
The key structural difference: with tch-rs, every call crosses an FFI boundary into the Libtorch C++ library before it reaches the hardware; with Candle, your code stays in Rust end to end, calling into CUDA or Metal kernels only when you enable GPU support.
Prerequisites and Setup #
To follow along, ensure you have a modern Rust toolchain installed (1.80+ recommended).
Environment Setup #
For tch-rs, you usually need to download Libtorch manually, though the crate can handle it automatically in some cases. For Candle, you just need Cargo.
Create a new project:
```bash
cargo new rust_ml_showdown
cd rust_ml_showdown
```

Update your Cargo.toml to include both for this comparison (in a real app, you’d pick one):
```toml
[package]
name = "rust_ml_showdown"
version = "0.1.0"
edition = "2021"

[dependencies]
# The Contender
tch = "0.18"          # Verify the latest version on crates.io
# The Challenger
candle-core = "0.8"
candle-nn = "0.8"
# Shared utilities
anyhow = "1.0"
hf-hub = "0.3"        # Used in the Hugging Face Hub example below; check crates.io for the current version
```

Round 1: The “Hello World” of Tensors #
Let’s look at the syntax. We will perform a simple matrix multiplication followed by a ReLU activation. This is the bread and butter of neural networks.
The tch-rs Approach #
If you come from PyTorch, this will feel incredibly familiar.
```rust
// src/bin/tch_example.rs
use tch::{Device, Tensor};

fn main() -> anyhow::Result<()> {
    // 1. Define device (CUDA if available, else CPU)
    let device = Device::cuda_if_available();
    println!("Running tch-rs on: {:?}", device);

    // 2. Create Tensors
    // A: 2x3 matrix
    let a = Tensor::from_slice(&[1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0])
        .view([2, 3])
        .to(device);
    // B: 3x2 matrix
    let b = Tensor::from_slice(&[0.1f32, 0.2, 0.3, 0.4, 0.5, 0.6])
        .view([3, 2])
        .to(device);

    // 3. Operations: Matmul -> ReLU
    // Note: tch uses method chaining heavily
    let result = a.matmul(&b).relu();

    // 4. Print result
    result.print();
    Ok(())
}
```

Observation: The API is imperative and eager. It feels like dynamic Python code, but statically typed. Notice `.view()` and `.to()`: exact parallels to PyTorch.
The Candle Approach #
Candle forces you to handle errors explicitly. There is no hidden panic if shapes mismatch; it returns a Result.
```rust
// src/bin/candle_example.rs
use candle_core::{Device, Tensor};
use anyhow::Result;

fn main() -> Result<()> {
    // 1. Define device (explicit choice)
    let device = Device::Cpu; // Or Device::new_cuda(0)?;
    println!("Running Candle on: {:?}", device);

    // 2. Create Tensors
    // Candle usually requires the shape to be explicit during creation
    let a = Tensor::new(&[1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0], &device)?
        .reshape((2, 3))?;
    let b = Tensor::new(&[0.1f32, 0.2, 0.3, 0.4, 0.5, 0.6], &device)?
        .reshape((3, 2))?;

    // 3. Operations
    // Note: Explicit error handling with `?`
    let result = a.matmul(&b)?.relu()?;

    // 4. Print
    println!("Result:\n{}", result);
    Ok(())
}
```

Observation: Candle is more verbose regarding error handling (`?` everywhere). This is a good thing for production. In tch-rs, a shape mismatch often crashes the C++ runtime, causing a panic (or, in the worst case, a segmentation fault). In Candle, it’s a manageable Rust error.
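To make that concrete, here is a minimal sketch (reusing the same tensor-creation calls as above) that deliberately builds mismatched shapes and handles the failure as an ordinary `Result` instead of letting anything crash:

```rust
use candle_core::{Device, Tensor};

fn main() -> anyhow::Result<()> {
    let device = Device::Cpu;

    // (2, 3) matrix, same as in the example above.
    let a = Tensor::new(&[1.0f32, 2.0, 3.0, 4.0, 5.0, 6.0], &device)?.reshape((2, 3))?;
    // (2, 2) matrix: the inner dimensions (3 vs 2) do not line up.
    let bad = Tensor::new(&[1.0f32, 2.0, 3.0, 4.0], &device)?.reshape((2, 2))?;

    // The mismatch surfaces as an Err value we can log, retry, or map to an HTTP 400.
    match a.matmul(&bad) {
        Ok(t) => println!("unexpected success: {t}"),
        Err(e) => eprintln!("shape mismatch caught as a normal Rust error: {e}"),
    }
    Ok(())
}
```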
Round 2: Defining a Neural Network #
Let’s step it up. How do we define a reusable model layer?
tch-rs: The nn::Module Style #
tch-rs uses a “Path” pattern to register variables.
```rust
use tch::{nn, nn::Module, Tensor};

struct SimpleNet {
    fc1: nn::Linear,
    fc2: nn::Linear,
}

impl SimpleNet {
    fn new(vs: &nn::Path, in_dim: i64, hidden_dim: i64, out_dim: i64) -> SimpleNet {
        SimpleNet {
            fc1: nn::linear(vs / "fc1", in_dim, hidden_dim, Default::default()),
            fc2: nn::linear(vs / "fc2", hidden_dim, out_dim, Default::default()),
        }
    }
}

// Forward pass trait
impl Module for SimpleNet {
    fn forward(&self, xs: &Tensor) -> Tensor {
        xs.apply(&self.fc1).relu().apply(&self.fc2)
    }
}
```

Candle: The Struct-Based Style #
Candle separates the variable builder (VarBuilder) from the model logic cleanly.
```rust
use candle_core::{Result, Tensor};
use candle_nn::{linear, Linear, Module, VarBuilder};

struct SimpleNet {
    fc1: Linear,
    fc2: Linear,
}

impl SimpleNet {
    fn new(vs: VarBuilder, in_dim: usize, hidden_dim: usize, out_dim: usize) -> Result<Self> {
        let fc1 = linear(in_dim, hidden_dim, vs.pp("fc1"))?;
        let fc2 = linear(hidden_dim, out_dim, vs.pp("fc2"))?;
        Ok(Self { fc1, fc2 })
    }

    fn forward(&self, xs: &Tensor) -> Result<Tensor> {
        let x = self.fc1.forward(xs)?;
        let x = x.relu()?;
        self.fc2.forward(&x)
    }
}
```

Verdict: The syntax is remarkably similar. However, Candle’s use of `usize` for dimensions (standard Rust) vs tch’s `i64` (C++ heritage) makes Candle feel more native.
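For completeness, here is a hedged sketch of how you might actually instantiate and run the Candle version of `SimpleNet` defined above. It uses `VarBuilder::zeros`, which hands out zero-initialized parameters; that is enough to smoke-test the forward pass before real weights are loaded:

```rust
use candle_core::{DType, Device, Tensor};
use candle_nn::VarBuilder;

fn main() -> anyhow::Result<()> {
    let device = Device::Cpu;

    // Zero-initialized parameters: useful for checking that shapes line up.
    let vb = VarBuilder::zeros(DType::F32, &device);
    let model = SimpleNet::new(vb, 4, 8, 2)?;

    // A single dummy input row with 4 features.
    let input = Tensor::zeros((1, 4), DType::F32, &device)?;
    let output = model.forward(&input)?;
    println!("output shape: {:?}", output.dims()); // expected: [1, 2]
    Ok(())
}
```

The tch-rs equivalent would create an `nn::VarStore`, pass `&var_store.root()` into `SimpleNet::new`, and call `forward` directly, since the weights there live inside the VarStore rather than a VarBuilder.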
Round 3: Feature & Performance Comparison #
This is where the decision is usually made. I’ve compiled a comparison based on current benchmarks and developer experience.
| Feature | tch-rs (Libtorch) | Candle (Hugging Face) |
|---|---|---|
| Backend | C++ Libtorch Bindings | Pure Rust (Core) + CUDA/Metal Kernels |
| Compilation Speed | 🐢 Slow (Linker heavy) | 🐇 Fast |
| Binary Size | Huge (>100MB with Libtorch) | Tiny (Small static binary) |
| WASM Support | No (Very difficult) | ✅ First-class citizen |
| Model Support | Excellent (Anything PyTorch) | Good (LLMs, Bert, Whisper, SD) |
| Hugging Face Hub | Manual Download logic | Integrated (hf-hub crate) |
| Developer Experience | Dynamic-ish, Panics | Type-safe, Result-based |
| Deployment | Requires shared libs on host | Copy single binary & run |
The “Build Time” Trap #
When working with tch-rs, your CI/CD pipeline becomes complex. You must ensure the `LIBTORCH` environment variable matches the CUDA version on the runner. With Candle, you just run `cargo build --release`.
Performance #
- Inference (CPU): Candle is often faster thanks to specialized SIMD optimizations for specific models (notably quantized LLM inference).
- Inference (GPU): tch-rs still holds a slight edge for generic networks, because Libtorch’s CUDA kernels have been tuned by NVIDIA/Facebook for over a decade. However, for LLMs (Llama, Mistral), Candle’s custom kernels are competitive. A quick way to sanity-check these claims on your own hardware is sketched below.
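These numbers vary widely with model, batch size, and hardware, so measure on your own machine. Below is a minimal micro-benchmark sketch in Candle: an arbitrary 512x512 matmul + ReLU timed with `std::time::Instant`. Swap in your real model’s forward pass for meaningful results.

```rust
use candle_core::{Device, Tensor};
use std::time::Instant;

fn main() -> anyhow::Result<()> {
    let device = Device::Cpu;
    let a = Tensor::randn(0f32, 1.0, (512, 512), &device)?;
    let b = Tensor::randn(0f32, 1.0, (512, 512), &device)?;

    // Warm-up pass so allocations and caches don't skew the first measurement.
    let _ = a.matmul(&b)?.relu()?;

    let iters: u32 = 100;
    let start = Instant::now();
    for _ in 0..iters {
        // (On a GPU device you would also want to synchronize before and after timing.)
        let _ = a.matmul(&b)?.relu()?;
    }
    println!("avg per iteration: {:?}", start.elapsed() / iters);
    Ok(())
}
```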
Best Practices and Common Pitfalls #
1. Shape Checking #
In ML, 90% of bugs are shape mismatches.
- tch-rs: Use `tensor.size()` frequently to debug.
- Candle: Use `tensor.dims()`.
- Tip: Create a macro to assert shapes in debug builds for both libraries (a sketch follows below).
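Here is one possible shape for such a macro, written against Candle’s `dims()`; for tch-rs you would compare against `tensor.size()` (which returns `Vec<i64>`) instead. Treat it as a sketch rather than a canonical utility:

```rust
use candle_core::{DType, Device, Tensor};

/// Debug-only shape assertion; compiled out in release builds.
macro_rules! assert_shape {
    ($t:expr, $expected:expr) => {
        debug_assert_eq!(
            $t.dims(),
            $expected,
            "unexpected shape for `{}`",
            stringify!($t)
        );
    };
}

fn main() -> anyhow::Result<()> {
    let x = Tensor::zeros((2, 3), DType::F32, &Device::Cpu)?;
    assert_shape!(x, &[2, 3]); // passes
    // assert_shape!(x, &[3, 2]); // would panic in debug builds with a readable message
    Ok(())
}
```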
2. Loading Weights (The .safetensors Revolution) #
Forget .bin (Pickle). Use .safetensors. The format was created by Hugging Face with its reference implementation written in Rust, and it is natively supported by Candle.
- Candle can memory-map `.safetensors` files, leading to near-instant model loading.
- tch-rs can load them too, but it feels like a second-class citizen compared to native PyTorch checkpoints.
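As a rough sketch of what inspecting a checkpoint looks like in Candle: the snippet below uses `candle_core::safetensors::load`, which reads the tensors eagerly into memory (the memory-mapped route goes through `VarBuilder::from_mmaped_safetensors`, shown commented out in the Hugging Face example further down). The file path here is hypothetical.

```rust
use candle_core::{safetensors, Device};

fn main() -> anyhow::Result<()> {
    // Hypothetical local checkpoint; in practice the path comes from hf-hub (see below).
    let tensors = safetensors::load("model.safetensors", &Device::Cpu)?;

    // Print a few tensor names, shapes, and dtypes to sanity-check the checkpoint.
    for (name, tensor) in tensors.iter().take(5) {
        println!("{name}: {:?} ({:?})", tensor.dims(), tensor.dtype());
    }
    Ok(())
}
```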
3. Memory Management #
- tch-rs: Uses C++ `shared_ptr` under the hood. Rust’s `Drop` handles cleanup, but if you create reference cycles in graphs, you might leak.
- Candle: Pure Rust ownership rules apply. It’s much harder to leak memory.
Example: Loading a Model from Hugging Face #
This is the most common use case in 2025: downloading a pre-trained BERT or Llama model and running it.
Here is how clean it is in Candle:
```rust
use hf_hub::{api::sync::Api, Repo, RepoType};

fn main() -> anyhow::Result<()> {
    // 1. Get the file path from the Hugging Face Hub (downloads and caches it)
    let api = Api::new()?;
    let repo = api.repo(Repo::new("bert-base-uncased".to_string(), RepoType::Model));
    let weights_path = repo.get("model.safetensors")?;
    println!("Weights downloaded to: {:?}", weights_path);

    // 2. Load weights using Candle
    // Note: We need a VarBuilder to map names in the safetensors file to the model:
    // let vb = unsafe {
    //     VarBuilder::from_mmaped_safetensors(&[weights_path], DType::F32, &Device::Cpu)?
    // };
    // From here, you would pass `vb` to your model struct's new() method.
    Ok(())
}
```

Doing this in tch-rs usually involves manually downloading the file with reqwest, ensuring it’s in a format Libtorch understands, and then loading it.
Conclusion: Which One Should You Choose? #
As we head into 2026, the recommendation is becoming clearer.
Choose tch-rs if:
- You are porting a complex, custom research model from Python code that uses obscure PyTorch operators.
- You need 100% numerical parity with PyTorch for validation purposes.
- Binary size and compile times are not a concern (e.g., on-premise dedicated servers).
Choose Candle if:
- You are deploying to production. This is the big one. Small binaries, no shared library headaches, Docker-friendly.
- You are working with LLMs (Llama, Mistral, Gemma) or standard Computer Vision models (ResNet, YOLO).
- You want to run on Edge devices or the Browser (WASM).
- You prefer idiomatic Rust (Results over Panics).
At Rust DevPro, we have largely migrated our inference microservices to Candle. The CI/CD simplification alone saved us hours of debugging linker errors.
Have you switched your ML pipeline to Rust yet? Let us know in the comments below!