How do robots see? Beam models in robotics

Table of Contents

Note: This is a reupload of a post I published on a previous iteration of this blog in 2021.

In this post, I’ll be taking a detour from computer vision and cameras to talk about range sensors in robotics. I know, I’m proud of me too. In comparison to a camera, a robot with range sensors can ‘see’ their environment by emitting a beam and recording its echo. In the case of sensors like LiDAR, this signal is a light beam and the robot gets information by measuring the time that the light takes to travel to and from objects along a vector. In ultrasound, this beam is a sound wave.

Why do you care? Range sensors are interesting because they can operate in conditions where a camera would be awkward, such as low light. Sensors like LiDAR are also very accurate compared to cameras (or models to estimate pose from camera images) because the light conditions in general don’t affect them. Never again will someone take down your entire ML system by shining a light in your robot’s eyes 👁️💧

Right, let’s look at a simple model to illustrate this.

A basic measurement algorithm #

At time $t$, a range sensor will usually generate multiple ($K$) scans for one measurement, $z$.

$$ z_{t} = {z_{t}^{^1}, …, z_{t}^{K}} $$

I’ll use $z_{t}^{k}$ to denote one range value within the measurement.

In a simplified model we assume all of the measurement values are independent of one another given the robot’s pose $x_{t}$ and its map of the environment $m$. So the probability of this is

$$ p(z_{t} | x_{t}, m) = \prod_{k = 1}^{K} p(z_{t}^{k} | x_{t}, m) $$

As you can imagine, the independence (Markov) assumption doesn’t always hold in real-world scenarios. One example is a small angle between beams hitting a very large object. Both beams would have a very similar measurement. But you don’t need to consider that in this simplified case.

Measurement error #

A probabilistic beam model will include 4 types of measurement errors:

Small measurement noise with the correct range. This error is approximately the correct range, but factors like the atmosphere stop it being perfect. Therefore we model this by a narrow Gaussian.
Unexpected objects. If a cat suddenly pops up in front of your robot as it patrols the room, it’s probably not in the robots map 🐈. So this error is for unexpectedly short ranges. This is modelled as an exponential distribution, as the liklihood of this affecting the measurement decreases with range (hah).
Failures to detect objects. If a LiDAR sensor throws a beam onto a light-absorbing object, or an object that’s just too bright, the measurement will be missed altogether. In this case, the sensor will return the maximum range value. This way we can model it as a pretend point-mass distribution (it’s discrete, so not a real probability distribution) centered around the maximum value.
Random noise. Some sensor measurements just don’t make sense. It’s a fact of life. This can be things like cross-talk and phantom recordings. So we measure this as a uniform distribution over the entire sensors range.

The model we want, $p( z_{t} | x_{t}, m)$ is a weighted mixture of these 4 densities.

$$ p(z_{t} | x_{t}, m) = \prod_{k = 1}^{K}\textcolor{#a316e0}{z_{hit} \cdot p_{hit}(z_{t}^{k} | x_{t}, m)}+ \textcolor{#ff6900}{z_{short} \cdot p_{short}(z_{t}^{k} | x_{t}, m)}+\newline \textcolor{#a3000d}{z_{max} \cdot p_{max}(z_{t}^{k} | x_{t}, m)}+ \textcolor{#00a31d}{z_{rand} \cdot p_{rand}(z_{t}^{k} | x_{t}, m)} $$

Where the weighting parameters $z_{hit}, z_{short}, z_{max}$ and $z_{rand}$ sum to 1.

And here’s the same thing in graphical form 🎉

How to set these parameters? #

Okay we have our simple model, but how do we choose the weighting parameters? Or the parameters for the different distributions such as $\sigma$ and $\lambda$? These are all called intrinsic parameters, $\Theta$. We can estimate these using real-world (or simulated) data!

If we have a reference dataset $Z = {z_{i}}$ and robot poses $x_{i}$ and map $m$, the likelihood of this data is given by

$$ p(Z | X, m, \Theta) $$

We want to find $\Theta$ that maximises this likelihood, so unsurprisingly we can use a maximum likelihood (ML) estimator. I won’t write out the whole derivation here, but here is the general idea in pseudo-code.

while not converged:
    for each i, z in Z do  # i is our loop counter
        norm = 1/(p_hit(z) + p_short(z) + p_max(z) + p_rand(z))
        calculate real_z
        e_hit[i] = norm * p_hit(z)
        e_short[i] = norm * p_short(z)
        e_max[i] = norm * p_max(z)
        e_rand[i] = norm * p_rand(z)
    
    z_hit = norm(Z)^-1 * sum(e_hit)
    z_short = norm(Z)^-1 * sum(e_short)
    z_max = norm(Z)^-1 * sum(e_max)
    z_rand = norm(Z)^-1 * sum(e_rand)
    
    sigma_hit = calculate_sigma(e_hit, z, real_z)
    lambda_short = calculate_lambda(e_short, z)

return theta = {z_hit, z_short, z_max, z_rand, sigma_hit, lambda_short}

Where calculate_sigma produces

$$ \sigma = \sqrt{\frac{1}{\sum_{i}{e_{i, hit}}} \sum_{i}{e_{i, hit}(z_{i} - z_{i}^{*})}^{2}} $$

And calculate_lambda produces

$$ \lambda = \frac{\sum_{i}e_{i, short}}{\sum_{i}e_{i, short} z_{i}} $$

What are some drawbacks? #

Calculating the probability density for every measurement can get expensive fast. Especially if you’re scanning at a high rate.
If there are sensor dependencies, your robot may become overconfident in it’s estimates (every day is the same as today!).
Casting a ray is also expensive, so results can be saved into a gridlike cache. This way, you can do a table lookup instead of having to cast a ray when you don’t need to.
A small cluttered room would cause a lot of problems for this model, as it would make the probability estimates very discontinous given changes in $x_{t}$. This lacl of smoothness can cause local minima in the ML optimisation process or for the learner to miss the solution altogether.
Again, computationally expensive!

Where next? #

In this post, we had a look into the principles of beam models in robotic perception. These sensors are another way for robots to see the environment, especially if cameras are not useful e.g in the dark. This post also looked into the different types of measurement error that affect the values we collect, and some drawbacks of this approach.

I hope you enjoyed this post about beam models, it was fun to write! Most of the information in this post is from [1], so have a look in there for the full ML estimator derivation and plots. Extensions from model include likelihood fields, map-matching and feature based sensing.

References #

Bongard, J. (2008). Probabilistic Robotics. Sebastian Thrun, Wolfram Burgard, and Dieter Fox.(2005, MIT press.) 647 pages.
Long, Y., & Morris, D. Lidar Essential Beam Model for Accurate Width Estimation of Thin Poles.