Baraja's selling point was AFAIK that they used an integrated swept laser source (they typically have lower coherence, but you can work around that in DSP).
So many great lines:
- "We tried to find the smoothest thing in the frame but the smoothest thing turned out to be the sky"
- "We had it adapt to rough terrain by having me drive the car and it learned from my driving. Granted, it drives like a German now."
- "Nobody tells you their sensor error rate so we had to drive the car around and have the car learn the error probabilities"
- "Nobody needs to tell you this but Stanford students are amazing"
- "A lot of the people who entered are what I would call: 'car nuts' "
IEC 60825 is applicable to safety of laser products emitting laser radiation in the wavelength range 180 nm to 1 mm. So it does not just cover wavelengths that present a retinal hazard.
IEC 60825 is not a regulation, it is an international standard that governments may use to inform regulations. For example, in the US, laser products are regulated by the FDA through 21 CFR 1040.10 and 21 CFR 1040.11. These regulations are well aligned with the US national standard ANSI Z136.1 and the international standard IEC 60825-1.
Emissions from lasers that present a retinal hazard are in general not additive. This is because beams coming from different directions focus onto different parts of the retina and any effect they have on the retina is localized. Thus the claim that 20 cars at an intersection all impinging on a viewer presents a 20-fold retina hazard is ill-founded. Emissions from beams that are primarily absorbed by the cornea, before they are focussed by the eye, do present an additive hazard to the cornea.
Regarding operation under faulted conditions, in particular a scanning failure, if the system's scan-failure shutoff latency is too long to prevent hazardous emission then it is a faulty product that has not been correctly classified. IEC 60825-1 explicitly requires evaluation of emissions under every reasonably foreseeable single point of failure. Generally speaking, a scanning failure is reasonably foreseeable.
While some lasers used for example in industrial processing are extremely powerful, the sort of lasers used in automotive lidars are quite modest in power. Low power lasers like traditional supermarket barcode scanners have been in common use since the 1970s and millions, probably billions, of people have been exposed to them without the dire consequences some posters seem to predict from automotive lidars.
But humans have no lidar technology. We rely almost solely on sight for driving (and a tiny bit on sound, I guess). Hence, in principle, it should be possible for cars to do so too. My question is this: at what point, if at all, will self-driving get good enough to make automotive lidar redundant? Or will lidar always make self-driving that last 1% better than cameras alone?
This seems like it will be a growing problem with increased autonomy on the roads
Lidar is flawed at the foundational level. There's a reason no living creature on earth evolved it.
Interference between LIDARs can be a problem, mostly with the continuous-wave emitters. Pulsed emitters are unlikely to collide in time, especially if you put some random jitter in the pulse timing to prevent it. The radar people figured this out decades ago.
There are two wavelength bands of interest:
a) 905 nm/940 nm (roof and bumpers): 70–100 µJ per pulse max, regulated by IEC 60825 since this wavelength is focused onto the retina
b) 1550 nm systems (the Laser Bear Honeycomb): 8–12 mJ per pulse allowed (100x more photons, since this wavelength stops at the cornea)
The failure mode of these LIDARs can be akin to a weapon. A stuck mirror or frozen phased array turns into a continuous-wave pencil beam.
A 1550 nm LIDAR leaking 1W continuous will raise corneal temperature >5C in 100ms. The threshold for cataract creation is only 4C rise in temp.
A 905 nm Class 1 system stuck on one pixel gives 10 mW continuous on the retina, capable of creating a lesion in 250 ms or less. 20 cars at an intersection = 20 overlapping scanners, meaning even if each meets single-device Class 1, linear addition could give your retina a 20x dose, enough to push into Class 3B territory. The current regs (IEC 60825-1:2014) assume single-source exposure. There is no standard for multi-source, multi-axis, moving-platform overlap.
Additionally, no LIDAR manufacturer publishes beam-failure shutoff latency. Most are >50 ms, which can be long enough for permanent injury.
Recently got a Waymo for the first time to take my kids and I from one hotel to another in Phoenix.
- Car pulls up
- I walk up to the trunk as I have a suitcase
- Out of habit, I go to open the trunk by pressing the button under the "handle" (didn't realize you have to unlock the car via the app first)
- My hand moves past the rear trunk laser, which is spinning, and it "whacks" my hand.
Not a big deal but seems like an interesting design choice to place a motorized spinning device right next to where people are going to be reaching to open the trunk.
https://tech.yahoo.com/transportation/articles/volvo-ends-re...
Why can't you place them further away from each other using an additional optical system (i.e. a mirror) and adjusting for the additional distance in software?
And proprioception. If I'm driving in snowy conditions, I'm definitely paying attention to whether the wheels are slipping, the car is sliding, the steering wheel suddenly feels slack, etc. combined with memorized knowledge of the road.
However, that's ... not great. It requires a lot of active engagement from the driver and gets tiring fast.
Self-driving can be way better than this.
GPS with dead reckoning tells the car exactly where it is relative to memorized maps of the road--it won't miss a curve in whiteout conditions because it doesn't need to see the curve--that's a really big deal and gets you huge improvements over humans. Radar/lidar will detect a stopped car in front of you long before your sight will. And a computer system won't get tired after driving in stressful conditions for a half hour, etc.
By 2018, if you listen to certain circa-2015 full self-driving technologists.
I think a future where cameras are more eye like would be a big leap forward especially in bad weather - give them proper eyelids, refined tears, rotating ability, actual lenses to refocus at different distances, etc.
For pulsed emitters, indeed adding random jitter in the timing would avoid the problem of multiple lidars being synced up and firing at the same time. For some SPAD sensors, it's common to emit a train of multiple pulses to make a single measurement. Adding random jitter between them is a known and useful trick to mitigate interference. But in fact it isn't super accurate to say that interference is a problem for continuous-wave emitters either. Coherent FMCW lidar are typically quite robust against interference by, say, using randomized chirp patterns.
Over the last 2 days I drove from Greenville, SC to Raleigh, NC (4-5 hours) and back with self driving the entire way. Traffic, Charlotte, navigating parking lots to pull into a super charger. The only place I took over was the conference center parking lot for the Secure Carolina's Conference.
It drives at least as well as me, or better, in almost all cases...and I'm a pretty confident driver.
I say all that to say this...I can't imagine lidar improving on what I'm already seeing that much. Diminishing returns would be the biggest concern from a standpoint of cost justification. The fact that this type of technology exists in a vehicle as affordable as the Model 3 is mind blowing.
And it is certain that in India they use sound for echolocation.
As far as Tesla, time will tell. I ride their robotaxis daily and see them performing better than Waymo, but it's obviously meaningless until we see accident stats after they remove safety monitors.
I'm not aware of the inner workings of automotive lidar, but I can't imagine building one that didn't work that way.
[1] https://lidarmag.com/2011/05/21/velodyne-donates-lidar-and-r...
> a stuck mirror
This is one of the advantages of using an array of low power lasers rather than steering a single high power laser. The array physically doesn't have a failure mode where the power gets concentrated in a single direction. Anyway, theoretically, you would hope that class 1 eye-safe lidars should be eye safe even at point blank range, meaning that even if the beam gets stuck pointing into your eye, it would still be more or less safe.
> 20 cars at an intersection = 20 overlapping scanners, meaning even if each meets single-device Class 1, linear addition could offer your retina a 20x dose enough to push into Class 3B territory.
In the article, I point out a small nuance: If you have many lidars around, the beams from each 905 nm lidar will be focused to a different spot on your retina, and you are no worse off than if there was a single lidar. But if there are many 1550 nm lidars around, their beams will have a cumulative effect at heating up your cornea, potentially exceeding the safety threshold.
Also, if a lidar is eye-safe at point blank range, when you have multiple cars tens of meters away, laser beam divergence already starts to reduce the intensity, not to mention that when the lidars are scanning properly, the probability of all of them pointing in the same spot is almost impossible.
By the way, the Waymo Laser Bear Honeycomb is the bumper lidar (940 nm iirc) and not the big 1550 nm unit that was on the Chrysler Pacificas. The newer Jaguar I-Pace cars don't have the 1550 nm lidar at all but have a much bigger and higher performance spinning lidar.
One would hope there would be more regulation around this.
Edit: There's basically three approaches to this problem that I'm aware of. Number one is to push the cross-talk below the noise floor -- your suggestion helps with this. Number two is to do noise cancellation by measuring your cross-talk and deleting it from the signal. Number three is to make the cross-talk signal distinct from a real reflection (e.g. by modulating the pulses so that there's low correlation between an in-flight pulse and a being-fired pulse). In practice, all three work nicely together; getting the cross-talk noise below saturation allows cancellation to leave the signal in place, and reduced correlation means that the imperfections of the cancellation still get cleaned up later in the pipeline.
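Here's a toy numpy sketch of number three (all numbers illustrative, not from any real device): a matched filter tuned to your own jittered pulse train responds strongly to your echo but barely at all to another emitter's train.

```python
import numpy as np

rng = np.random.default_rng(0)
n_bins = 2000                       # 1 ns bins -> a 2 us listening window

def pulse_train(jitter_rng):
    """A train of 8 pulses with random inter-pulse jitter."""
    sig = np.zeros(n_bins)
    t = 100
    for _ in range(8):
        sig[t] = 1.0
        t += 150 + int(jitter_rng.integers(0, 50))   # random gap
    return sig

mine = pulse_train(rng)
theirs = pulse_train(np.random.default_rng(1))   # an interfering lidar
echo = np.roll(mine, 333)                        # my true return, delayed 333 ns

# Matched filter: correlate the received signal against my own train.
for name, rx in [("own echo", echo), ("interferer", theirs)]:
    corr = np.correlate(rx, mine, mode="full")
    print(name, "peak:", corr.max())   # own echo peaks at 8; interferer stays low
```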
The MKBHD YouTube video where he shows that his phone camera has burned-out pixels from filming lidar-equipped cars is revealing (if I recall correctly, he demonstrates it live). I don't want that pointed at my eye.
I love lidar from an engineering / capability perspective. But I grew up with the "don't look in a laser!" warnings everywhere even on super low power units... and it's weird that those have somehow gone away. :P
To wit: plenty of other Tesla owners in a similar position as you probably similarly praised the system, until it slammed them into a wall, car, or other obstacle, killing them.
Agreed, but there are still really good human drivers, who still operate on sight alone. It's more about the upper bound, not the human average, that can be achieved with only sight.
I've seen this claimed a lot but never have gotten a definitive answer.
Is this like "overall better but hard to pinpoint" or "this maneuver is smoother than Waymo" or something in between?
Would love to hear experiences with them since they're so limited currently.
Just look at the comments on the article you posted, with sock puppet accounts being actively hostile towards anyone asking questions.
Likewise with the big spinning lidar on top, which was covered in the older Chrysler Pacificas but externally spinning in the newer Jaguar I-Paces.
[1] https://commons.wikimedia.org/wiki/File:Waymo_self-driving_c...
Ouster uses (or at least used to use, not sure if they still do) 840 nm. Much higher quantum efficiency for standard silicon receivers, without having to play games with stressed silicon and stuff; but also much better focusing by the retina, so lower power permitted.
Detect the mirror being stuck and shut the beam off. Easy.
Hint: how bad would it be if the MCU in your gas heating boiler latched up and wouldn't shut the burner off? How is this mitigated?
Ok, but now the software can cause the failure. Not sure if that's much of a relief.
I also know how the tech industry makes decisions about safety and responsibility (hahaha...). And I have seen some of the recent changes that automakers have somehow slipped past safety regulators. So it seems foolish to trust any of them on this safety issue.
Do we all have to move to rural areas, if we want to be able to go outside without wearing laser safety goggles?
To date most Class 1 lasers have also been hidden/enclosed I think (and there is Class 1M, which is only safe when not viewed through magnifying optics), so I'm not convinced that the limits for long-term daily exposure have been properly studied.
Until I see 3rd party studies otherwise, I plan to treat vehicle lidar no different than laser pointers and avoid looking directly at them. If/when cars become common enough that this is too hard to do, maybe I'll purchase NIR blocking glasses (though most ones I found have an ugly green tint, I wonder if it's possible to make the frequency cutoff sharp enough that it doesn't filter out visible reds).
Social media is full of little clips of lidar systems burning out camera pixels, and I'm sure big proponents of the tech have paid people off over eye injuries at this point. There've probably been a ton of injuries that just got written off as random environmental hazards, "must have looked at the sun" etc.
It's nuts that this stuff gets deployed.
Autopilot kills loads of people but my understanding is that autopilot is the dumb driver assist while FSD is the one that tries to solve general purpose driving.
Has FSD really only killed 2 people? FSD has driven 6 billion miles and the human driver death rate is 10 per billion so it has killed 2 where "as good as human" would mean 60. That seems really good tbh.
EDIT: and it looks like "deactivate before collision" doesn't work as a cheat, NHTSA requires reporting if it was active at any time within 30 seconds of the crash: https://www.nhtsa.gov/laws-regulations/standing-general-orde...
The second and third place companies in terms of the number of deployed robotaxis are both subsidiaries of large Chinese Internet platforms, and both of them are also leaders in providing geospatial data and navigation in China. Neither operates camera-only vehicles.
Crowd: https://www.youtube.com/watch?v=3DWz1TD-VZg
Negotiation: https://www.youtube.com/shorts/NxloAweI6nU
I don't know the answer to any of these but it seems like the camera based approach has some advantages to it as well. Doesn't seem that cut and dry.
> Additionally, no LIDAR manufacturer publishes beam-failure shutoff latency. Most are >50ms, which can be long enough for permanent injury
https://en.wikipedia.org/wiki/Beamforming
It is possible for the array to produce a concentrated beam into one direction. The software determines when that happens and in what direction.
So yes, a mirror trip reset is probably a good start. But would I trust someone's vision to this alone?
I wrote a whole paragraph, then realised that "relative speeds" is the sum of opposing speeds, ie. two cars going in the opposite direction at 50km/h each make up a relative speed of 100km/h.
The hurdle to full autonomous driving was basically jumped by Tesla this year.
Tesla doesn't have driverless operations anywhere, and their Austin fleet consists of <30 vehicles with full-time safety drivers that have a far worse safety record than Waymo vehicles. It's not nothing, but it's a long way from being a complete system (let alone the obviously superior one).
My biggest gripe with FSD is typically that it's too safe in a few situations where I would have gone a little sooner at an intersection.
EDIT: https://www.tesla.com/fsd/safety
Miles Driven Before Major Collision
- 699,000 - US Avg
- 972,000 - Teslas Driven Manually (no active safety features)
- 2.3 mil - Teslas Driven Manually (active safety features)
- 5.1 mil - Teslas Driven with FSD (supervised)
Miles Driven Before Minor Collision
- 229,000 - US Avg
- 308,000 - Teslas Driven Manually (no active safety features)
- 741,000 - Teslas Driven Manually (active safety features)
- 1.5 mil - Teslas Driven with FSD (supervised)
Nope, nothing as complicated as that. You're close with the watchdog timer.
The solenoid is driven by a charge pump, which is capacitively coupled to the output of the controller. The controller toggles the gas grant output on and off a couple of times a second, and it doesn't matter if it sticks high or low - if there are no pulses, the charge pump will "go flat" after about a second and drop the solenoid out.
Do the same thing. If a sensor at the edge of the LIDAR's scan misses a scan, kill the beam.
Same way we used to do for electron beam scanning.
One could go further, and have an integrated system where the headset shows video feed from cameras around the car. You could almost get a 3rd person view of your own car like in video games.
I realize it's not easily possible to prove the negative, but when you're exposing the public the burden must be on the company to be transparent and rigorous. And from what I see it's difficult to even find certification documents for the lidars used in commercial self-driving vehicles, possibly because everything is proprietary and trade secret.
Thomas Midgley even organised an event for reporters where he poured pure tetraethyl lead on his hands and inhaled its fumes for around a minute to show how safe it was. "I could do this every day without getting any health problems," he claimed. Once the reporters left, he needed a lie-down to recover.
Tesla are on their final stretch now and can manufacture the equivalent of the entire Waymo fleet in robotaxis in a week.
>> Do the same thing. If a sensor at the edge of the LIDAR's scan misses a scan, kill the beam.
Sounds like a great plan, but I question the "about a second" timing; the GP post calculates that "about a second" is between 4X and 10X the time required to cause damage. So, how fast do these things scan/cycle across their field of view? Could this be solved by speeding up the cycle, or would that overly compromise the image? Change the scan pattern, or insert more check-points in the pattern?
I guess we will see soon.
There are tons of people suing Tesla over FSD killing people and every Robotaxi needs a "safety driver" ready to take over at all times and even with this they drive much worse than Waymo. You need more accurate data.
Waymo is operating at a much larger scale across a huge range of conditions with hardware that's generations behind their latest and still performing better.
Teslas are generalists; they can self-drive in places Waymos simply cannot and never will be able to.
One of these can scale, the other cannot.
[0] https://www.thorlabs.com/newgrouppage9.cfm?objectgroup_id=76...
Here I'll provide a comprehensive overview of automotive lidar technology. Lidar is used in autonomous vehicles and robotics because it directly measures the 3D structure of the environment.
FIGURE 1 Waymo Jaguar I-Pace with several lidars.
FIGURE 2 A Chrysler Pacifica Hybrid with 8 Ouster lidars.
A lidar is a sensor which operates by bouncing light off surrounding surfaces. Lidars typically quantify:
- the distance (range) to each surface,
- the direction (bearing) of each measurement,
- the reflectivity of the surface (intensity), and
- the ambient light arriving from that direction.
FIGURE 3 Ambient, intensity, and range channels of a really old Ouster OS1-64.
In general, we are most interested in distance and bearing. Surface reflectivity is also valuable, as it allows detection of road lines in the automotive case.
By measuring distance in many directions, an autonomous vehicle can perceive its environment. Each measurement corresponds to a discrete 3D point in space. Through a decade of steady research, engineers designed algorithms capable of leveraging this 3D point cloud to unlock spatial understanding. Obstacle avoidance and precise positioning are just two direct results of this technology.
Distance and bearing measurements can be converted into 3D Cartesian points. For example, given range $r$, azimuth $\theta$, and elevation $\phi$, the 3D point is:

$$x = r\cos\phi\cos\theta, \quad y = r\cos\phi\sin\theta, \quad z = r\sin\phi$$
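As a minimal sketch, assuming the azimuth/elevation convention above:

```python
import numpy as np

def to_cartesian(r, azimuth, elevation):
    """Convert range (m) and bearing (rad) to a 3D point, per the formula above."""
    x = r * np.cos(elevation) * np.cos(azimuth)
    y = r * np.cos(elevation) * np.sin(azimuth)
    z = r * np.sin(elevation)
    return np.array([x, y, z])

print(to_cartesian(10.0, np.deg2rad(30), np.deg2rad(5)))
```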
In contrast, a camera only measures bearing and ambient light intensity. Each pixel of a photo is a measurement of how much light there is in that particular direction. But generally, a camera has much higher bearing resolution than a lidar.
FIGURE 4 A point cloud accumulated from an Ouster OS1-64 lidar.
Measuring distance is also known as ranging. Basically, it just measures how far away something is. There are in general two ways of doing this:
- measuring how long light takes to make the round trip (time of flight), or
- measuring the angle at which the reflection comes back (triangulation, covered later under parallax lidar).
For measuring the time, there are again two ways:
- directly timing short pulses (direct detection), or
- modulating a continuous wave and inferring the delay from its phase or frequency (amplitude- and frequency-modulated lidar, below).
Direct detection pulsed lidar fires one or more laser pulses. Then, we simply measure the time $\Delta t$ until we see the reflection of the pulse. The range is

$$r = \frac{c\,\Delta t}{2}$$

where $c$ is the speed of light ($3 \times 10^8$ m/s). The division by 2 is because the range is half of the round trip distance.
Imagine if we have a stopwatch that measures in, say, a nanosecond resolution. If we measure 1000 nanoseconds, then it means the round trip distance was 300 m, which means that the range is 150 m.
This involves measuring the time series data of how much light is seen at any point in time. Since electronics typically run at 1 GHz or so, the time series is discretized on the order of 1 ns, which corresponds to a range of 15 cm. To further improve the ranging accuracy, an interpolation filter is a standard technique in signal processing. Typically ranging accuracy at the centimeter level is possible.
After getting the time series data, the peaks in the series are found, and these correspond to the range.
FIGURE 5 A simplified time series plot of number of photons vs time.
Usually, it is better to have stronger, shorter pulses. Diode lasers can produce pulses on the order of a couple of nanoseconds, and fiber lasers can produce even shorter pulses with much higher peak energy.
In practice, the laser pulse has some finite duration and shape (rather than being an infinitely short impulse function), so the peak is found in the cross correlation of the outgoing pulse’s shape with the return data, rather than the raw time series data itself. It is possible to send a randomly shaped pulse (or sequence of pulses), and cross correlate the return data against that. This provides much greater resistance against noise, interference, and crosstalk, and is known as a matched filter.
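Here's a minimal numpy sketch of that pipeline: correlate the return against the known pulse shape, then refine the peak with parabolic interpolation. The pulse shape, noise level, and 1 ns sampling are illustrative assumptions; real lidar DSP is far more involved.

```python
import numpy as np

C = 3e8                    # speed of light, m/s
DT = 1e-9                  # 1 ns sample period -> 15 cm of range per bin

pulse = np.array([0.2, 1.0, 0.6, 0.2])        # known outgoing pulse shape
rx = np.zeros(1024)
rx[500:504] += 0.05 * pulse                   # weak echo starting at bin 500
rx += np.random.default_rng(0).normal(0.0, 0.004, rx.size)   # detector noise

# Matched filter: cross-correlate the return with the pulse shape.
corr = np.correlate(rx, pulse, mode="valid")
k = int(np.argmax(corr))

# Parabolic interpolation around the peak for sub-sample timing.
y0, y1, y2 = corr[k - 1], corr[k], corr[k + 1]
frac = 0.5 * (y0 - y2) / (y0 - 2.0 * y1 + y2)

print("range = %.3f m" % (C * (k + frac) * DT / 2))   # ~75 m (bin 500 * 0.15 m)
```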
We should note, however, that the shape of the return pulse could be distorted or “smeared out”. This can be due to, for example, hitting a very slanted surface. One strategy to overcome this is to try correlating it with a bunch of different pulse shapes. This technique may be called template matching, dictionary matching, matched filter bank, or model-based detection.
In order to get a time series of the amount of light per unit time at a super high rate, we need a really fast sensor that can operate at 1 GHz. Usually one of these two types of sensors is used:
- avalanche photodiodes (APDs), or
- single-photon avalanche diodes (SPADs).
Both are described below.
Other types of sensors such as CCD sensors are not fast enough for this application.
A photodiode is a diode that also exhibits the photoelectric effect.
A diode is like a one-way valve for electricity. Just like a one way valve for water, if you try to force things sufficiently in the opposite direction, it will break down, resulting in a huge gush of water. Likewise, if you apply a strong voltage in the reverse direction, it’s called a reverse bias, and a sufficiently strong voltage will cause a sudden spike in electrical current. This is called avalanche breakdown.
Meanwhile, some metals produce an electric current when light shines on them, in an effect known as the photoelectric effect.
Avalanche photodiodes have a reverse bias, meaning that a voltage is applied in the opposite direction of the one-way valve. If the reverse voltage exceeds a certain amount known as the breakdown voltage, it stops acting like a diode. Suddenly, a large current can flow through the device.
Linear-mode APDs have a reverse bias slightly below the breakdown voltage. Here, the current is linearly related to the voltage, but the gain is very high, so that even changing a small voltage results in a large change in the current. Hence, it is a very sensitive way of measuring light intensity.
Geiger- mode avalanche photodiodes (GMAPDs) or single-photon avalanche diodes (SPADs) have such a strong reverse bias that even getting hit by a single photon can make them break down, resulting in a large current spike. The output of a SPAD can be directly connected to a voltage discriminator so that the spike becomes a digital signal from logic 0 to 1.
FIGURE 6 I-V diagram of avalanche photodiodes
In the above I-V diagram, we see the relationship between the voltage (V) and the current (I). The breakdown voltage is labelled. As you can see, where the linear-mode APD operates, the current is linearly proportional to the voltage. The Geiger-mode APD operates where the slope is effectively infinitely steep.
Note on terminology: Typically the word avalanche photodiode (APD) refers to linear-mode APDs. Meanwhile, GMAPDs and SPADs operate in the same way but the term SPAD often refers to silicon devices sensitive to near infrared (850 nm to 940 nm) and GMAPD often refers to InGaAs devices sensitive to longer wavelengths (1064 nm to 1550 nm).
SPADs have the following advantages:
- They are sensitive enough to detect single photons.
- They are small and cheap, and can be fabricated as large solid state 2D arrays on standard silicon.
- Their output is effectively digital, so no ADC is needed.
Meanwhile, APDs have these advantages:
- Their output is a continuous analog signal that can be digitized with many bits, giving much higher dynamic range.
- They are less easily saturated by strong returns or ambient light, avoiding artifacts like range walk (described below).
If the return signal from a pulse is very strong, a SPAD array can be saturated at the very beginning of the pulse. If the pulse length is long, ranging may be biased when measuring the range of retroreflective materials. This is also known as range walk.
SPADs are so sensitive that they can be triggered by single photons, but this also makes them sensitive to ambient illumination. Therefore saturation is a concern.
In contrast, the continuous signal from an APD can be digitized with many bits.
To prevent SPADs from being drowned out by ambient light, the probability of detection of any single SPAD must be kept very low. Some techniques include:
- a narrow optical bandpass filter that admits only light near the laser's wavelength,
- combining several SPADs into macropixels (next section), and
- firing many low-energy shots and aggregating the results (multi-shot ranging, below).
FIGURE 7 The Sony IMX479 SPAD sensor is physically a 105×1,568 pixel array, with a total of approximately 164,000 pixels, but it combines many pixels into macropixels, so the final output is only 520 macropixels. This allows it to have amazing dynamic range and produce this beautiful image. Note that the lower image is the raw ambient image output from the lidar rather than a separate photo taken by a camera.
Instead of a single SPAD per pixel, several SPADs can be combined into a single “macropixel”. This trade-off results in lower spatial resolution, but the benefit is that it mitigates most of the drawbacks of SPADs.
Even with a macropixel, ranging with SPADs can be noisy as there may only be as many photons measured as there are SPADs in the macropixel. To increase signal strength, the lidar can fire many shots and aggregate the time series data from each shot. This is known as multi-shot ranging.
As an additional bonus, making multiple low-energy shots is somewhat safer than a single high-energy shot as the peak laser energy is less.
The tradeoff is that it takes a longer time to make a measurement, during which you could suffer from motion blur.
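A toy sketch of multi-shot ranging (illustrative ambient and detection rates): any single shot is dominated by ambient detections, but histogramming a couple hundred shots makes the true return bin stand out.

```python
import numpy as np

rng = np.random.default_rng(0)
N_BINS, N_SHOTS = 1000, 200        # 1 ns bins; 200 shots aggregated
TRUE_BIN = 400                     # true return sits near bin 400

hist = np.zeros(N_BINS, dtype=int)
for _ in range(N_SHOTS):
    # Ambient light: a few uniformly random detections per shot.
    for t in rng.integers(0, N_BINS, size=3):
        hist[t] += 1
    # Signal: the return photon is only detected some of the time.
    if rng.random() < 0.3:
        hist[TRUE_BIN + int(rng.integers(-1, 2))] += 1

print("estimated bin:", int(np.argmax(hist)))   # ~400
```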
Silicon photomultipliers are a group of SPADs whose outputs are combined into a single analog signal. This has some advantages:
- The summed output is roughly proportional to the number of photons detected, so it acts like a very high gain analog detector with more dynamic range than a single SPAD.
- There is only a single output channel to read out, rather than one per SPAD.
However, an ADC is still required to digitize the signal.
Instead of firing pulses, an amplitude modulated lidar continually modulates the laser amplitude at some radio frequency, say, 1 GHz. In other words, it is just a fast blinking light that turns on and off rapidly.
Meanwhile, there are two detectors that turn on and off at the same rate but are out of phase. That is, when detector 1 is on, detector 2 is off, and vice versa.
The range can be estimated by checking the ratio of the light falling on the two detectors, but only modulo the modulation wavelength. For example, at 1 GHz the modulation wavelength is 30 cm, giving an unambiguous range of 15 cm (half, because of the round trip).
To resolve the range absolutely, the sensor changes the modulation frequency slightly, say, to 1.05 GHz, giving a range estimate modulo a different wavelength. The unknown multiples can then be found as a least common multiple problem.
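A small sketch of that unwrapping, using the 1 GHz / 1.05 GHz pair from the example above (the brute-force search over wrap counts is just for illustration):

```python
C = 3e8
f1, f2 = 1.00e9, 1.05e9
u1, u2 = C / (2 * f1), C / (2 * f2)      # unambiguous ranges: 0.150 m, ~0.143 m

true_range = 2.37                         # ground truth for the demo
r1, r2 = true_range % u1, true_range % u2   # what each frequency measures

# The pair (r1, r2) is unique out to lcm(u1, u2) = 3 m. Try each wrap
# count for frequency 1 and keep the candidate most consistent with r2.
best = min((abs((r1 + n * u1) % u2 - r2), r1 + n * u1) for n in range(20))
print("recovered range: %.3f m" % best[1])   # ~2.370
```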
The advantage of this type of amplitude-modulated lidar is that it is very cheap. There is no need for high-speed timing electronics to count photons. Instead, a simple oscillator is sufficient to make the lights and detectors blink at 1 GHz.
Since the detectors just need to measure intensity rather than timing information, they do not need to be very fast, and basic CMOS or CCD sensors will suffice.
This type of lidar is used in RGBD sensors such as the Kinect V2. However, the ranging accuracy is much poorer than needed for automotive purposes, so this type of lidar is not typically used for automotive.
A frequency modulated lidar has a laser that can change in frequency rapidly.
Now, the laser beam goes through a beam splitter, and part of it is sent out, where it hits something, and bounces back. Then, you can combine the part that didn’t go out with the part that bounced back.
When you combine two waves of similar but slightly different frequency, you’ll end up with something called a beat. When the waves line up, they double in strength, and when they are out of phase, they cancel each other out. You can then use a photodiode to measure the time series of the combined wave in order to determine the beat frequency, which in turn tells you the range.
The key point is that the beat frequency is proportional to the difference in frequency between the two waves, so you can measure it relatively easily with a photodiode.
Frequency modulated lidar is known as frequency modulated continuous wave (FMCW) since the laser beam is always on (a continuous wave) that doesn’t turn off. The principle of using the beat to determine the range is known as optical heterodyne detection. Here, “heterodyne” means comparing two slightly different frequencies (as opposed to “homodyne”, where you have the same frequency).
With FMCW lidar, you can also measure the speed of things by measuring the Doppler shift.
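A sketch of the textbook triangular-chirp arithmetic (the chirp bandwidth, duration, and sign convention here are illustrative assumptions):

```python
C = 3e8
WAVELEN = 1.55e-6        # 1550 nm laser
B = 1.0e9                # chirp bandwidth: 1 GHz
T = 10e-6                # chirp duration: 10 us

def range_and_velocity(beat_up, beat_down):
    """Recover range and radial speed from up- and down-chirp beats."""
    f_range = (beat_up + beat_down) / 2       # range-induced component
    f_doppler = (beat_down - beat_up) / 2     # Doppler-induced component
    r = f_range * C * T / (2 * B)
    v = f_doppler * WAVELEN / 2               # positive = closing, by convention
    return r, v

# A target at 50 m closing at 20 m/s would produce roughly these beats:
f_r = 2 * 50 * B / (C * T)       # ~33.3 MHz from range
f_d = 2 * 20 / WAVELEN           # ~25.8 MHz from Doppler
print(range_and_velocity(f_r - f_d, f_r + f_d))   # -> (50.0, 20.0)
```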
The main tradeoff is that you’ll need an expensive fiber laser that can do frequency modulation with highly linear chirps, increasing the overall cost.
A parallax lidar works by triangulation, that is, similar to coincidence rangefinding.
This does not use any timing information at all. A linear photodetector is placed physically offset from the laser. The detector measures the incident angle of the reflected light and obtains the range by triangulation.
FIGURE 9 Parallax rangefinding, figure from “Low cost laser distance sensor” by K. Konolige et al.
This is rarely or never used in automotive applications but is instead found in robotic vacuum cleaners and other low-speed, low-cost applications. A famous example is the “Low cost laser distance sensor” by Kurt Konolige et al. Many robotic vacuum cleaner sensors are based on this.
A structured light depth camera, also known as active stereo, is a special case of parallax rangefinding. Instead of a single laser beam, it projects a bunch of different dots at once, and instead of a 1D line scan sensor, it has a regular 2D sensor. But the depth measurement is again based on triangulation. Structured light depth cameras are used in the early versions of the Kinect as well as many Intel Realsense cameras.
Parallax rangefinding measures disparity, which is inversely proportional to range, so the uncertainty in range is quite high and grows quadratically with range. As such, it is less suitable for advanced robotics and autonomous cars.
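To see why the error grows quadratically: for a baseline $b$, focal length $f$, and measured disparity $d$, triangulation gives $r = fb/d$. Differentiating, $|\partial r/\partial d| = fb/d^2 = r^2/(fb)$, so a fixed disparity uncertainty $\sigma_d$ (set by the pixel pitch) produces a range uncertainty $\sigma_r \approx \frac{r^2}{fb}\,\sigma_d$: doubling the range quadruples the error. This is the standard stereo/triangulation error relation.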
As mentioned in our introduction, lidar sensors combine distance readings with bearing to produce 3D points. Now that we’ve covered distance, we are ready to discuss how to figure out the directions (bearing) of things.
Lidars can either:
- discern the bearing of the light they emit (imaged transmit, or tx),
- discern the bearing of the light they receive (imaged receive, or rx),
- or both.
Discerning bearing is also known as “imaging”. People may describe a system as having both imaged rx and tx, for example.
Generally, having imaged rx and tx is vastly better, since you are only pointing your laser beam where you’re looking, so you get more range and efficiency, and meanwhile the imaged receiver rejects off-angle background light.
There are two main approaches for discerning bearing:
- arrays: many fixed elements, each pointing in a different direction, and
- beam steering: pointing a single element (or a few) in different directions over time.
FIGURE 10 Animation showing arrays vs steering
As with the methods for measuring distance, each method has advantages and disadvantages.
The advantages of arrays are that they don't have any moving parts, that each array element can be a lot cheaper (potentially leading to lower overall cost), and that they can produce a much greater quantity of points. The advantages of beam steering are that it works with high quality but expensive laser sources such as fiber lasers, and that the scan pattern may be configurable. Being able to use high quality lasers also unlocks ranging modalities unavailable to array-based lidars, such as FMCW.
Note that if you rely on steering very few (even one) lasers, the number of points per second is limited by the speed of light. It takes light a microsecond to travel 300 m round trip, meaning that a single beam lidar is limited to about a million points per second at a range of 150 m. Meanwhile, array-based lidars can easily pump out several million points per second.
The simplest way to determine direction is to just have an array of elements pointed in various directions.
Basically, you’ll need cheap and small array elements in order to have an array.
| Laser type | Performance | Cost | Array? |
|---|---|---|---|
| VCSEL | Low | Low | Solid state 2D arrays of hundreds of lasers are possible |
| Edge-emitting diodes | Mid | Mid | Discrete 1D arrays of dozens of lasers are typical |
| Fiber | High | High | No, typically used as single laser + beam steering |
Table 1 Comparison of lasers
| Sensor type | Size | Cost | Array? |
|---|---|---|---|
| SPAD | Small | Low | Solid state 2D arrays of even millions of SPADs are possible |
| APD | Mid | Mid | Discrete 1D arrays of dozens of APDs are typical |
Table 2 Comparison of receivers
With discrete arrays, you have discrete components like edge-emitting laser diodes and avalanche photodiodes that are pointed in different directions. Some early lidars, like the Velodyne VLP 16, literally have 16 circuit boards, each with one laser diode on them, and another 16 circuit boards, each with one APD on them. Then, these 32 circuit boards are glued into place.
FIGURE 11 The inside of the Velodyne VLP 16. Source: xtech
FIGURE 12 Detail of the array of 16 PCBs in the Velodyne VLP 16. Source: xtech
The reason for doing that is that, due to the simple design of the lens, it was necessary to arrange the lasers and detectors along a curved arc. Interestingly, a Google (now Waymo) patent US8836922B1 describes using a flexible substrate to achieve the curve.
Solid state arrays put lasers or detectors on a single chip. The obvious benefit is vastly simpler manufacturing and consistency. High performance edge-emitting laser diodes shoot lasers to the sides so you can’t just put a bunch of them in an array on a chip, so you’ll have to make do with lower power VCSELs.
FIGURE 14 The Ouster L3 chip, a SPAD array. Each square is a macropixel with many SPADs. Source: Ouster blog post
FIGURE 15 Instead of 32 circuit boards, there are just two in this Ouster lidar: one containing the chip full of lasers, and one containing the chip full of detectors. Source: How Ouster Digital Lidar Works
Since the laser array or detector array is now flat, the optical design will be somewhat more complex. You’ll need the lens to be image space telecentric since your flat array of lasers all produces parallel beams.
FIGURE 16 Ouster lidars have relatively complex multi-element lenses compared to single-element lenses on early Velodyne lidars. Source: How Ouster Digital Lidar Works
For lasers, only VCSELs are compatible with this method. As for detectors, SPADs are also vastly more amenable to solid state arrays, although APD arrays are also available (but with fewer elements). This is because, as discussed earlier, SPADs are compatible with typical chipmaking technologies and they output digital signals rather than analog ones, so you can fabricate them on a single chip, whereas APDs would typically require discrete components.
With large arrays, a lidar could sequentially fire small parts of the array instead of all of it at once. This is called electronic scanning. In effect, it is similar to scanning, except there are fixed elements already pointed in different directions rather than the same element being made to point in different directions. Electronic scanning has the advantage of less pixel crosstalk/blooming (more on this later), as well as being able to output more power per beam without running into thermal or safety limits.
Perhaps the most straightforward way to do beam steering is to just spin the whole lidar, which gives you 1D angular discernment. The first advantage is that this gives you 360 degree field of view. This also has the advantage of being highly compatible with arrays, so you can have a vertical array while spinning horizontally. Spinning lidars have basically only one moving part.
FIGURE 17 Size comparison between some spinning lidars.
An encoder is used to measure the angle of the turret.
The challenges of spinning are that:
- power and data must be transferred to the rotating part (e.g. via slip rings or inductive coupling),
- heat from the spinning electronics is harder to dissipate, and
- the bearing is a continuously wearing part.
Using a spinning polygonal mirror is one of the oldest and most reliable ways to scan a laser beam, which is again a 1D scanning method. This is used in, for example, laser printers.
FIGURE 18 Animation of a laser beam being reflected by a rotating hexagonal mirror.
As with spinning lidars, an encoder is used to measure the angle of the polygonal mirror.
Compared to spinning the whole turret, this has the main drawback of having a much narrower field of view (about 120 degrees is typical, as opposed to 360 degrees). However, it has the advantage of having a lighter moving part without having to deal with power transmission and heat dissipation and stuff.
This is a flat mirror that oscillates in angle to steer the beam, which can be either 1D or 2D.
FIGURE 19 Animation of a laser beam being reflected by an oscillating mirror.
Typically, a lightweight mirror is connected to a galvanometer in what’s called a mirror galvanometer (galvo). A galvanometer is one of the most basic ways to measure electrical current: it consists of a spring, a magnet, and a solenoid. When a current passes through the solenoid, it creates a magnetic field, which causes a torque to be applied as it tries to align itself with the magnet. The spring resists this force, so the amount it ends up turning is dependent on the current.
Nowadays, fast galvos are incredibly good and are used in all sorts of applications, like laser light shows, engraving, and so on.
Compared to spinning mirrors, this is somewhat less reliable, since reciprocating motion is typically less reliable than constant rotation.
Unlike spinning mirrors, you can have a single mirror that’s actuated in two axes (a 2D galvo) that allows you to steer your beam in both directions with a single mirror.
A MEMS (micro-electromechanical system) mirror is simply a mirror that is really small, typically an oscillating mirror. Because it is so small, it is typically considered “solid state” even if it is physically a moving part. Like macroscopic oscillating mirrors, MEMS scanners may be either 1D or 2D.
The primary advantage of MEMS is low cost and relatively better reliability. After all, the rate at which your moving part wears out is strongly dependent on the mass and moments of inertia of that moving part, so keeping it as light as possible makes it more resilient.
There are, however, a couple of drawbacks:
- The mirror is tiny, so the optical aperture is small, limiting how much light can be sent and collected, and hence the range.
- The achievable scan angle is limited, so the field of view tends to be narrow.
A phased array has many array elements whose phase is slightly offset. As the contributions from each element interfere, a beam is formed where they interfere constructively, and everywhere else, destructive interference causes it to cancel out.
Phased arrays are common in radar. However, the fundamental physical problem of phased arrays is that the element size must be close to the size of the wavelength, and the wavelength of light (about a micron) is way smaller than the wavelength of radio waves (ranging from millimeters to many meters). If your array spacing is too big, your beam would have very poor collimation and tons of side lobes.
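To see the spacing problem concretely, here's a small numpy sketch (illustrative element count and pitch) that computes the far-field array factor of a uniform linear array; with a pitch of several wavelengths, strong grating lobes appear alongside the steered beam:

```python
import numpy as np

lam = 1.0                     # wavelength (normalized units)
n, pitch = 64, 4.0 * lam      # pitch >> lambda/2, so expect grating lobes
steer = np.deg2rad(10)        # intended steering direction

k = 2 * np.pi / lam
pos = np.arange(n) * pitch
angles = np.deg2rad(np.linspace(-30, 30, 2001))

# Sum each element's contribution with the steering phase applied.
af = np.array([abs(np.exp(1j * k * pos * (np.sin(a) - np.sin(steer))).sum())
               for a in angles]) / n

print("strong lobes near (deg):", np.round(np.rad2deg(angles[af > 0.9]), 1))
# Besides the intended 10-degree beam, grating lobes show up near -4 and 25.
```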
You can use phased arrays for both the transmitter and receiver. For the receiver, you would have an array of optical antennae which are tiny nanophotonic detectors that can each measure the phase and amplitude.
So far, due to the physical challenges with phased arrays, there have been no commercial successes. The Quanergy S3 and an Israeli startup called Oryx Vision were two well-known entrants that attempted optical antennae.
This uses a frequency-swept laser with a fixed prism. Prisms have dispersion, which means that the index of refraction changes with wavelength (hence turning sunlight into a rainbow), so by changing the wavelength of the laser, the angle is changed. This allows it to scan in 1D. Baraja uses a MEMS mirror for the other axis.
FIGURE 21 Baraja lidar.
This requires using a high quality fiber laser or tunable diodes that can do large frequency sweeps, which can be costly.
A prism is a triangular piece of glass that can bend light. Risley prisms are a pair of two prisms that can rotate along the optical axis. When the prisms are lined up, they both bend light the same way, and the beam gets bent a lot. When they are opposite of each other, they cancel each other out, and the beam goes through straight without bending.
FIGURE 22 Animation of how Risley prisms work.
FIGURE 23 Simplified diagram showing how Risley prisms work.
Basically, when you have two prisms, one at rotation angle $\theta_1$ and the other at rotation angle $\theta_2$, the direction of the beam is proportional to:

$$(\cos\theta_1 + \cos\theta_2,\ \sin\theta_1 + \sin\theta_2)$$
The Livox lidars are notable for using Risley prisms. You can make other scan patterns by varying the speed of the prisms, and by putting an array of multiple lasers (e.g. the Livox Horizon’s 6 lasers) instead of one laser.
The advantage of Risley prisms is that, like polygonal mirrors, it's cheap and robust to have things spinning at a constant speed. However, the disadvantages are a very narrow field of view and a weird scanning pattern. For some applications, the scan pattern can be an advantage, for example surveying applications where the lidar can be stationary for long periods of time, gradually covering a dense area.
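A minimal sketch of the resulting rosette pattern using the formula above (the absolute spin rate is made up; the speed ratio of about -0.743 comes from the article's interactive demo):

```python
import numpy as np

t = np.linspace(0.0, 1.0, 50000)
w1 = 2 * np.pi * 100            # prism 1 spin rate (illustrative)
w2 = -0.743 * w1                # prism 2 counter-rotates at the demo's ratio
theta1, theta2 = w1 * t, w2 * t

x = np.cos(theta1) + np.cos(theta2)
y = np.sin(theta1) + np.sin(theta2)
# (x, y) traces the flower-like pattern characteristic of Livox lidars;
# view it with e.g. matplotlib: plt.plot(x, y, linewidth=0.2)
```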
Many lidars combine two 1D methods, e.g.:
- a spinning turret (horizontal) with a vertical 1D array (Velodyne, Ouster),
- a polygonal mirror on one axis with an oscillating mirror on the other (e.g. Luminar), or
- wavelength-based prism steering on one axis with a MEMS mirror on the other (Baraja).
For lidars, two choices of wavelength are popular:
- near infrared (850 nm to 940 nm), and
- 1550 nm.
The main advantage of near IR is that silicon is sensitive in that region, allowing much cheaper, more sensitive silicon detectors, as well as cheap laser sources. In contrast, with 1550 nm, you would need InGaAs semiconductors for your detectors, which are less sensitive and very expensive.
Meanwhile, the main advantage of 1550 nm is that eye safety regulations allow devices to output vastly more power at 1550 nm than in the near IR regime. As a result, 1550 nm lidars tend to have longer range in general.
Eye safety limits allow you to output hundreds of times more power in the steady state scenario at 1550 nm compared to, say, 905 nm. The reason is that the human eyeball focuses near IR light to small spots on the retina, so intense light may damage the retina. On the other hand, 1550 nm light is not focused and is attenuated by water, but at high enough intensities, it will damage the cornea instead.
In practice, manufacturers carefully tune the power of the lasers to be just below the eye safety threshold for both 1550 nm lidars and near IR lidars. That is to say, 1550 nm lidars do in fact output up to 1,000,000 times more pulse energy than 905 nm ones.
Paradoxically, 1550 nm lidars may be more dangerous overall, for the following reasons:
- Since they output vastly more power, a malfunction (such as a stuck beam) delivers far more energy wherever it lands.
- Corneal absorption is additive across sources: many 1550 nm lidars all heat the same cornea, whereas 905 nm beams arriving from different directions focus onto different spots on the retina.
- Their high pulse energies can damage camera sensors, whose optics focus the beam much like an eye does.
Eye safety aside, 1550 nm is also somewhat more attenuated by both water and water vapor, so these lidars are likely to perform worse in poor weather. In fog, Mie scattering by the water droplets may also impact 1550 nm lidar more, as fog droplets are about 1.5 microns, and scattering is strongest when the droplet size is comparable to the wavelength. That said, 1550 nm lidars do have better range to begin with, thanks to outputting a lot more power, so even with attenuation, they are still competitive in rainy situations.
There are basically three commonly used types of lasers:
| Laser type | Typical wavelength(s) | Beam quality (M²) | Coherence / FMCW-ready | Power per element | Cost | Array? |
|---|---|---|---|---|---|---|
| VCSEL | 850–940 nm | Very good, circular | Low–mid linewidth | Low (mW-tens mW) | Low | Excellent: monolithic 2D arrays (10²–10⁵ emitters), fine pitch, easy eye-safety |
| Edge-emitting diodes | 905 nm, 1350–1550 nm | Good (often elliptical) | Mid–high | Mid (100 mW–W class with bars) | Mid | Good: 1D bars/arrays (dozens–hundreds) |
| Fiber/ECDL | 1550 nm | Excellent | High (kHz–100 kHz LW) best for FMCW | High (W class via fiber amps) | High | Poor as dense arrays; usually single source + split/steer |
Table 3 Summary of laser types
Vertical cavity surface-emitting lasers (VCSELs) are very cheap and you can make a bunch of them on a chip in a chip-scale solid state array. They are called “vertical cavity” because the beam comes out perpendicular to the chip. You make them by depositing several layers of material on the chip. The main drawback is that they are low peak power.
Edge-emitting laser diodes are a mature technology and are cheap enough to be 1D arrays.
Fiber lasers produce high quality light that is highly coherent. But they are quite expensive so you can probably just afford one or two per lidar. Some lidars split one laser between many lidar heads, as in the case of Baraja’s lidar. Not only are fiber lasers more coherent, they can output millions of times greater power than edge-emitting diode lasers and VCSELs as well as much shorter pulses. Having shorter pulses is very advantageous for pulsed lidar as it improves the range resolution. Some fiber lasers can also vary the wavelength in highly linear chirps, allowing use in FMCW lidars.
The development of these lasers is highly driven by the telecommunications industry where they are used in fiber optics, so the lidar industry sort of profits from that for free.
Most spinning + array lidars need a calibrated list of angles, one per beam. Some manufacturers, like Ouster, provide a JSON metadata file containing the elevation and azimuth angles of each of the 128 beams, which is calibrated per lidar. Some manufacturers simply give a nominal set of beam angles for a lidar model that is assumed to be the same for each individual lidar, but in practice, each lidar varies slightly due to manufacturing tolerances. Early Velodynes had very bad beam angles as each of the many circuit boards was individually glued in place and manually aligned.
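A sketch of applying such a per-beam calibration. The JSON field names below are modeled loosely on Ouster-style metadata but are assumptions here; check your sensor's documentation:

```python
import json
import numpy as np

# Hypothetical metadata file; field names are assumptions, not a spec.
meta = json.load(open("lidar_metadata.json"))
alt = np.deg2rad(meta["beam_altitude_angles"])   # per-beam elevation
azi = np.deg2rad(meta["beam_azimuth_angles"])    # per-beam azimuth offset

def project(beam_id, encoder_angle, r):
    """Turn (beam index, turret encoder angle in rad, range in m) into XYZ."""
    theta = encoder_angle + azi[beam_id]
    phi = alt[beam_id]
    return r * np.array([np.cos(phi) * np.cos(theta),
                         np.cos(phi) * np.sin(theta),
                         np.sin(phi)])
```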
Here are some ways lidar measurements could have bad beam angles:
- The manufacturer only provides nominal beam angles for the model, which differ from the true angles of an individual unit.
- The per-unit calibration is poor, for example because components were glued and aligned by hand.
- The angles drift over time due to shock, vibration, or thermal cycling.
This would typically manifest as the ground curving slightly, or the trajectory of the robot curving up or down even when it is expected to be flat.
The well-known KITTI dataset is known to have bad beam angles, and some publications have to manually calibrate them in order to achieve good results. For example, in IMLS-SLAM by J. E. Deschaud:
The drift we get on the KITTI benchmark is not as good as the results we obtained with the Velodyne HDL32. This is due to three facts. First, we found a distortion of the scan point clouds because of a bad intrinsic calibration (we did a calibration of the intrinsic vertical angle of all laser beams of 0.22 degrees using the training data). Second, we found big errors in GPS data (used as ground truth) with, for example, more than 5 m in the beginning of sequence 8.
Lidars sometimes have different range offsets for each laser. This can happen when using discrete arrays where each laser-detector pair are separate components that need to be individually calibrated.
FIGURE 26 Two views of a flat wall in 2011_09_26/2011_09_26_drive_0084_extract/velodyne_points/data/0000000035.txt from the KITTI dataset. Due to uncalibrated range offsets for some of the lasers of the Velodyne HDL-64E used, points from certain beams are offset by several centimeters.
Blooming affects many lidars. Think of pointing a camera at the sun. There would be huge lens flare and brightness all around the sun. In effect, the light from the sun is “smeared” out onto neighboring pixels. Likewise, when there’s a strong lidar return, there could be spurious returns next to the shiny object.
With array lidars, neighboring detectors sometimes pick up on the return meant for a different detector. This is called crosstalk. However, even single beam lidars can suffer from blooming just due to the fact that the beam has some divergence and that the optics are imperfect.
FIGURE 27 An early prototype of the now-defunct Argo lidar illuminates the scene a whole column at a time, making it susceptible to blooming in the form of vertical columns. Source: Argo AI on YouTube
FIGURE 28 Spurious bloom returns around a retroreflector for an early Ouster OS1 prototype from 2022. Note that later firmware upgrades mitigated the issue. Source: Ouster marketing data.
This effect typically can’t be easily calibrated away, and is usually handled in lidar firmware.
This typically affects SPAD lidars like early Ouster lidars and the now-defunct Argo (formerly Princeton Lightwave) lidar. The reason is that when the return is very strong, all the SPADs get saturated at the beginning of the pulse.
FIGURE 29 For specular reflections, there’s a spike in the point cloud from this early prototype of the Argo lidar. Source: Argo AI on YouTube
FIGURE 30 Subtle range bias on highly reflective painted stripes of a pedestrian crossing for an early Ouster OS1 prototype from 2022. Source: Ouster marketing data.
Typically, a pulse is a few nanoseconds long, which corresponds to up to a few meters in physical length of the light pulse. Even a slight saturation effect can cause the peak of the time series to be biased significantly. Very advanced signal processing techniques are needed to compensate for this.
Hysteresis in an encoder would typically manifest as some kind of lag, e.g. if it’s rotating clockwise, it could output measurements with slightly different offset than when it’s at the same angle but rotating counter clockwise. Some lidars, such as the Luminar Iris, use encoders for an oscillating beam scanner for vertical beam scanning. It also has a mode where part of the point cloud is an “up-scan” and the other part is a “down-scan”, and the two are superimposed. Often, the point cloud of the up- and down-scans do not align well, even when the vehicle is stationary, suggesting that there may be hysteresis in the encoder.
This may manifest as double-layer point clouds in the ground.
The encoder used in many spinning lidars is a circular ring with a bunch of ticks engraved on it at regular intervals.
However, it is possible that the encoder is physically offset to the side, because the ring is often just glued in place by humans. This results in a sinusoidal error.
FIGURE 31 Diagram of offset encoder and plot of measured angle vs true angle.
This effect can cause a straight corridor to appear consistently curved to one side.
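A minimal model of the effect (the eccentricity value is illustrative): to first order, a ring whose center is displaced by a fraction $e$ of its radius is read with a sinusoidal angle error of amplitude $e$ radians.

```python
import numpy as np

e = 0.001                                  # ring eccentricity / ring radius
true_angle = np.linspace(0.0, 2 * np.pi, 1000)
measured = true_angle + e * np.sin(true_angle)   # first-order error model

print("peak angle error: %.4f deg" % np.rad2deg(e))   # ~0.0573 deg
```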
Some lidars are packaged in such a way that there are two or more separate lidars in a box. For example, the Livox Mid 100 comprises three Mid-40s arranged side by side.
Sometimes, physically jostling the lidar can cause the multiple separate lidars to become misaligned. It would then be necessary to treat them as separate lidars and calibrate their orientations accurately.
Should it be capitalized as “LiDAR” instead of “lidar”?
No! We should use the lowercase because it's a commonly used word, just like radar. When radar was some sort of highly exotic military technology, it made sense to use all caps for the acronym “radio detection and ranging”, but by now it is so common that we should use lowercase. Many other words started out capitalized when new and exotic, but became lowercase once commonplace: radar, sonar, laser, and scuba all began as acronyms.
Now that most phones and some cars are equipped with lidar, it’s a good time to just use lowercase. Perhaps the main barrier to doing so is Apple’s autocorrect.
Is a lidar solid state just because it's a box that doesn't visibly spin?
No, not necessarily. Whether or not something is solid state is based on whether it has macroscopic moving parts in it, not based on its shape.
Livox lidars are often mistakenly assumed to be solid state, but they are in fact mechanically scanning with some using Risley prisms and some using mirrors.
FIGURE 34 They may be rectangular, but they aren’t solid state.
Likewise, Luminar lidars are often assumed to be solid state, but they are not. The Luminar Hydra uses galvos and the Luminar Iris uses polygonal mirrors.
Solid state lidars are often perceived to be more durable and reliable. Lidar manufacturers have taken note of this market bias in customers, and marketed accordingly. For example, the Velodyne HDL-64 was marketed as solid state (even though it is externally spinning) in their 2016 press release announcing the VLP-32A.
Based on his experience during this challenge, Hall recognized the limitations of stereovision and developed the HDL-64 Solid-State Hybrid LiDAR sensor.
As justification, however, one might consider that it has an array of 64 lasers to distinguish vertical bearing, so perhaps it could be called 50% solid state as mechanical scanning is used only for the horizontal direction! In contrast, Luminar and Livox lidars use mechanical scanning for both directions, despite being a single non-spinning box.
FIGURE 35 Velodyne booth at CES 2020. In the back corner you can see David Hall.
FIGURE 36 Velodyne formerly marketed these as “Solid-State Hybrid”.