ALIIS in South Korea – self-driving cars and visual processing
Self-driving cars have been an “about-to-happen” technological revolution for over a decade. Recent setbacks in the drive to deliver them onto Western roads at Level 4 automation (with no human intervention strictly necessary) have seen them disappear from Western media, following the collapse of the Argo project.
But work on self-driving cars (rather than driver-assisted cars) continues in other parts of the world, with the laurels for Level 4 development now very likely to go to the APAC region.
Along the way to fully developed self-driving cars, though, there are many individual problems to solve – not least the problem of enhanced visibility and image processing. Without fear of too much hyperbole, this can be likened to building not just the eyes of the car, but the neural network that lets the computer brain interpret what it sees in real time, so that real driving decisions can be made and acted upon.
To take a look into the state of the art of these enhanced image processing questions, we spoke to Kevin Gordon of NexOptic – a company working to solve these problems for a self-driving car initiative that goes by the name of ALIIS (Alice) in South Korea.
So, what is ALIIS? What does it do and how does it work?
In AI terms, it’s part of the machine learning that needs to happen in a fully-functional self-driving car. Going further down into the taxonomy, ALIIS is a neural network. And what it does is work on every single pixel of an image, clarifying it in real time.
You have lots of different neural networks. Some might have image inputs and category outputs. Others might be doing decision-making. In the case of ALIIS, it’s at least relatively linear – images come in, images go out. But yeah, in terms of “what it is,” it’s a machine learning algorithm that we’ve trained to change images or video to either improve the quality or the resolution.
How do you go about training that algorithm, because there are real-time challenges and quality challenges involved, no? How do you train an algorithm to take input X and deliver output Improved-X?
To explain what we were looking to do: we were an optics company making very narrow-field lenses – super-high-powered lenses meant for handheld devices.
There are all sorts of issues with image stabilization, and even blurring caused simply by too long a shutter speed. So the question was: can we really ramp up the shutter speed to make it useful in new setups?
You can, but then the trade-off is usually more analog or digital gain, to make up for fewer photons being collected. So the next question was, can we use software to recover some of that fidelity when we’re operating in these higher noise environments?
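The trade-off described here can be sketched numerically. The following toy simulation (our illustration, not NexOptic code) uses a simple Poisson shot-noise model plus Gaussian read noise: a ten-times-shorter exposure compensated by ten-times-higher gain lands at the same brightness, but with a much worse signal-to-noise ratio – the "higher noise environment" the algorithm has to recover from.

```python
import numpy as np

rng = np.random.default_rng(0)

def capture(photon_rate, exposure, gain, read_noise=2.0, n=100_000):
    """Simulate one sensor pixel: Poisson photon arrivals scaled by gain,
    plus Gaussian read noise. Returns the mean signal level and its SNR."""
    photons = rng.poisson(photon_rate * exposure, size=n)
    signal = gain * photons + rng.normal(0.0, read_noise, size=n)
    return signal.mean(), signal.mean() / signal.std()

# Long exposure at unity gain vs. a 10x shorter exposure with 10x gain:
# both reach the same mean brightness, but the short exposure is noisier.
long_mean, long_snr = capture(photon_rate=50, exposure=1.0, gain=1.0)
short_mean, short_snr = capture(photon_rate=50, exposure=0.1, gain=10.0)
print(f"long:  mean={long_mean:.1f}  SNR={long_snr:.1f}")
print(f"short: mean={short_mean:.1f}  SNR={short_snr:.1f}")
```

Fewer photons means the Poisson shot noise dominates: the gain scales signal and noise together, so nothing is recovered by amplification alone – which is exactly the gap software denoising is meant to fill.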
The first variants of ALIIS were noise reduction algorithms – and only noise reduction algorithms. That dictates the direction of the training. The input side is obvious: a noisy image. The output side is the equivalent clean image. It’s supervised training, as plain and boring as you can imagine – straightforward noisy-to-clean pairs.
With enough samples, that process has evolved – we’ve gone through several iterations of how we do it. The very first prototyping was essentially going out with a camera and trying to capture those paired images (noisy and clean). But because it’s a pixel-level algorithm, any kind of pixel misalignment means your data is suddenly skewed. That brings all sorts of challenges – for a start, you’re limited to shooting only stationary scenes –
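The pixel-misalignment problem is easy to demonstrate. In this hypothetical sketch, a noisy capture that is perfectly aligned with its clean reference has a per-pixel error equal to the noise floor, while the very same capture shifted by a single pixel produces a far larger error wherever there is an edge – exactly the kind of skewed training target being described:

```python
import numpy as np

rng = np.random.default_rng(1)

# A synthetic "clean" image with strong block edges.
clean = np.kron(rng.random((16, 16)), np.ones((8, 8)))  # 128x128

# A perfectly aligned noisy capture, and the same capture shifted 1 pixel.
noisy = clean + rng.normal(0.0, 0.05, clean.shape)
shifted = np.roll(noisy, 1, axis=1)

mse_aligned = np.mean((noisy - clean) ** 2)   # ~ noise variance only
mse_shifted = np.mean((shifted - clean) ** 2)  # dominated by edge errors
print(f"aligned MSE: {mse_aligned:.4f}")
print(f"1-px shift MSE: {mse_shifted:.4f}")
```

A supervised per-pixel loss cannot tell the difference between real noise and this alignment error, so the network would be trained to "correct" structure that was never wrong.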
That sounds particularly useless in the automotive application.
Exactly. So it really is that limited. Then there was another novel approach to it, where we were capturing images on displays. There, we could take high quality reference material and shoot it in arbitrary lighting environments in a lab setting with optical benches. So now, the alignment’s great, but there ended up being quite a few other factors we had to account for in that kind of training.
Like data acquisition. The main issue there was that every camera we wanted to work with has different electronics, intrinsic characteristics and noise profiles. So if we wanted the highest-quality algorithm for a given sensor, we had to have that sensor in our lab and run a battery of tests – which might take a week – and then automate the controlling of this and that.
That turns out to be quite onerous, and even though it really sped up our process, it ended up not being scalable. And so with the current generation, where we’re working on noise reduction anyway, we do what’s called profiling.
We can have a third party capture images according to a schedule – maybe 100 or 200 of them. These are simple images; they don’t require any kind of fancy scenes or setups. From there, we can determine enough of the camera’s intrinsics, for which we now have a mathematical model, and then add in the lens stack.
Now we pretty much have a virtual camera that we can train on, which is interesting: if we do the forward modelling, then the neural network learns the inverse.
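A minimal sketch of this kind of profiling, assuming the common signal-dependent (Poisson-plus-Gaussian) noise model in which variance grows linearly with signal level – the specific model NexOptic uses isn’t stated. A handful of simple flat captures is enough to fit the two parameters, after which a "virtual camera" can synthesize realistic noisy training data from any clean image:

```python
import numpy as np

rng = np.random.default_rng(2)

# Ground-truth camera parameters we pretend not to know:
# per-electron gain a (shot-noise slope) and read-noise variance b.
a_true, b_true = 0.8, 4.0

# "Profiling" captures: flat frames at a handful of exposure levels.
levels = np.array([10.0, 50.0, 100.0, 200.0, 400.0])
means, variances = [], []
for level in levels:
    frame = a_true * rng.poisson(level / a_true, size=50_000) \
            + rng.normal(0.0, np.sqrt(b_true), size=50_000)
    means.append(frame.mean())
    variances.append(frame.var())

# Fit variance = a * mean + b  ->  the virtual-camera noise model.
a_fit, b_fit = np.polyfit(means, variances, 1)
print(f"fitted a={a_fit:.2f} (true {a_true}), b={b_fit:.1f} (true {b_true})")

def virtual_camera(clean):
    """Synthesize a noisy capture of a clean image using the fitted model."""
    shot = a_fit * rng.poisson(np.maximum(clean, 0.0) / a_fit)
    return shot + rng.normal(0.0, np.sqrt(max(b_fit, 0.0)), clean.shape)
```

Training pairs then come for free – run any clean image through the fitted forward model to get its noisy counterpart, and the denoising network learns the inverse mapping.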
And that model is applicable across the board?
Pretty much. Our approach right now can handle other sensor patterns, whether it’s infrared or other multispectral cameras; they all follow the same basic rules that we have in the model.
Picking a focus.
It’s interesting to hear you’re starting to get to pixel-level algorithms, because immediately, the brain lights up with all the other things that could be useful for. How have you gone from there to using it for self-driving cars in real time?
I was brought in as a consultant and started developing this algorithm. We had the proof of concept, and it seemed to be working well. The company was down at CES, showcasing this hardware product, and a lot of the companies we were in talks with said, “That’s an interesting one, but tell me more about this AI” – because at that time, AI was still on the cutting edge.
We met with automotive companies who were interested in the lens stack, and were looking for noise reduction, for low-light lenses. They said “If you can give us an extra 10, 15… 50 feet worth of reaction time, that’d be enormous.” So that’s how it became interesting to those companies. And of course, we follow the companies, so we started tailoring the solution to different use cases.
And how did South Korea come into the picture?
South Korea’s really forward-thinking in its policy and in the way the government works with business. It has a national strategy for various levels of autonomous driving, and it’s facilitating the dataset build-out – a genuinely enabling initiative for any company that wants to take part, because building those datasets is otherwise very difficult for a startup.
In Part 2 of this article, we dive deeper into the technology that is helping make ALIIS a cutting-edge way of adding to the sensor deck of a developing self-driving project.
30 November 2023