MNYs Deep Learning Self Driving Car

For the code, please visit our GitHub page
Some Context
This is our bachelor project at the Chair for Autonomous Intelligent Systems (AIS) at the University of Freiburg.
The project was a self-driving car based on the paper “End-to-End Driving via Conditional Imitation Learning” by Codevilla et al.
We implemented the architecture given in the paper and attempted to create variations and improvements,
including a version which first performs semantic segmentation on the camera image.
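The core idea of the paper's architecture can be sketched as follows: a shared perception backbone produces an image feature vector, and a high-level command (e.g. follow lane, turn left, turn right) selects which output branch predicts the controls. This is a minimal illustrative sketch, not the paper's exact architecture; the dimensions and a linear head standing in for the branch MLPs are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

FEATURE_DIM = 512    # assumed size of the image feature vector
NUM_BRANCHES = 4     # one branch per high-level command
NUM_CONTROLS = 3     # steering, throttle, brake

# One weight matrix per command branch (a linear head stands in for the MLP).
branch_weights = rng.standard_normal((NUM_BRANCHES, NUM_CONTROLS, FEATURE_DIM))

def branched_forward(features: np.ndarray, command: int) -> np.ndarray:
    """Select the branch for `command` and predict the control vector."""
    return branch_weights[command] @ features

features = rng.standard_normal(FEATURE_DIM)       # stand-in for the CNN output
controls = branched_forward(features, command=2)  # e.g. "turn right"
print(controls.shape)  # (3,)
```

Only the selected branch contributes to the prediction, so each command gets its own specialized controller while the perception backbone is shared.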
Video Demo
Variations & Improvements
We made several variations on the original paper.
The first was to use only one camera instead of the proposed three,
a necessity due to hardware limitations of the tools provided.
We later managed to expand this to two cameras by rewriting some of the code written by the supervisors.
| Network | Description |
|---|---|
| standard | Uses only one forward facing camera to drive |
| segmented | Uses only the ground truth segmentation of a forward facing camera |
| seg and normal | Uses both the image and ground truth segmentation of a forward facing camera |
| last image too | Uses both the current and previous image from a forward facing camera |
| two cams | Uses a forward facing camera and a second camera angled towards the left |
| self segmentation | Uses the forward facing camera and segments the image with a network first |
Results
Standard
A single camera and otherwise the implementation from the paper.
It works quite well and is shown in the video.
These results are on the testing dataset (an unknown map) and with only a small amount of training data.
Segmented
This network receives the ground truth segmentation of a single camera.
As expected, its results are extremely good, and it handles even difficult situations with ease.
Obtaining such a perfect segmentation in reality is of course very hard, if not impossible.
The ground truth colors the lanes by position (e.g. the left lane one color, the right lane another), and the left lanes from both driving directions shared the same color.
Seg And Normal
This network received both the original camera image and its ground truth segmentation.
It did not perform better in most situations, as it carried over some of the errors of the image-only network,
such as mistaking multiple parking spaces in a row for another lane.
It performed better than the segmentation-only network on streets with two lanes per direction,
likely because it could now see the line separating the two directions.
In general it performed on the same level as the segmentation-only network.
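One straightforward way to feed both inputs to a single network is to stack the image and its segmentation along the channel axis. A minimal sketch, assuming a one-hot encoding of the segmentation; the resolution and class count here are illustrative, not the project's exact values:

```python
import numpy as np

H, W = 88, 200        # assumed input resolution
NUM_CLASSES = 3       # e.g. street, lane markings, other (assumed)

rgb = np.zeros((H, W, 3), dtype=np.float32)   # camera image
seg = np.zeros((H, W), dtype=np.int64)        # per-pixel class ids

# One-hot encode the segmentation so each class gets its own channel,
# then concatenate with the RGB channels to form the network input.
seg_onehot = np.eye(NUM_CLASSES, dtype=np.float32)[seg]  # (H, W, NUM_CLASSES)
net_input = np.concatenate([rgb, seg_onehot], axis=-1)   # (H, W, 3 + NUM_CLASSES)
print(net_input.shape)  # (88, 200, 6)
```

Only the first convolutional layer of the network needs to change to accept the extra channels.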
Last Image Too
This one was a bit tough and brought some problems:
by doubling the input space, the network learned something similar to a state machine.
It tended to overcorrect, still assuming it was in a bad position
whenever the previous picture looked the same as the current one.
This was probably only because we trained it with the normal dataset.
In the end we were constrained by the end of the project.
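Feeding the previous frame alongside the current one amounts to keeping a short frame history and stacking it channel-wise. A hypothetical sketch (the resolution and helper names are assumptions, not the project's code):

```python
from collections import deque

import numpy as np

H, W = 88, 200            # assumed input resolution
frames = deque(maxlen=2)  # keeps only the two most recent frames

def network_input(new_frame: np.ndarray) -> np.ndarray:
    """Append the newest frame and stack it with the previous one channel-wise."""
    frames.append(new_frame)
    previous = frames[0]  # equals the new frame until two frames exist
    return np.concatenate([previous, frames[-1]], axis=-1)  # (H, W, 6)

first = np.zeros((H, W, 3), dtype=np.float32)
second = np.ones((H, W, 3), dtype=np.float32)
network_input(first)
stacked = network_input(second)
print(stacked.shape)  # (88, 200, 6)
```

Because both frames enter as input channels, the network can compare them directly, which is exactly what allows it to learn the state-machine-like behavior described above.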
Two Cams
This variant adds a second image from the same position, facing a bit to the left,
though in hindsight we should have angled it to the right
to get the outer side of the lanes in better view.
It overcorrected to the left, as it had the same problem as the last-image variant.
With a bigger dataset, both would probably have just ignored the other image.
Self Segmentation
Here we implemented our own segmentation network to segment the street,
the lane markings, and everything unimportant.
Sadly, on the computers provided this could barely run in real time and
lagged terribly, even though it worked quite well, if slowly, on our own GPU.
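The two-stage setup can be sketched as a pipeline: the segmentation network runs first, and its per-pixel class map replaces the raw image as the driving network's input. Both networks below are random stand-ins, and the resolution and class count are assumptions; the sketch only shows the data flow (and why running two networks per frame is costly).

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, NUM_CLASSES = 88, 200, 3  # assumed resolution and class count

def segment(image: np.ndarray) -> np.ndarray:
    """Stand-in for the segmentation network: per-pixel logits -> argmax class map."""
    logits = rng.standard_normal((H, W, NUM_CLASSES))
    return logits.argmax(axis=-1)  # (H, W) class ids

def drive(seg_map: np.ndarray) -> np.ndarray:
    """Stand-in for the driving network: class map in, control vector out."""
    # Class frequencies as a toy feature; a real network would be a CNN.
    features = np.bincount(seg_map.ravel(), minlength=NUM_CLASSES) / seg_map.size
    return features[:3]  # pretend steering, throttle, brake

image = np.zeros((H, W, 3), dtype=np.float32)
controls = drive(segment(image))
print(controls.shape)  # (3,)
```

Since every frame must pass through both networks sequentially, the pipeline's latency is the sum of both forward passes, which is what made real-time driving hard on the provided machines.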
For more info on the structure of the neural networks, see our slides
Environment
The whole simulation took place on a test map, similar to the Audi Cup, inside Unreal Engine.