Tag Archives: DeepRacer

AWS Deep Racer: Part III


Finally I feel like I’m making some progress with my racer. The first few times around I felt like I was just trying things and then sending my model through training with crap results. My little guy would constantly veer off course from the track for no real reason… it was frustrating to say the least…

It also didn’t help that AWS DeepRacer appeared to have a brief outage last week. I was unable to train or evaluate any of my models last Tuesday and Wednesday.

After that little blip though, the race was back on!


At first I wanted to try and improve the speed of my model. After watching some of the top racers fly around the track I figured I needed to go fast or go home Ricky Bobby style. I started with offering just a bit more of a variety in regards to rewards with various speeds:

    if speed < 1:
        reward = 0
    elif speed >= 1 and speed < 2:
        reward += 1
    elif speed >= 2:
        reward += speed

This didn’t really seem to get me anywhere though. I couldn’t tell if it was actually any faster or not….

Part of the problem was the little guy kept driving off the course. I realized it didn’t really matter how fast I was going if I couldn’t keep the model on the track so…



With this version I tried to focus more on keeping the model on the track. I added a small little function to detect when the wheels went off the track and if they did, to reduce the reward. I had meh results…

The green line is the reward my model is getting. Meaning he thinks he’s doing a good job. The blue and red lines represent how well he’s actually doing… fml…


This time around I upped some of the reward values for staying towards the center line of the track and increased the penalty for going off track. The results were… sort of positive… I still wasn’t keeping him on track though and he would randomly drive off the course completely…


This time I wanted to double down on keeping the model on track. It was still veering off course far too often to really make any progress or improvements. So along with the function to try and keep the model towards the center I added another function to reward the model based on his distance from the borders of the track.

    distance_from_border = 0.5 * track_width - distance_from_center
    if distance_from_border >= 0.05:
        reward *= 1.0
        reward = 1e-3

And we started to see a little more progress…



A few more tweaks on reward values and….


Hell to the yes we’ve done it!

The biggest change in Mark-10 was that I really fleshed out the function for keeping distance from the border. It seems like the more “levels” you provide the model to be rewarded for, the better it can progressively learn what you want it to do.

What I mean by this is instead of say… giving the model a full 1 point reward for staying on the center of the track and then a 0 for everything else, you give the model a gradual range of options to earn rewards on. This sort of coaxes the model towards the behavior that you want.

I’m currently ranked at 525 of 1135 racers for the October Qualifier and you can check out my latest qualifying video here:

AWS Deep Racer: Part II

The Race is on! Sort of…

So I’ve been diving in deeper to AWS Deep Racer, pun not intended… and I’ve managed to work through a handful of tutorials. I’ve built 4 different models to race with so far.

My first model, Mark-1, currently holds the best record on lap time, of my 4 models, for the October race sitting at position 737 of 791 total racers. The October 2020 track looks like this:


The Mark-1 model used a very basic reward function that simply trained the model to attempt to follow the yellow dashed line on the center track. It’s a default reward function that reinforces the model attempting to stay towards the center of the track. It doesn’t get you far… but it does get the job done. Total race time: 08:29.214



With the Mark-2 model I tried to just build off of the first iteration. I added some additional logic to keep the model from trying to zigzag too much on the track as it worked towards finding the best center line. I also made a classic rookie mistake and implemented a bug into the reward function where instead of incrementing the reward value when the model stayed on track, I was just applying a default… this model didn’t even rank.


With the Mark-3 model, still containing the defect from Mark-2, I played around with the Hyperparameters of the reinforcement model. As a part of the training system you can do more than just modify the reward function that trains the model. You can also adjust and modify the batch size, number of epochs, learning rate, and a handful of other Hyperparameters for training the model. Since I still had the bug in my reward function from the previous model, this model didn’t rank either.


On to Mark-4… I caught my defect.. ugh… and fixed that up. I also decided that at this point I was ready to move beyond the tutorials and really start adjusting the models. So the first thing I did was take a look at what the competition was doing… Here’s a look at the number 1 model currently sitting at the top of the October qualifier…


Well shit… I’ve got a ways to go to compete with this model. I mean this model is flying around the track. Mine looks like a fucking drunk wall-e….

So I’ve got some work to do but the current AWS Deep Racer app is down. As soon as it comes back I’m refocusing on speed first. I need to Lightning McQueen this bad boy if I’m going to stand a chance of even ranking in the top 100 for the October Qualifier.

Check back with me next week and we’ll see if I can get this little guy going!

AWS Deep Racer: Part I

So after my little experiment with Predicting Stock Prices with Python Machine Learning I came across the AWS Deep Racer League. A machine learning racing league built around the concept of a little single camera car that uses Reinforcement Learning to navigate a race track. I’ve decide to sign up and Compete in upcoming the October race and share what I learn along the way.

photo cred: aws

Let me give a little bit of background on how the Deep Racer works and the various pieces. There are two ways to train your little race car, virtually and physically. If you visit the Deep Racer site you can purchase a 1/18th scale race car that you can then build physical tracks for, train, and race on. There are also various models you can purchase with a couple of different input sensors so you can do things like teach the car to avoid obstacles or actually race the track against other little cars.

This little guy has 2 cameras for depth detection and a back-mounted sensor for detecting cars beside/behind it

The good news is you don’t actually have to buy one of these models to race or even compete. You can do all of the training and racing virtually via the AWS console. You also get enough resources to start building your first model and training it for free!

Now let’s get into how this actually works. What AWS has done is built a system that does most of the heavy complex machine learning for you. They offer you the servers and compute power to run the simulations and even give you video feedback on how your model is navigating the track. It becomes incredibly simple to get setup and running and you can follow this guide to get started.


When you get setup you’ll be asked to build a reward function. A reward function contains the programming logic you’ll use to tell your little race car if it’s doing it’s job correctly. The model starts out pretty dumb. It basically will do nothing or drive randomly, forward, backwards, zig-zags…. until you give it some incentive to follow the track correctly.

This is where the reward function comes in. In the function you provide the metrics for how the model should operate and you reward it when it does the job correctly. For example, you might reward the model for following the center line of the race track.

On each iteration of the function you’ll be handed some arguments on the model’s current status. Then you’ll do some fancy work to say, “okay, if the car is dead center reward it with 1 point but if it’s too far to the left or the right only give it a half point and then if it’s off the track give it zero.”

The model tests and evaluates where the best value is based on the reward function

The model will then run the track going around, trying out different strategies and using that reward function to try and earn a higher reward. On each lap it will start to learn that staying in the middle of the track offers the highest, most valuable reward.

At first this seems pretty easy… until you start to realize a couple of things. The model truly does test out all types of various options which means it may very well go in reverse and still think it’s doing a good job because you’ve only rewarded it for staying on the center line. The model won’t even take into account speed in this example because you’re not offering a reward for it.


As you start to compete you’ll find that other racers are getting even more complex like finding ways to identify how to train the model to take the inside lane of a race track or how to block an upcoming model that is attempting to pass it.

On the surface, the concepts and tools are extremely easy to pick up and get started with, however the competition and depth of this race are incredible. I’m looking forward to building some complex models over the next couple of weeks and sharing my results. If you’re interested in a side hobby check out the AWS Deep Racer League and maybe we can race against each other!