Finally I feel like I’m making some progress with my racer. The first few times around I felt like I was just trying things and then sending my model through training with crap results. My little guy would constantly veer off course from the track for no real reason… it was frustrating to say the least…
It also didn’t help that AWS DeepRacer appeared to have a brief outage last week. I was unable to train or evaluate any of my models last Tuesday and Wednesday.
After that little blip though, the race was back on!
At first I wanted to try and improve the speed of my model. After watching some of the top racers fly around the track, I figured I needed to go fast or go home, Ricky Bobby style. I started by offering a bit more variety in the rewards for different speeds:
    if speed < 1:
        reward = 0
    elif speed >= 1 and speed < 2:
        reward += 1
    elif speed >= 2:
        reward += speed
This didn’t really seem to get me anywhere though. I couldn’t tell if the model was actually any faster or not…
Part of the problem was the little guy kept driving off the course. I realized it didn’t really matter how fast I was going if I couldn’t keep the model on the track so…
With this version I tried to focus more on keeping the model on the track. I added a small function to detect when the wheels went off the track and reduce the reward if they did. I had meh results…
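For anyone curious what a wheel check like that might look like, here is a minimal sketch. It is not my exact function: the base reward of 1.0 and the `penalty_factor` of 0.5 are placeholder values I made up for illustration, and `wheels_penalty` is a hypothetical helper name. The `all_wheels_on_track` flag is one of the inputs DeepRacer passes to `reward_function(params)`.

```python
def wheels_penalty(params, reward, penalty_factor=0.5):
    """Scale the reward down when any wheel has left the track.

    penalty_factor is an illustrative tuning value, not a recommended one.
    """
    if not params["all_wheels_on_track"]:
        return reward * penalty_factor
    return reward


def reward_function(params):
    # Assumed base reward; a real function would build this up from other signals.
    reward = 1.0
    reward = wheels_penalty(params, reward)
    return float(reward)
```

The penalty is multiplicative so it shrinks whatever reward the rest of the function has built up, rather than wiping it out entirely.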
The green line is the reward my model is getting. Meaning he thinks he’s doing a good job. The blue and red lines represent how well he’s actually doing… fml…
This time around I upped some of the reward values for staying towards the center line of the track and increased the penalty for going off track. The results were… sort of positive… I still wasn’t keeping him on track though and he would randomly drive off the course completely…
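As a rough sketch of the kind of knobs I was turning here (the numbers are hypothetical, not the values I actually trained with), the center line bonus and the off-track penalty can be pulled out into constants so they are easy to bump between training runs:

```python
# Hypothetical tuning values, chosen only for illustration.
CENTER_BONUS = 2.0        # extra reward for hugging the center line
OFF_TRACK_PENALTY = 0.1   # multiplier applied when a wheel leaves the track
CENTER_FRACTION = 0.1     # "near center" means within 10% of the track width


def reward_function(params):
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]
    all_wheels_on_track = params["all_wheels_on_track"]

    reward = 1.0  # assumed base reward

    # Bonus for staying close to the center line.
    if distance_from_center <= CENTER_FRACTION * track_width:
        reward += CENTER_BONUS

    # Heavier penalty for putting a wheel off the track.
    if not all_wheels_on_track:
        reward *= OFF_TRACK_PENALTY

    return float(reward)
```

Raising `CENTER_BONUS` or lowering `OFF_TRACK_PENALTY` corresponds to the "upped the reward, increased the penalty" changes described above.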
This time I wanted to double down on keeping the model on track. It was still veering off course far too often to really make any progress or improvements. So along with the function to try and keep the model towards the center I added another function to reward the model based on his distance from the borders of the track.
    distance_from_border = 0.5 * track_width - distance_from_center
    if distance_from_border >= 0.05:
        reward *= 1.0
    else:
        reward = 1e-3
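To show how that border check could sit inside a complete reward function, here is a self-contained sketch. The starting reward of 1.0 and the `border_reward` helper name are my assumptions for the example. Note that multiplying by 1.0 leaves the reward unchanged, so the "safe" branch simply passes the existing reward through.

```python
def border_reward(reward, track_width, distance_from_center, margin=0.05):
    """Keep the reward if the car is at least `margin` meters from the border,
    otherwise collapse it to a near-zero value."""
    distance_from_border = 0.5 * track_width - distance_from_center
    if distance_from_border >= margin:
        return reward  # same effect as reward *= 1.0
    return 1e-3


def reward_function(params):
    reward = 1.0  # assumed starting value
    reward = border_reward(
        reward,
        params["track_width"],
        params["distance_from_center"],
    )
    return float(reward)
```

With a 1 meter wide track, a car 0.2 m from the center keeps its full reward, while a car 0.48 m from the center (2 cm from the edge) drops to 0.001.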
And we started to see a little more progress…
A few more tweaks to the reward values and…
Hell to the yes we’ve done it!
The biggest change in Mark-10 was that I really fleshed out the function for keeping distance from the border. It seems like the more reward “levels” you give the model, the better it can progressively learn what you want it to do.
What I mean by this is that instead of, say, giving the model a full 1-point reward for staying on the center of the track and a 0 for everything else, you give the model a gradual range of rewards to earn. This sort of coaxes the model towards the behavior that you want.
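Here is a small sketch of the difference, using distance from the center line as the example. The band widths and reward values are made up for illustration and are not the ones from my Mark-10 function.

```python
def binary_center_reward(track_width, distance_from_center):
    """All-or-nothing: full reward near the center, zero everywhere else."""
    return 1.0 if distance_from_center <= 0.1 * track_width else 0.0


def graduated_center_reward(track_width, distance_from_center):
    """Several reward levels that shrink as the car drifts from the center."""
    # (fraction of track width, reward) pairs, closest band first.
    # Hypothetical values chosen only to show the idea.
    levels = [(0.1, 1.0), (0.2, 0.7), (0.3, 0.4), (0.4, 0.1)]
    for fraction, reward in levels:
        if distance_from_center <= fraction * track_width:
            return reward
    return 1e-3  # likely off track or about to be


if __name__ == "__main__":
    for distance in (0.05, 0.15, 0.25, 0.35, 0.45):
        print(
            distance,
            binary_center_reward(1.0, distance),
            graduated_center_reward(1.0, distance),
        )
```

With the binary version, a car 0.15 m off center and a car 0.45 m off center both score 0, so the model gets no hint that one is closer to the goal. The graduated version scores them 0.7 and 0.001, which gives it something to climb towards.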
I’m currently ranked at 525 of 1135 racers for the October Qualifier and you can check out my latest qualifying video here: