Using MATLAB to look at F1 data

I’ve always been interested in Formula One, and yesterday I came across an API for accessing the data from an F1 race. The experimental API gives access to historical motor racing data, which includes up-to-date F1 data, including the race on Sunday.

The data is available to download both in full as a database and through an API web service. I used the latter along with some simple MATLAB code to display the data. I make use of the MATLAB webread function to pull the data from the web service; it automatically converts the JSON response into a MATLAB structure, from which I read the race name, the total number of laps, and the date of the race.

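% Request options: allow up to 30 seconds for the web service to respond
options = weboptions('Timeout',30);
tempdata = webread('',options);
totalLaps = str2double(tempdata.MRData.RaceTable.Races.Results{1}.laps);
raceTrackName = tempdata.MRData.RaceTable.Races.raceName;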
raceDate = datenum(;
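
To make the structure of the response easier to see, here is a minimal sketch in Python rather than MATLAB. The sample JSON is made up to mirror only the fields used above; the values are placeholders, and the `date` field name is my assumption since it does not appear in the code above. `race_summary` is a hypothetical helper, not part of the API.

```python
import json

# Made-up sample shaped like the fields read above; values are placeholders,
# and the "date" key is an assumption rather than confirmed API output.
SAMPLE = """
{"MRData": {"RaceTable": {"Races": [
  {"raceName": "Example Grand Prix",
   "date": "2016-01-01",
   "Results": [{"laps": "71"}]}
]}}}
"""


def race_summary(payload):
    """Return (race name, total laps, date string) from a decoded response."""
    race = payload["MRData"]["RaceTable"]["Races"][0]
    total_laps = int(race["Results"][0]["laps"])  # laps arrive as a string
    return race["raceName"], total_laps, race["date"]


name, laps, date = race_summary(json.loads(SAMPLE))
print(name, laps, date)
```

The point is only that every value sits a few levels down inside `MRData`, and that numbers such as the lap count arrive as strings and need converting.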

The next step is to access the positions and lap times of each of the cars as they go through the race. This requires working incrementally through each lap's worth of data, building up a full library of the lap times for the race.

RaceRound = 9;
RaceYear = 2016;
for lap = 1:totalLaps
    url = sprintf('',RaceYear,RaceRound,lap);
    tempdata = webread(url,weboptions('Timeout',30));
    if lap == 1
        % first lap: start the struct array
        lapTimes = tempdata.MRData.RaceTable.Races.Laps;
    else
        % later laps: append to it
        lapTimes = [lapTimes tempdata.MRData.RaceTable.Races.Laps];
    end
end

This gave me a structure of all the lap times and the positions of each driver, but the data in that form was not much use for plotting, so another step was needed to extract the information into a more useful form. This involved generating two matrices, one of lap times and one of positions, in which each column is allocated to a driver based on their position at the end of the first lap and each row holds the data for one lap.

%% extract the driver IDs
% one row per lap, one column per driver
lapResults = zeros(size(lapTimes,2),size(lapTimes(1).Timings,1));
driverPos = NaN(size(lapTimes,2),size(lapTimes(1).Timings,1)); % NaN until a position is recorded

driverList = cell(1,size(lapTimes(1).Timings,1));
for id = 1:size(lapTimes(1).Timings,1)
    driverList{id} = lapTimes(1).Timings(id).driverId;
end

%% extract the lap times and positions
for lap = 1:size(lapTimes,2)
    for driverLap = 1:size(lapTimes(lap).Timings,1)
        for cDriverList = 1:size(driverList,2)
            if strcmp(driverList{cDriverList},lapTimes(lap).Timings(driverLap).driverId)
                % the lap time arrives as 'm:ss.mmm'; convert it to seconds
                t = lapTimes(lap).Timings(driverLap).time;
                lapResults(lap,cDriverList) = str2double(t(1:end-7)).*60 + str2double(t(end-5:end-4)) + str2double(t(end-3:end));
                driverPos(lap,cDriverList) = str2double(lapTimes(lap).Timings(driverLap).position);
            end
        end
    end
end
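
The same reshaping can be sketched in Python, which makes the two steps (parsing the 'm:ss.mmm' lap time and slotting each timing into the column for its driver) a little easier to follow. The driver IDs and times below are made up for illustration, and `lap_time_seconds` and `build_tables` are hypothetical helper names. Missing positions are stored as `None` here, where the MATLAB code uses NaN.

```python
def lap_time_seconds(text):
    """Convert a lap time such as '1:12.345' (m:ss.mmm) to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def build_tables(laps):
    """laps: list of laps, each a list of dicts with driverId, position, time.

    Returns (drivers, times, positions) where times[lap][driver] is in seconds
    and the driver order is taken from the first lap.
    """
    drivers = [timing["driverId"] for timing in laps[0]]
    column = {driver: i for i, driver in enumerate(drivers)}
    times = [[0.0] * len(drivers) for _ in laps]
    positions = [[None] * len(drivers) for _ in laps]
    for row, lap in enumerate(laps):
        for timing in lap:
            col = column.get(timing["driverId"])
            if col is None:
                continue  # driver not seen on lap 1, so no column for them
            times[row][col] = lap_time_seconds(timing["time"])
            positions[row][col] = int(timing["position"])
    return drivers, times, positions


# Made-up example: two drivers who swap places on the second lap.
laps = [
    [{"driverId": "driver_a", "position": "1", "time": "1:12.345"},
     {"driverId": "driver_b", "position": "2", "time": "1:13.000"}],
    [{"driverId": "driver_b", "position": "1", "time": "1:10.500"},
     {"driverId": "driver_a", "position": "2", "time": "1:12.000"}],
]
drivers, times, positions = build_tables(laps)
print(drivers, positions)
```

Using a dictionary to look up the column replaces the innermost search loop over the driver list.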

Now I’ve got the data in a more useful format, it’s time to generate some useful visualizations from it all.

%% plot the drivers positions
figure
plot(driverPos) % one line per driver, position against lap number
grid on
graphTitle = sprintf('Driver Positions in race at %s on %s', raceTrackName,datestr(raceDate));
title(graphTitle)

Austrian GP Driver Positions – Figure 1

The second plot I wanted to create looks at the actual time differences between the drivers, in particular the top 4 finishers in this race.

%% plot for total lap times
lapSum = zeros(size(lapResults,1),size(lapResults,2));
lapDiff = zeros(size(lapResults,1),size(lapResults,2));
lapSum(1,:) = lapResults(1,:);

refDriver = 1; % column of the reference driver
lapDiff(1,:) = lapSum(1,refDriver) - lapSum(1,:);
for lap = 2:(size(lapTimes,2))
    % running total of race time, and the gap to the reference driver
    lapSum(lap,:) = lapSum(lap-1,:)+lapResults(lap,:);
    lapDiff(lap,:) = lapSum(lap,refDriver) - lapSum(lap,:);
end
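
As a rough sketch of the same calculation in Python: keep a running total of each driver's race time, then subtract every driver's total from the reference driver's total on each lap. The lap times are made up, and `gaps_to_reference` is a hypothetical name.

```python
from itertools import accumulate


def gaps_to_reference(lap_times, ref=0):
    """lap_times[lap][driver] in seconds -> gap[lap][driver] to driver `ref`.

    A positive gap means the driver has used less total time than the
    reference driver, so is ahead on the road.
    """
    n_laps = len(lap_times)
    n_drivers = len(lap_times[0])
    # Cumulative race time for each driver, one running total per column.
    totals = [list(accumulate(row[d] for row in lap_times))
              for d in range(n_drivers)]
    return [[totals[ref][lap] - totals[d][lap] for d in range(n_drivers)]
            for lap in range(n_laps)]


# Made-up lap times for two drivers over two laps.
lap_times = [[72.0, 73.0],
             [71.0, 70.5]]
gaps = gaps_to_reference(lap_times)
print(gaps)  # [[0.0, -1.0], [0.0, -0.5]]
```

The second driver starts a second behind and claws back half of it on lap two, which is exactly the sort of movement the plot is meant to show.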

%% plot the difference between the selected drivers
figure
plot(lapDiff) % all drivers; index the columns to show only selected ones
grid on
graphTitle = sprintf('Cars around %s at %s on %s', driverList{refDriver}, raceTrackName,datestr(raceDate));
title(graphTitle)
ylabel('Time difference (Seconds)')

Austrian GP Driver Time – Figure 2

This plot shows how the time gaps between the drivers develop over the race, through to the actual gaps at the finish, using Hamilton, who won, as the reference. A positive value means the driver is in front of Hamilton and a negative value means the driver is behind; all times shown are in seconds.

A copy of all the code I used is available here in a text file. You just need to change the extension to “.m” and run the script. It has only been tested in MATLAB 2016a on OS X.

Now to see if I can think of any other ideas for what to do with the data I have access to through this API.

The #EuRef on twitter

Well, last week we had a referendum on whether we should stay in the EU or not, and well… 52% of people decided we should leave for some reason. I decided to take a look on Twitter and see what I could conclude from what people were posting in their tweets.

The volume of tweets collected for #EuRef and the average sentiment

Over 30 hours, using the Twitter search API and MATLAB, I collected almost 2 million tweets containing the hashtag #EuRef. I then calculated the sentiment for each of the tweets and grouped the results into 15-minute averages and volumes, shown in the graph above.
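
The grouping step can be sketched roughly as follows, in Python rather than the MATLAB I actually used. The timestamps and sentiment scores are made up, and `bin_start` and `summarise` are hypothetical names; it is only meant to show how each tweet is rounded down to the start of its 15-minute window before averaging.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def bin_start(ts):
    """Round a timestamp down to the start of its 15-minute bin."""
    return ts - timedelta(minutes=ts.minute % 15,
                          seconds=ts.second,
                          microseconds=ts.microsecond)


def summarise(tweets):
    """tweets: iterable of (datetime, sentiment) -> {bin start: (volume, mean)}."""
    scores = defaultdict(list)
    for ts, score in tweets:
        scores[bin_start(ts)].append(score)
    return {start: (len(values), sum(values) / len(values))
            for start, values in sorted(scores.items())}


# Made-up tweets: two fall in the 22:00 bin and one in the 22:15 bin.
tweets = [
    (datetime(2016, 6, 23, 22, 1, 5), -1.0),
    (datetime(2016, 6, 23, 22, 14, 59), 0.5),
    (datetime(2016, 6, 23, 22, 15, 0), -2.0),
]
summary = summarise(tweets)
for start, (volume, mean) in summary.items():
    print(start, volume, mean)
```

Splitting the volumes into positive and negative tweets is the same idea with two counters per bin instead of one list.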

The average sentiment seemed very negative across all the tweets, even the ones from before the result of the vote was announced on the Friday morning. So I split the tweet volumes into the number of tweets with a positive sentiment and the number with a negative sentiment. There were a lot more negative tweets than positive, although the peaks in each were in similar places during the days.

Volumes of negative and positive sentiment tweets

This says a lot about how negative all the talk about the voting was even before the result was announced, but I decided I needed to look in more detail at how the sentiment was distributed at each point in time. To do this I plotted a histogram of the sentiment for each 15-minute period against time.

#EuRef Sentiment distribution

This shows that the tweets were consistently distributed towards negative sentiment throughout. The colour scale is a percentage, based on the total number of tweets in each 15-minute segment.
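
For one 15-minute segment, the percentage scale works out something like the sketch below. It is in Python with made-up scores and bin edges, and `percentage_histogram` is a hypothetical name; the sentiment range and bin widths I actually used are not shown here.

```python
def percentage_histogram(scores, edges):
    """Percentage of scores falling in each [edges[i], edges[i+1]) interval.

    Percentages are relative to the total number of scores in the segment.
    """
    counts = [0] * (len(edges) - 1)
    for score in scores:
        for i in range(len(counts)):
            if edges[i] <= score < edges[i + 1]:
                counts[i] += 1
                break
    total = len(scores)
    return [100.0 * count / total if total else 0.0 for count in counts]


# Made-up sentiment scores for one segment, and made-up bin edges.
scores = [-3, -1, -1, 2]
edges = [-4, -2, 0, 2, 4]
print(percentage_histogram(scores, edges))  # [25.0, 50.0, 0.0, 25.0]
```

Stacking one of these columns per segment side by side gives the image, with colour standing in for the percentage.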

I’m hoping to do the same analysis on the #brexit hashtag, but collecting those tweets is taking a bit longer as there are a few more of them!


Apple Watch

Following the price drop and the new bands announced at the event last month, I caved and bought myself an Apple Watch. I opted for the Apple Watch Sport 38mm in space grey, with the Black Woven Nylon Band.

My Apple Watch Sport

After the initial play with all the settings I’ve settled on a red watch face, with a small amount of information around the edge. The most useful items I’ve found so far have been: remaining battery, day of the month, sunrise/sunset time, and the current weather.

Next task now is to work out the best apps to get on my watch…

Overclocking my Raspberry Pi cluster

With my newly created Raspberry Pi cluster being made out of Raspberry Pi 2s, I don’t want it to lag too far behind the freshly announced Raspberry Pi 3. To try and keep up with the new hardware without replacing all of my Pis, I decided it was worth having a go at overclocking them.

Overclocking my Pis should allow them to work a bit quicker, but it will reduce power efficiency and increase the temperature that the cluster runs at. Overclocking can be a dangerous game: if you run the Pis too hard you could easily cause hardware damage to the boards if the temperature gets too high. Thankfully there are plenty of guides on the Internet to talk you through such tasks, which is what I used.

All you have to do is edit the /boot/config.txt file:

sudo nano /boot/config.txt

I then added the following at the end of the file, which increases not only the CPU clock frequency but also the SDRAM and core frequencies. It also enables over-voltage and sets an upper temperature limit.

temp_limit=80 #Will throttle to default clock speed if hit

After which all there is to do is reboot the Raspberry Pi with

sudo reboot

Now it's time to test some code on my cluster and see what the performance increase is.

Raspberry Pi Cluster

Octave on my Raspberry Pi Cluster

The main aim of building my Raspberry Pi Cluster was so that I could run Octave code on it. I’m not expecting to break any records, but ideally I want some performance gain. The best way I found to do this was with the parallel package.

With Octave and the parallel package installed on each of the Raspberry Pi 2s in the cluster, I set up SSH so that no passwords are required between the nodes. The first step in running the Octave scripts was to launch the parallel server on each of the 3 other Raspberry Pis that I’m not connected to:

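HOSTS="rpi1 rpi2 rpi3"
SCRIPT="cd Documents/Octave; octave -q clusterStart.m"
for HOSTNAME in ${HOSTS} ; do
    # start the server on each node in the background (relies on passwordless SSH)
    ssh "${HOSTNAME}" "${SCRIPT}" &
done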

This launches Octave and runs the following script:

pkg load parallel

With the server now running on each of the 3 other Pis, it is possible to send simple commands to them and get the results returned to the first Pi. The Octave script that I used to test this out is below:

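pkg load parallel
% rpi0 runs this script; rpi1 to rpi3 are running the server
hosts = { 'rpi0', 'rpi1', 'rpi2', 'rpi3' };
sockets = connect(hosts);
psum = zeros(1,3);
% ask each remote Pi to evaluate a partial sum and send it back to rpi0
reval( "send(sum([1:10]),sockets(1,:))", sockets(2,:));
reval( "send(sum([11:20]),sockets(1,:))", sockets(3,:));
reval( "send(sum([21:30]),sockets(1,:))",sockets(4,:));
% collect the three partial sums, then add them together and display the total
psum(1) = recv(sockets(2,:));
psum(2) = recv(sockets(3,:));
psum(3) = recv(sockets(4,:));
disp(sum(psum))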

This sets up the sockets for all 4 Raspberry Pis, including the one the code is run on (rpi0) and the 3 that are running the pserver (rpi1, rpi2 and rpi3). The code then sends commands as strings to be evaluated on the pservers, which in this case are simple summations. Each pserver returns its value to rpi0, which adds the 3 values together and displays the total on the screen.
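
The pattern here is a simple scatter and gather: split the work into chunks, evaluate each chunk somewhere else, then combine the partial results. As a rough illustration only, here is the same arithmetic in Python using a local thread pool as a stand-in for the three remote Pis. It does not use sockets or the Octave parallel package, and `partial_sum` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(bounds):
    """Sum the integers from start to stop inclusive, like sum(start:stop)."""
    start, stop = bounds
    return sum(range(start, stop + 1))


# One chunk per worker, matching the three ranges sent to rpi1, rpi2 and rpi3.
chunks = [(1, 10), (11, 20), (21, 30)]

# Local threads stand in for the remote nodes in this sketch.
with ThreadPoolExecutor(max_workers=3) as pool:
    psum = list(pool.map(partial_sum, chunks))

total = sum(psum)
print(psum, total)  # [55, 155, 255] 465
```

On the cluster the interesting part is that each chunk runs on a different board; the combining step on rpi0 is the same single sum.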

The next step for me in this project is to get it running my Twitter Sentiment Code.