How to plot flight routes using Plotly
Intro
In this post, I show how to plot a series of flight routes using a great tool called Plotly; you can find plenty of tutorials, examples and applications on its official website. With a few lines of Python, it is possible to create a dynamic, interactive chart of airports and the routes between them.
Requirements
To plot the chart, we will need a few things:
- Python 3 installed on your machine, preferably in a virtual environment, together with the pandas, plotly and geographiclib packages imported by the script;
- A Plotly account (to get your plot online). Otherwise, you can use the offline option;
- The data! All the data I use on this website is collected from open sources, so anyone can reproduce everything I do, step by step. Visit OpenFlights.org and download the following files:
airports.dat and routes.dat
The airports.dat file will look like this:
AirportID | Name | City | Country | IATA | ICAO | Latitude | Longitude | Altitude | Timezone | DST | Tz | Type | Source |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | “Goroka Airport” | “Goroka” | “Papua New Guinea” | “GKA” | “AYGA” | -6.081689834590001 | 145.391998291 | 5282 | 10 | “U” | “Pacific/Port_Moresby” | “airport” | “OurAirports” |
2 | “Madang Airport” | “Madang” | “Papua New Guinea” | “MAG” | “AYMD” | -5.20707988739 | 145.789001465 | 20 | 10 | “U” | “Pacific/Port_Moresby” | “airport” | “OurAirports” |
3 | “Mount Hagen Kagamuga Airport” | “Mount Hagen” | “Papua New Guinea” | “HGU” | “AYMH” | -5.826789855957031 | 144.29600524902344 | 5388 | 10 | “U” | “Pacific/Port_Moresby” | “airport” | “OurAirports” |
4 | “Nadzab Airport” | “Nadzab” | “Papua New Guinea” | “LAE” | “AYNZ” | -6.569803 | 146.725977 | 239 | 10 | “U” | “Pacific/Port_Moresby” | “airport” | “OurAirports” |
5 | “Port Moresby Jacksons International Airport” | “Port Moresby” | “Papua New Guinea” | “POM” | “AYPY” | -9.443380355834961 | 147.22000122070312 | 146 | 10 | “U” | “Pacific/Port_Moresby” | “airport” | “OurAirports” |
Whereas the routes.dat file will contain something similar to this:
Airline | Airline ID | Source airport | Source airport ID | Destination airport | Destination airport ID | Codeshare | Stops | Equipment |
---|---|---|---|---|---|---|---|---|
2B | 410 | AER | 2965 | KZN | 2990 | | 0 | CR2 |
2B | 410 | ASF | 2966 | KZN | 2990 | | 0 | CR2 |
2B | 410 | ASF | 2966 | MRV | 2962 | | 0 | CR2 |
2B | 410 | CEK | 2968 | KZN | 2990 | | 0 | CR2 |
2B | 410 | CEK | 2968 | OVB | 4078 | | 0 | CR2 |
The former contains information about each airport, such as its name, location, and ICAO and IATA codes. We are interested in the unique identifier (the IATA code) and the position (GPS coordinates). The latter contains the routes flown by “all” the publicly listed flights around the planet, as of June 2014. We will take only the origin airport (Source airport) and the destination airport (Destination airport) of each route. These are used as keys to link the route to the origin and destination locations. In other words, knowing the route (e.g. LHR-IST, namely London Heathrow to Istanbul Atatürk), we can append the GPS coordinates of the starting and ending points of the flight trajectory.
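The key-based link described above can be sketched without any third-party library. This is a minimal illustration, not the script used in this post: the airport coordinates are rounded from the airports.dat sample shown earlier, while the `routes` list and the `attach_coordinates` helper are hypothetical and made up for the example.

```python
# Airport coordinates keyed by IATA code (rounded from the airports.dat sample)
airports = {
    'GKA': (-6.0817, 145.3920),
    'MAG': (-5.2071, 145.7890),
    'POM': (-9.4434, 147.2200),
}

# Hypothetical (origin, destination) pairs, invented for illustration only;
# 'XXX' stands for an airport code that is missing from the airports table
routes = [('GKA', 'MAG'), ('POM', 'GKA'), ('POM', 'XXX')]


def attach_coordinates(routes, airports):
    """Return one record per route whose two endpoints are both known airports."""
    enriched = []
    for orig, dest in routes:
        # Skipping unknown codes has the same effect as an inner join
        if orig in airports and dest in airports:
            lat_o, lon_o = airports[orig]
            lat_d, lon_d = airports[dest]
            enriched.append({
                'route': f'{orig}-{dest}',
                'lat': [lat_o, lat_d],
                'lon': [lon_o, lon_d],
            })
    return enriched


enriched = attach_coordinates(routes, airports)
for record in enriched:
    print(record)
```

The route with the unknown code is dropped, which mirrors what an inner merge on the IATA column does in pandas.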
Code
Now that we know what we need, we’ll write the code to read the data into Pandas dataframes, filter the routes, attach the airport coordinates, and then plot the result using Plotly. In this case I used the offline option, which stores the plot in an HTML file, although it is also possible to store it online using Plotly’s website. Here is the code:
# Import libraries
import pandas as pd
#import plotly.plotly as py  # online option; from Plotly 4 onwards this module lives in the separate chart_studio package
import plotly.offline as ol
from geographiclib.geodesic import Geodesic
geod = Geodesic.WGS84
# Define function to calculate distance (in meters) between two points
def dist(p1Lat, p1Lon, p2Lat, p2Lon):
    return geod.Inverse(p1Lat, p1Lon, p2Lat, p2Lon, Geodesic.DISTANCE)['s12']
# Read the routes into a dataframe (specifying the column names)
df = pd.read_csv('routes.dat', sep=',', header=None,
                 names=['Airline','Airline ID','Source airport','Source airport ID',
                        'Destination airport','Destination airport ID','Codeshare',
                        'Stops','Equipment'])
# Remove duplicates (only one trajectory per route)
df = df[['Source airport','Destination airport']].drop_duplicates(keep='first', inplace=False)
# Read the airports into a dataframe (specifying the column names)
df_airports = pd.read_csv('airports.dat', sep=',', header=None,
                          names=['Airport','Name','City','Country','IATA','ICAO',
                                 'Latitude','Longitude','Altitude','Timezone','DST',
                                 'Tz','Type','Source'])
# Select only those routes starting or ending from a London Airport
# in order London City, Heathrow, Gatwick, Luton, Stansted, Southend
df = df.loc[(df['Source airport'].isin(['LCY','LHR','LGW','LTN','STN','SEN'])) | (df['Destination airport'].isin(['LCY','LHR','LGW','LTN','STN','SEN']))]
# Append the origin airport's coordinates to the routes' dataframe
# (no column names clash yet, so the suffixes are not applied by this merge)
df = pd.merge(df, df_airports[['IATA','Latitude','Longitude']],
              how='inner', left_on='Source airport', right_on='IATA', suffixes=('_Orig','_Dest'))
# Append the destination airport's coordinates to the routes' dataframe
# (IATA/Latitude/Longitude now clash: the existing columns get '_Orig', the new ones '_Dest')
df = pd.merge(df, df_airports[['IATA','Latitude','Longitude']],
              how='inner', left_on='Destination airport', right_on='IATA', suffixes=('_Orig','_Dest'))
# Keep only Origin/Destination IATA ID columns, and their Latitude/Longitude
df = df.drop(columns=['Source airport','Destination airport'])
# Calculate the distance in km (geodesic distance) between the origin and destination airports
df['Distance'] = 0.0
for index, row in df.iterrows():
    df.at[index, 'Distance'] = dist(row.Latitude_Orig, row.Longitude_Orig, row.Latitude_Dest, row.Longitude_Dest) / 1000
# Initialise the data list that will be used to feed the plot
data = []
# Append all airports (blue dots) to the map
data.append(
    dict(
        type = 'scattergeo',
        locationmode = 'ISO-3',
        showlegend = False,
        lon = df_airports['Longitude'],
        lat = df_airports['Latitude'],
        hoverinfo = 'text',
        text = df_airports['IATA'],
        mode = 'markers',
        marker = dict(
            size = 2,
            color = 'rgb(0, 0, 255)',
            line = dict(
                width = 3,
                color = 'rgba(68, 68, 68, 0)'
            )
        )
    )
)
# Append the longest route to the map
data.append(
    dict(
        type = 'scattergeo',
        locationmode = 'ISO-3',
        name = 'Longest Route',
        showlegend = True,
        lon = [ df.loc[df['Distance']==df['Distance'].max(),'Longitude_Orig'].values[0],
                df.loc[df['Distance']==df['Distance'].max(),'Longitude_Dest'].values[0] ],
        lat = [ df.loc[df['Distance']==df['Distance'].max(),'Latitude_Orig'].values[0],
                df.loc[df['Distance']==df['Distance'].max(),'Latitude_Dest'].values[0] ],
        mode = 'lines',
        line = dict(
            width = 2,
            color = 'red',
        ),
        opacity = 1,
    )
)
# Append all other routes to the map
for i in range(len(df)):
    data.append(
        dict(
            type = 'scattergeo',
            locationmode = 'ISO-3',
            name = str(df['IATA_Orig'][i]) + ' - ' + str(df['IATA_Dest'][i]),
            # Only routes departing from a London airport are listed in the legend
            showlegend = df['IATA_Orig'][i] in ['LCY','LHR','LGW','LTN','STN','SEN'],
            lon = [ df['Longitude_Orig'][i], df['Longitude_Dest'][i] ],
            lat = [ df['Latitude_Orig'][i], df['Latitude_Dest'][i] ],
            mode = 'lines',
            line = dict(
                width = 1,
                color = 'green',
            ),
            opacity = 0.3,
        )
    )
# Define the plot's layout
layout = dict(
    title = 'Airports and Routes',
    showlegend = True,
    geo = dict(
        scope = 'world',
        projection = dict( type='azimuthal equal area' ),
        showland = True,
        landcolor = 'rgb(255, 255, 255)',
        countrycolor = 'rgb(0, 0, 0)',
    ),
)
# Create the figure to be plotted
fig = dict( data=data, layout=layout )
#py.plot(fig, world_readable=True)
ol.plot(fig, filename='Airports and routes.html')
Once the script has been run, an HTML file will be created with the plot stored in it.
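The route traces in the script repeat the same dictionary literal with different coordinates and styles. One way to cut the repetition is a small helper that builds a `scattergeo` line trace, combined with `max` to pick the longest route only once. This is only a sketch: `route_trace` and the sample `routes` records (with placeholder names, coordinates and distances) are hypothetical and not part of the original script; the dictionary keys are the same ones the script passes to Plotly.

```python
def route_trace(name, lats, lons, color='green', width=1, opacity=0.3, showlegend=True):
    """Build a Plotly 'scattergeo' line trace (as a plain dict) for one route."""
    return dict(
        type='scattergeo',
        name=name,
        showlegend=showlegend,
        lat=list(lats),
        lon=list(lons),
        mode='lines',
        line=dict(width=width, color=color),
        opacity=opacity,
    )


# Hypothetical records; names, coordinates and distances are placeholders, not real data
routes = [
    {'name': 'AAA - BBB', 'lat': [51.5, 41.0], 'lon': [-0.5, 28.8], 'distance': 2500.0},
    {'name': 'AAA - CCC', 'lat': [51.5, 1.4], 'lon': [-0.5, 104.0], 'distance': 10900.0},
]

# Pick the longest route once instead of filtering on the maximum several times
longest = max(routes, key=lambda r: r['distance'])

# Red trace for the longest route first, then one green trace per route
data = [route_trace('Longest Route', longest['lat'], longest['lon'],
                    color='red', width=2, opacity=1)]
data += [route_trace(r['name'], r['lat'], r['lon']) for r in routes]
```

A list built this way can be combined with a layout as `dict(data=data, layout=layout)`, in the same way the script builds its figure.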
Should you want to create an online version, you will need to create a Plotly account and make the plot publicly available (world_readable=True).
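The `dist` helper in the script relies on geographiclib, which computes the distance on the WGS84 ellipsoid. As a rough cross-check, the distance can also be approximated with the haversine formula on a sphere. The sketch below is an alternative under stated assumptions, not what the script uses: the function name `haversine_km` and the mean Earth radius of 6371.0088 km are choices made for this example, and the result can differ from the ellipsoidal value by a fraction of a percent.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; assumed value for a spherical model


def haversine_km(lat1, lon1, lat2, lon2):
    """Approximate great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against rounding errors for near-antipodal points
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# A quarter of the equator: from (0, 0) to (0, 90 degrees East)
quarter = haversine_km(0.0, 0.0, 0.0, 90.0)
print(round(quarter, 1))  # about 10007.6 km with the radius assumed above
```

For a plot like this one, where the distance is only used to find the longest route, the spherical approximation is likely to be sufficient, although routes of very similar length could be ranked differently.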
Any comments? Let me know!