How to plot flight routes using Plotly

Intro

In this post, I show how to plot a series of flight routes using a great tool called Plotly; you can find plenty of tutorials, examples and applications on its official website. With a few lines of Python, it is possible to create a dynamic, good-looking chart of airports and the routes between them.

Requirements

To plot the chart, we will need a few things:

  1. Python 3 installed on your machine, ideally in a virtual environment, together with the pandas, plotly and geographiclib packages imported by the script;
  2. A Plotly account (to publish your plot online). Otherwise, you can use the offline option;
  3. The data! All the data I use on this website is collected from open sources, so anyone can reproduce everything I do, step by step. Visit OpenFlights.org and download the following files:

airports.dat and routes.dat

The airports.dat file will look like this (the column names in the first line are shown for reference only; the file itself has no header row):

AirportID,Name,City,Country,IATA,ICAO,Latitude,Longitude,Altitude,Timezone,DST,Tz,Type,Source
1,"Goroka Airport","Goroka","Papua New Guinea","GKA","AYGA",-6.081689834590001,145.391998291,5282,10,"U","Pacific/Port_Moresby","airport","OurAirports"
2,"Madang Airport","Madang","Papua New Guinea","MAG","AYMD",-5.20707988739,145.789001465,20,10,"U","Pacific/Port_Moresby","airport","OurAirports"
3,"Mount Hagen Kagamuga Airport","Mount Hagen","Papua New Guinea","HGU","AYMH",-5.826789855957031,144.29600524902344,5388,10,"U","Pacific/Port_Moresby","airport","OurAirports"
4,"Nadzab Airport","Nadzab","Papua New Guinea","LAE","AYNZ",-6.569803,146.725977,239,10,"U","Pacific/Port_Moresby","airport","OurAirports"
5,"Port Moresby Jacksons International Airport","Port Moresby","Papua New Guinea","POM","AYPY",-9.443380355834961,147.22000122070312,146,10,"U","Pacific/Port_Moresby","airport","OurAirports"

The routes.dat file, in turn, will contain something similar to this (again, the first line only names the columns; the Codeshare field is empty in these rows):

Airline,Airline ID,Source airport,Source airport ID,Destination airport,Destination airport ID,Codeshare,Stops,Equipment
2B,410,AER,2965,KZN,2990,,0,CR2
2B,410,ASF,2966,KZN,2990,,0,CR2
2B,410,ASF,2966,MRV,2962,,0,CR2
2B,410,CEK,2968,KZN,2990,,0,CR2
2B,410,CEK,2968,OVB,4078,,0,CR2
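
Before running the full script, it can help to check how pandas parses a couple of these rows. The snippet below is a minimal sketch, not part of the original script: it assumes the raw file is comma-separated with no header row (as the `sep=','` and `header=None` arguments used later suggest) and that `\N` marks missing values in the OpenFlights files. The `sample` string and the `na_values` argument are my own additions for illustration.

```python
import io

import pandas as pd

# Two routes.dat rows from the listing above, written as raw comma-separated text
sample = """2B,410,AER,2965,KZN,2990,,0,CR2
2B,410,ASF,2966,KZN,2990,,0,CR2
"""

columns = ['Airline', 'Airline ID', 'Source airport', 'Source airport ID',
           'Destination airport', 'Destination airport ID', 'Codeshare',
           'Stops', 'Equipment']

# header=None because the file has no header row; na_values is an assumption
# about how missing values are marked in the raw data
routes = pd.read_csv(io.StringIO(sample), sep=',', header=None,
                     names=columns, na_values=[r'\N'])

# The empty Codeshare field is read as NaN; the other columns keep their values
print(routes[['Source airport', 'Destination airport', 'Codeshare', 'Stops']])
```

Replacing `io.StringIO(sample)` with the path to the downloaded file should give the same columns for the whole dataset.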

The former contains information about each airport, such as its name, location, and ICAO and IATA codes. We are interested in the unique identifier (the IATA code) and the position (latitude and longitude). The latter contains the routes travelled by “all” the flights publicly available throughout the planet, as of June 2014. From it we will take only the origin airport (Source airport) and the destination airport (Destination airport). These will be used as keys to link each route to its origin and destination locations. In other words, knowing the route (e.g. LHR-IST, namely London Heathrow to Istanbul Atatürk), we can append the GPS coordinates of the starting and ending points of the flight trajectory.
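
To make the linking step concrete, here is a small sketch of the idea on a single route. It is an illustration under stated assumptions rather than the script itself: the `add_coordinates` helper is hypothetical, and the coordinates are rounded values used only as an example.

```python
import pandas as pd

# One illustrative route: London Heathrow to Istanbul Ataturk
routes = pd.DataFrame({'Source airport': ['LHR'],
                       'Destination airport': ['IST']})

# Rounded coordinates, for illustration only
airports = pd.DataFrame({'IATA': ['LHR', 'IST'],
                         'Latitude': [51.47, 40.98],
                         'Longitude': [-0.46, 28.81]})


def add_coordinates(routes, airports, key, suffix):
    """Attach airport coordinates to each route, matching on the IATA code.

    Hypothetical helper: `key` is the route column holding the IATA code and
    `suffix` is appended to the airport columns so that the origin and the
    destination do not clash.
    """
    coords = airports[['IATA', 'Latitude', 'Longitude']].add_suffix(suffix)
    return routes.merge(coords, how='inner',
                        left_on=key, right_on='IATA' + suffix)


linked = add_coordinates(routes, airports, 'Source airport', '_Orig')
linked = add_coordinates(linked, airports, 'Destination airport', '_Dest')

print(linked[['IATA_Orig', 'Latitude_Orig', 'Longitude_Orig',
              'IATA_Dest', 'Latitude_Dest', 'Longitude_Dest']])
```

Because the merge is an inner join, any route whose airport code is missing from the airports table is silently dropped, which is worth keeping in mind when counting routes.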

Code

Now that we know what we need, we’ll write the code to read the data into Pandas dataframes, keep only the routes touching a London airport, attach the airport coordinates, compute each route’s length, and then plot everything using Plotly. In this case I used the offline option to store the plot in an HTML file, although it is possible to store it online using Plotly’s website. Here is the code:

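# Import libraries
import pandas as pd
# import chart_studio.plotly as py  # online option; this module was plotly.plotly before Plotly 4
import plotly.offline as ol
from geographiclib.geodesic import Geodesic

# WGS84 ellipsoid model used for the distance calculation
geod = Geodesic.WGS84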

# Define function to calculate distance (in meters) between two points
def dist(p1Lat, p1Lon, p2Lat, p2Lon):
    return geod.Inverse(p1Lat, p1Lon, p2Lat, p2Lon, Geodesic.DISTANCE)['s12']

# Read the routes into a dataframe (specifying the column names, as the file has no header row)
df = pd.read_csv('routes.dat', sep=',', header=None,
                 names=['Airline','Airline ID','Source airport','Source airport ID',
                        'Destination airport','Destination airport ID','Codeshare',
                        'Stops','Equipment'])

# Remove duplicates (only one trajectory per route)
df = df[['Source airport','Destination airport']].drop_duplicates()

# Read the airports into a dataframe (specifying the column names, as the file has no header row)
df_airports = pd.read_csv('airports.dat', sep=',', header=None,
                          names=['Airport','Name','City','Country','IATA','ICAO',
                                 'Latitude','Longitude','Altitude','Timezone','DST',
                                 'Tz','Type','Source'])

# Select only the routes starting or ending at a London airport:
# London City, Heathrow, Gatwick, Luton, Stansted, Southend
london = ['LCY','LHR','LGW','LTN','STN','SEN']
df = df.loc[df['Source airport'].isin(london) | df['Destination airport'].isin(london)]

# Append the origin airport's coordinates to the routes' dataframe
df = pd.merge(df, df_airports[['IATA','Latitude','Longitude']],
              how='inner', left_on='Source airport', right_on='IATA', suffixes=('_Orig','_Dest'))

# Append the destination airport's coordinates to the routes' dataframe
df = pd.merge(df, df_airports[['IATA','Latitude','Longitude']],
              how='inner', left_on='Destination airport', right_on='IATA', suffixes=('_Orig','_Dest'))

# Keep only Origin/Destination IATA ID columns, and their Latitude/Longitude
df = df.drop(columns=['Source airport','Destination airport'])

# Calculate the geodesic distance (in km) between the origin and destination airports
df['Distance'] = df.apply(
    lambda row: dist(row['Latitude_Orig'], row['Longitude_Orig'],
                     row['Latitude_Dest'], row['Longitude_Dest']) / 1000,
    axis=1)

# Initialise the data list that will be used to feed the plot
data = []

# Append all airports (blue dots) to the map
data.append(dict(
                type = 'scattergeo',
                locationmode = 'ISO-3',
                showlegend = False,
                lon = df_airports['Longitude'],
                lat = df_airports['Latitude'],
                hoverinfo = 'text',
                text = df_airports['IATA'],
                mode = 'markers',
                marker = dict(
                    size=2,
                    color='rgb(0, 0, 255)',
                    line = dict(
                        width=3,
                        color='rgba(68, 68, 68, 0)'
                    )
                ))
        )

# Append the longest route to the map
# (idxmax returns the first row holding the maximum distance)
longest = df.loc[pd.to_numeric(df['Distance']).idxmax()]
data.append(
        dict(
            type = 'scattergeo',
            locationmode = 'ISO-3',
            name = 'Longest Route',
            showlegend = True,
            lon = [ longest['Longitude_Orig'], longest['Longitude_Dest'] ],
            lat = [ longest['Latitude_Orig'], longest['Latitude_Dest'] ],
            mode = 'lines',
            line = dict(
                width = 2,
                color = 'red',
            ),
            opacity = 1,
        )
    )

# Append every route to the map, one trace per route
# (the longest route is included here too, drawn again in green)
for row in df.itertuples(index=False):
    data.append(
        dict(
            type = 'scattergeo',
            locationmode = 'ISO-3',
            name = str(row.IATA_Orig) + ' - ' + str(row.IATA_Dest),
            # Only routes departing from a London airport are listed in the legend
            showlegend = row.IATA_Orig in ['LCY','LHR','LGW','LTN','STN','SEN'],
            lon = [ row.Longitude_Orig, row.Longitude_Dest ],
            lat = [ row.Latitude_Orig, row.Latitude_Dest ],
            mode = 'lines',
            line = dict(
                width = 1,
                color = 'green',
            ),
            opacity = 0.3,
        )
    )

# Define the plot's layout
layout = dict(
        title = 'Airports and Routes',
        showlegend = True,
        geo = dict(
            scope='world',
            projection=dict( type='azimuthal equal area' ),
            showland = True,
            landcolor = 'rgb(255, 255, 255)',
            countrycolor = 'rgb(0, 0, 0)',
        ),
    )

# Create the figure to be plotted
fig = dict( data=data, layout=layout )
#py.plot(fig, world_readable=True)
ol.plot(fig, filename='Airports and routes.html')

Once the script has been run, an HTML file containing the plot will be created. Should you want to create an online version, you will need a Plotly account, and you will have to make the plot publicly available (world_readable=True).
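
As a side note on the distance step: the script relies on geographiclib, which measures distances on the WGS84 ellipsoid. If you want a quick, dependency-free sanity check of those numbers, a haversine calculation on a sphere is a common approximation. The sketch below is my own addition, not part of the script; `haversine_km` and the mean Earth radius constant are assumptions for illustration, and the results can be expected to differ from the geodesic values by a fraction of a percent.

```python
import math

# Mean Earth radius in km; treating the Earth as a sphere is an approximation
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding errors for near-antipodal points
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Rounded, illustrative coordinates for London Heathrow and Istanbul Ataturk
print(round(haversine_km(51.47, -0.46, 40.98, 28.81)), 'km')
```

Comparing a handful of routes this way is usually enough to spot swapped latitude and longitude columns, which would produce wildly different distances.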

Any comments? Let me know!
