You can download the updated notebook from my GitHub page.
This project creates a dashboard that shows CO2 emissions, built in Python with Panel and hvPlot. The data is obtained from Our World in Data. The list of the top 10 countries by share of global manufacturing output in 2020 is taken from Statista.
I found Panel, a high-level app and dashboarding framework for Python, very interesting and useful. Panel is a fully open-source Python library that enables users to create interactive, web-based data dashboards. It connects user widgets to plots, images, tables, and text. It works in Jupyter notebooks, where it can leverage the Jupyter server, and it can also be served from code with a regular web server. I give credit to Sophia Yang for the inspiration. Out of curiosity, I decided to try it out as a personal project to look at CO2 emissions in different countries/continents. I filter the data a little further using Pandas and separate it into two subsets: one for the continents and one for the top 10 manufacturing countries.
Ultimately we get a deployable interactive dashboard running on localhost, exactly as in the image shown below, if you decide to use my notebook. Of course, exploring the dataset might lead anyone to a different analysis. In the running dashboard, the interactivity is easy to see by adjusting the slider in the side panel or by toggling between the CO2 widgets.
We need Panel, Pandas, NumPy and hvPlot. hvPlot is a high-level plotting API for the PyData ecosystem, built on HoloViews, which we can use to quickly generate interactive plots from our data. First we import the CSV dataset and take a quick look. The first 5 rows show columns for different CO2 sources, population, GDP, energy per capita, etc. for different countries/continents.
import pandas as pd # DataFrame, Series (columnar/tabular data)
import numpy as np # for arrays and matrices i/o
import panel as pn
pn.extension('tabulator')
import hvplot.pandas # hvPlot is a high-level plotting API for the PyData ecosystem, built on HoloViews
df = pd.read_csv('https://www.obilorjim.com/wp-content/uploads/2022/05/covid-co2-data.csv')
print('Data downloaded and read into a dataframe!')
# Check the link in Introduction for an optional data download source
Data downloaded and read into a dataframe!
# cache data to improve dashboard performance
if 'data' not in pn.state.cache.keys():
    df = pd.read_csv('https://www.obilorjim.com/wp-content/uploads/2022/05/covid-co2-data.csv')
    pn.state.cache['data'] = df.copy()
else:
    df = pn.state.cache['data']
df.head() # view first 5 rows
| | iso_code | country | year | co2 | co2_per_capita | trade_co2 | cement_co2 | cement_co2_per_capita | coal_co2 | coal_co2_per_capita | ... | ghg_excluding_lucf_per_capita | methane | methane_per_capita | nitrous_oxide | nitrous_oxide_per_capita | population | gdp | primary_energy_consumption | energy_per_capita | energy_per_gdp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | AFG | Afghanistan | 1949 | 0.015 | 0.002 | NaN | NaN | NaN | 0.015 | 0.002 | ... | NaN | NaN | NaN | NaN | NaN | 7624058.0 | NaN | NaN | NaN | NaN |
1 | AFG | Afghanistan | 1950 | 0.084 | 0.011 | NaN | NaN | NaN | 0.021 | 0.003 | ... | NaN | NaN | NaN | NaN | NaN | 7752117.0 | 9.421400e+09 | NaN | NaN | NaN |
2 | AFG | Afghanistan | 1951 | 0.092 | 0.012 | NaN | NaN | NaN | 0.026 | 0.003 | ... | NaN | NaN | NaN | NaN | NaN | 7840151.0 | 9.692280e+09 | NaN | NaN | NaN |
3 | AFG | Afghanistan | 1952 | 0.092 | 0.012 | NaN | NaN | NaN | 0.032 | 0.004 | ... | NaN | NaN | NaN | NaN | NaN | 7935996.0 | 1.001733e+10 | NaN | NaN | NaN |
4 | AFG | Afghanistan | 1953 | 0.106 | 0.013 | NaN | NaN | NaN | 0.038 | 0.005 | ... | NaN | NaN | NaN | NaN | NaN | 8039684.0 | 1.063052e+10 | NaN | NaN | NaN |
5 rows × 60 columns
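The caching step above can be wrapped in a small helper. The following is only a sketch of the pattern: a plain dictionary stands in for `pn.state.cache`, and a made-up three-row CSV string (`SAMPLE_CSV`, an assumption for illustration) replaces the real download so the example runs without network access. The names `load_data` and `get_data` are hypothetical.

```python
import io

import pandas as pd

# Stand-in for pn.state.cache, which is used like a dictionary in the post.
cache = {}

# Hypothetical sample data used instead of the real CSV download.
SAMPLE_CSV = (
    "country,year,co2\n"
    "Afghanistan,1949,0.015\n"
    "Afghanistan,1950,0.084\n"
    "Afghanistan,1951,0.092\n"
)

load_count = 0  # counts how often the "expensive" load actually runs


def load_data():
    """Parse the CSV; in the dashboard this would be the pd.read_csv(url) call."""
    global load_count
    load_count += 1
    return pd.read_csv(io.StringIO(SAMPLE_CSV))


def get_data():
    """Return the dataframe, loading it only on the first call."""
    if "data" not in cache:
        cache["data"] = load_data()
    # Hand out a copy so later in-place edits do not alter the cached original.
    return cache["data"].copy()


first = get_data()
second = get_data()
print(load_count)  # the loader ran only once
```

Returning a copy on every call is a design choice of this sketch, not something the original code does; it trades a little memory for protection against accidental changes to the cached frame.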
This gives a better overview of the column names. Many kinds of analysis are possible, but we are interested in only a few of the columns.
df.columns
Index(['iso_code', 'country', 'year', 'co2', 'co2_per_capita', 'trade_co2', 'cement_co2', 'cement_co2_per_capita', 'coal_co2', 'coal_co2_per_capita', 'flaring_co2', 'flaring_co2_per_capita', 'gas_co2', 'gas_co2_per_capita', 'oil_co2', 'oil_co2_per_capita', 'other_industry_co2', 'other_co2_per_capita', 'co2_growth_prct', 'co2_growth_abs', 'co2_per_gdp', 'co2_per_unit_energy', 'consumption_co2', 'consumption_co2_per_capita', 'consumption_co2_per_gdp', 'cumulative_co2', 'cumulative_cement_co2', 'cumulative_coal_co2', 'cumulative_flaring_co2', 'cumulative_gas_co2', 'cumulative_oil_co2', 'cumulative_other_co2', 'trade_co2_share', 'share_global_co2', 'share_global_cement_co2', 'share_global_coal_co2', 'share_global_flaring_co2', 'share_global_gas_co2', 'share_global_oil_co2', 'share_global_other_co2', 'share_global_cumulative_co2', 'share_global_cumulative_cement_co2', 'share_global_cumulative_coal_co2', 'share_global_cumulative_flaring_co2', 'share_global_cumulative_gas_co2', 'share_global_cumulative_oil_co2', 'share_global_cumulative_other_co2', 'total_ghg', 'ghg_per_capita', 'total_ghg_excluding_lucf', 'ghg_excluding_lucf_per_capita', 'methane', 'methane_per_capita', 'nitrous_oxide', 'nitrous_oxide_per_capita', 'population', 'gdp', 'primary_energy_consumption', 'energy_per_capita', 'energy_per_gdp'], dtype='object')
df = df.fillna(0)  # Replace missing values with zeroes
df = df.rename(columns={'country': 'Place'})
df = df.drop(['iso_code'], axis=1)
# Assign the result back: replace(..., inplace=True) on a selected column may not update df in recent pandas versions
df["Place"] = df["Place"].replace({"United States": "US", "United Kingdom": "UK"})
df['gdp_per_capita'] = np.where(df['population'] != 0, df['gdp'] / df['population'], 0)
df.head() # View 5 rows
| | Place | year | co2 | co2_per_capita | trade_co2 | cement_co2 | cement_co2_per_capita | coal_co2 | coal_co2_per_capita | flaring_co2 | ... | methane | methane_per_capita | nitrous_oxide | nitrous_oxide_per_capita | population | gdp | primary_energy_consumption | energy_per_capita | energy_per_gdp | gdp_per_capita |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Afghanistan | 1949 | 0.015 | 0.002 | 0.0 | 0.0 | 0.0 | 0.015 | 0.002 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 7624058.0 | 0.000000e+00 | 0.0 | 0.0 | 0.0 | 0.000000 |
1 | Afghanistan | 1950 | 0.084 | 0.011 | 0.0 | 0.0 | 0.0 | 0.021 | 0.003 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 7752117.0 | 9.421400e+09 | 0.0 | 0.0 | 0.0 | 1215.332543 |
2 | Afghanistan | 1951 | 0.092 | 0.012 | 0.0 | 0.0 | 0.0 | 0.026 | 0.003 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 7840151.0 | 9.692280e+09 | 0.0 | 0.0 | 0.0 | 1236.236369 |
3 | Afghanistan | 1952 | 0.092 | 0.012 | 0.0 | 0.0 | 0.0 | 0.032 | 0.004 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 7935996.0 | 1.001733e+10 | 0.0 | 0.0 | 0.0 | 1262.264378 |
4 | Afghanistan | 1953 | 0.106 | 0.013 | 0.0 | 0.0 | 0.0 | 0.038 | 0.005 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 0.0 | 8039684.0 | 1.063052e+10 | 0.0 | 0.0 | 0.0 | 1322.255925 |
5 rows × 60 columns
#df.to_csv('co2_data2.csv', sep=',', header=True, index=True)
df.shape # may want to check the total number of columns and rows
(25989, 60)
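One side effect of filling missing values with zeroes is that a `gdp_per_capita` of 0 can mean "unknown" rather than "zero". The sketch below compares the approach used above with an alternative that keeps gaps as NaN. The three-row dataframe `raw` is invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: one complete, one with missing GDP, one with missing population.
raw = pd.DataFrame({
    "Place": ["A", "B", "C"],
    "gdp": [1000.0, np.nan, 500.0],
    "population": [10.0, 20.0, np.nan],
})

# Approach used in the post: fill gaps with 0, then guard the division.
filled = raw.fillna(0)
filled["gdp_per_capita"] = np.where(
    filled["population"] != 0, filled["gdp"] / filled["population"], 0
)

# Alternative: leave gaps as NaN so "unknown" is not confused with "zero".
kept = raw.copy()
safe_population = kept["population"].where(kept["population"] != 0)
kept["gdp_per_capita"] = kept["gdp"] / safe_population

print(filled["gdp_per_capita"].tolist())
print(kept["gdp_per_capita"].tolist())
```

Which variant is preferable depends on the chart: zeroes keep lines continuous, while NaN values leave visible gaps where no data exists.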
We can create widgets for our visualisations, such as sliders, radio buttons or drop-down menus, which users can use to configure the display. First we create a Panel widget for a 'year slider' using the minimum and maximum values for the years. For other widgets, please take a look at this section of the Panel User Guide.
Make Year slider:
year_slider = pn.widgets.IntSlider(name='Year slider', start=1750, end=2020, step=5, value=1960)
year_slider
co2_values = pn.widgets.RadioButtonGroup(
    name='Y axis', options=['co2', 'co2_per_capita'], button_type='success')
co2_values
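The slider above hard-codes 1750 and 2020. As a sketch, the bounds could instead be read from the data. The helper `slider_bounds` and the sample dataframe are hypothetical; only the `start`, `end`, `step` and `value` keyword names come from the `IntSlider` call above.

```python
import pandas as pd


def slider_bounds(df, column="year", step=5, default=1960):
    """Build keyword arguments for an integer slider from the data itself."""
    start = int(df[column].min())
    end = int(df[column].max())
    # Clamp the default so it always falls inside the available range.
    value = min(max(default, start), end)
    return {"start": start, "end": end, "step": step, "value": value}


# Hypothetical sample with the same year range as the slider in the post.
sample = pd.DataFrame({"year": [1750, 1900, 1960, 2020]})
bounds = slider_bounds(sample)
print(bounds)

# With Panel imported as pn, the result could be passed on as:
# year_slider = pn.widgets.IntSlider(name='Year slider', **bounds)
```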
We make the dataframe interactive and connect the data pipeline to the widgets so that, as the widgets change, the underlying data for our visualization is updated as well. We then keep only the continent rows up to the selected year and group them by place and year.
interactive_df = df.interactive() # make dataframe interactive
Regions = ['Africa', 'Asia', 'Europe', 'North America', 'South America', 'Oceania']
co2_pipeline = (
    interactive_df[
        (interactive_df.year <= year_slider) &
        (interactive_df.Place.isin(Regions))
    ]
    .groupby(['Place', 'year'])[co2_values].mean()
    .to_frame()
    .reset_index()
    .sort_values(by='year')
    .reset_index(drop=True)
)
co2_pipeline.head() # View the 1st 5 rows
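To see what the interactive pipeline computes for one particular widget state, it can help to write the same steps as a plain pandas function. This is a sketch only: `co2_by_region` is a hypothetical helper, and the five-row sample is made up.

```python
import pandas as pd

REGIONS = ['Africa', 'Asia', 'Europe', 'North America', 'South America', 'Oceania']


def co2_by_region(df, max_year, column):
    """Static version of the pipeline for one slider value and one button choice."""
    mask = (df["year"] <= max_year) & (df["Place"].isin(REGIONS))
    return (
        df[mask]
        .groupby(["Place", "year"])[column].mean()
        .to_frame()
        .reset_index()
        .sort_values(by="year")
        .reset_index(drop=True)
    )


# Hypothetical sample: "World" is not a continent and 1970 is past the cut-off.
sample = pd.DataFrame({
    "Place": ["Asia", "Asia", "Europe", "World", "Europe"],
    "year": [1950, 1960, 1950, 1950, 1970],
    "co2": [1.0, 2.0, 3.0, 9.0, 4.0],
})
result = co2_by_region(sample, max_year=1960, column="co2")
print(result)
```

Here `max_year` plays the role of the year slider and `column` the role of the radio button group; the interactive version re-runs these steps whenever either widget changes.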
We create a chart using the above pipeline: a display of CO2 as a function of the year. As the slider is moved, both the CO2 and CO2 per capita values are adjusted, making our chart quite interactive.
co2_plot = co2_pipeline.hvplot(x='year', by='Place', y=co2_values, line_width=1.6, title="CO2 Emission by Continent")
co2_plot
It is also a good idea to show the data as a table, using the Tabulator extension that we loaded earlier and the same widgets.
co2_table = co2_pipeline.pipe(pn.widgets.Tabulator, pagination='remote', page_size = 10, sizing_mode='stretch_width')
co2_table
The goal is to create an interactive bar chart showing the CO2 sources of the top 10 manufacturing countries (call them superpowers): China, United States, Germany, Japan, India, South Korea, United Kingdom, Italy, France and Indonesia. We do this by creating a new radio button group for three different sources of CO2 emissions: coal, oil and gas. As above, we create a data pipeline, connect it to the widgets, and display the result with hvPlot.
Top10_Countries = pn.widgets.RadioButtonGroup(
    name='Y axis',
    options=['coal_co2', 'oil_co2', 'gas_co2'],
    button_type='success'
)
continents_excl_world = ['China', 'US', 'Germany', 'Japan', 'India',
                         'South Korea', 'UK', 'Italy', 'France', 'Indonesia']
co2_source_bar_pipeline = (
    interactive_df[
        (interactive_df.year == year_slider) &
        (interactive_df.Place.isin(continents_excl_world))
    ]
    .groupby(['year', 'Place'])[Top10_Countries].sum()
    .to_frame()
    .reset_index()
    .sort_values(by='year')
    .reset_index(drop=True)
)
co2_source_bar_plot = co2_source_bar_pipeline.hvplot(kind='bar',
                                                     x='Place',
                                                     y=Top10_Countries,
                                                     title='CO2 Emissions: Top 10 Manufacturing Countries',
                                                     height=400, width=850)
co2_source_bar_plot
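Because "United States" and "United Kingdom" were renamed to "US" and "UK" earlier, the country list has to use the short names, otherwise those bars silently disappear from the chart. A small check along the following lines can catch such mismatches. `missing_places` is a hypothetical helper and the sample data is invented.

```python
import pandas as pd


def missing_places(df, names, column="Place"):
    """Return the requested names that never occur in the given column."""
    present = set(df[column].unique())
    return [name for name in names if name not in present]


# Hypothetical sample in which the long country names were already shortened.
sample = pd.DataFrame({"Place": ["China", "US", "UK", "Germany"]})
wanted = ["China", "United States", "UK"]
print(missing_places(sample, wanted))
```

An empty list means that every requested country is present in the data.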
# Layout using Template
template = pn.template.FastListTemplate(
    title='World CO2 emission dashboard',
    sidebar=[pn.pane.Markdown("## Climate Change and CO2 Emissions"),
             pn.pane.Markdown("Many believe that carbon dioxide emissions are the primary cause of global climate change. The recommendation to curb this is for the world to strategically reduce emissions. How this goal can be achieved is a subject of discussions in different countries and among different experts."),
             pn.pane.PNG('climate_day.png', sizing_mode='scale_both'),
             pn.pane.Markdown("## Settings"),
             year_slider],
    main=[pn.Row(pn.Column(co2_values,
                           co2_plot.panel(width=700), margin=(0,25)),
                 co2_table.panel(width=500)),
          pn.Row(pn.Column(Top10_Countries, co2_source_bar_plot.panel(width=800)))],
    accent_base_color="#88d8b0",
    header_background="#88d8b0",
)
template.show()
template.servable();
Launching server at http://localhost:59873
Overall, by running the code above, we obtain an interactive dashboard with Panel and hvPlot on top of the NumPy and pandas libraries. If the line template.show() is commented out, the dashboard is not launched automatically. In that case we can simply run the command panel serve Notebook_name.ipynb in the terminal (on a Mac, for example) to play around with our interactive dashboard.
Now that our dashboard is up and running, the next question is: how can it be consumed, or how can the app be deployed? There are several options. Panel apps are supported by Jupyter, Bokeh, and Voilà servers, so you can configure your application to run on any of these. A good place to look for Panel app deployment is the 'Deploy and export' section of the Panel User Guide.
There are also Python-server deployment procedures for platforms such as AWS, Google Cloud, Heroku, and Anaconda Enterprise. In many cases, Panel objects can also be exported to static, standalone HTML. But the goal of this project is to show that we can use Panel to create an interactive dashboard using Python libraries. Really cool!