com-480-data-visualization/Click-to-add-name

Project of Data Visualization (COM-480)

Student's name SCIPER
Lin Xiaoya 423134
Wu Yiqian 423147
Liu Tingsen 422014

Milestone 1 | Milestone 2 | Milestone 3

Milestone 1 (21st March, 5pm)

10% of the final grade

This is a preliminary milestone to let you set up goals for your final project and assess the feasibility of your ideas. Please, fill the following sections about your project.

(max. 2000 characters per section)

Dataset

For our project, we combined three publicly available international datasets from the World Health Organization (WHO) and the World Bank.

These sources are reliable and authoritative. However, because the datasets come from different organizations, they need cleaning and integration before use (for example, joining them on country and year to obtain a single dataset in which the indicators can be compared directly).

Datasets:

  1. Life Expectancy at Birth (WHO): https://www.who.int/data/gho/data/indicators/indicator-details/GHO/life-expectancy-at-birth-(years)
  2. GDP per Capita (World Bank): https://data.worldbank.org/indicator/NY.GDP.PCAP.CD
  3. NCD Mortality Rate (World Bank): https://data.worldbank.org/indicator/SH.DYN.NCOM.ZS

Problematic

Life expectancy is one of the most widely used indicators of a country’s overall well-being. It reflects not only healthcare quality but also economic conditions, education, public policy, and social inequality. At the same time, economic development, often measured through GDP per capita, is commonly assumed to improve living standards and health outcomes. However, the strength and nature of this relationship are not always straightforward.

Our project aims to explore the following central questions:

  • How strongly is GDP per capita associated with life expectancy across countries?
  • What are the differences in life expectancy between men and women globally?
  • How is GDP per capita related to mortality from non-communicable diseases (NCDs), and does higher income necessarily imply lower NCD mortality?

By visualising these relationships, we aim to better understand the interplay between economic development and public health.

This project is relevant to students of economics, public health, and global development, as well as anyone interested in understanding global inequality. By presenting interactive visualisations and statistical summaries, we provide a clear and accessible overview of how wealth, gender, and disease burden relate to longevity.

Exploratory Data Analysis

All three datasets were loaded into pandas DataFrames. Since they originate from different sources, preprocessing was necessary before merging:

  • Year variables were converted to consistent integer formats.
  • Country names were standardised to ensure correct joins.
  • Rows missing essential values (GDP per capita, life expectancy, or NCD mortality rate) were removed.
  • GDP per capita was log-transformed to better capture non-linear relationships and reduce skewness.
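The preprocessing steps above can be sketched as follows. This is a minimal illustration rather than the code in EDA.ipynb: the column names, the `clean` helper, and the `COUNTRY_ALIASES` mapping are assumptions, the numeric values are made-up placeholders, and the natural logarithm is assumed because the base of the log transform is not stated.

```python
import numpy as np
import pandas as pd

# Hypothetical mapping from source-specific country labels to one common label.
COUNTRY_ALIASES = {
    "Bahamas, The": "Bahamas",
    "Korea, Rep.": "Republic of Korea",
}

def clean(df: pd.DataFrame, value_col: str) -> pd.DataFrame:
    """Return a copy with integer years, standardised country names,
    and no rows missing the country, year, or value column."""
    out = df.copy()
    # Coerce years such as "2019" or 2019.0 to numbers; unparseable entries become NaN.
    out["year"] = pd.to_numeric(out["year"], errors="coerce")
    # Trim stray whitespace, then map known aliases to the common label.
    out["country"] = out["country"].str.strip().replace(COUNTRY_ALIASES)
    out = out.dropna(subset=["country", "year", value_col])
    out["year"] = out["year"].astype(int)
    return out.reset_index(drop=True)

# Placeholder rows (not real figures) showing the kinds of issues handled above.
raw_gdp = pd.DataFrame({
    "country": ["Bahamas, The", " Switzerland", "Switzerland", "Chad"],
    "year": ["2019", 2019.0, "2020", "2019"],
    "gdp_per_capita": [1000.0, 2000.0, None, 500.0],
})

gdp = clean(raw_gdp, "gdp_per_capita")
# Log-transform GDP per capita (natural log assumed) to reduce skewness.
gdp["log_gdp_per_capita"] = np.log(gdp["gdp_per_capita"])
```

Dropping missing rows before the integer cast matters here, because a column containing NaN cannot be converted to a plain integer dtype.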

The datasets were then merged using inner joins on country and year, ensuring that only observations present in all three datasets were retained. The resulting dataset spans from 2000 to 2021, with 12,060 total records.
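As a rough sketch of the merge described above, two chained inner joins on country and year keep only the observations present in all three tables. The frame names, column names, and values below are illustrative placeholders, not the project's actual data.

```python
import pandas as pd

# Placeholder tables with made-up values; only the join keys matter here.
life = pd.DataFrame({
    "country": ["Country A", "Country A", "Country B"],
    "year": [2019, 2020, 2019],
    "life_expectancy": [70.0, 71.0, 60.0],
})
gdp = pd.DataFrame({
    "country": ["Country A", "Country B", "Country B"],
    "year": [2019, 2019, 2020],
    "gdp_per_capita": [1000.0, 500.0, 550.0],
})
ncd = pd.DataFrame({
    "country": ["Country A", "Country B"],
    "year": [2019, 2019],
    "ncd_mortality": [20.0, 25.0],
})

keys = ["country", "year"]
# An inner join drops any (country, year) pair missing from either side,
# so chaining two of them keeps only pairs present in all three tables.
merged = (
    life.merge(gdp, on=keys, how="inner")
        .merge(ncd, on=keys, how="inner")
)
```

In this toy input only the 2019 rows survive, because the 2020 observations are missing from at least one table.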

All of this work can be found in the Jupyter notebook EDA.ipynb.

Key findings:

  1. On average across the dataset, women live 4.84 years longer than men.
  2. Life expectancy is strongly correlated with the logarithm of GDP per capita.
  3. Higher GDP tends to be associated with lower NCD mortality. However, substantial variance remains even among high-income countries.
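Summary statistics of the kind listed above could be computed along these lines. The column names and values are placeholders chosen for illustration, so the numbers this sketch produces are not the project's results; it only shows how a mean gender gap and a raw-versus-log correlation might be obtained with pandas.

```python
import numpy as np
import pandas as pd

# Made-up placeholder values for four hypothetical countries.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "life_exp_female": [80.0, 75.0, 70.0, 62.0],
    "life_exp_male": [75.0, 71.0, 64.0, 59.0],
    "life_exp_both": [77.5, 73.0, 67.0, 60.5],
    "gdp_per_capita": [50000.0, 12000.0, 4000.0, 800.0],
})

# Finding 1: average difference between female and male life expectancy.
gender_gap = (df["life_exp_female"] - df["life_exp_male"]).mean()

# Finding 2: Pearson correlation of life expectancy with raw and log GDP per capita.
r_raw = df["gdp_per_capita"].corr(df["life_exp_both"])
r_log = np.log(df["gdp_per_capita"]).corr(df["life_exp_both"])

print(f"mean gender gap: {gender_gap:.2f} years")
print(f"Pearson r (raw GDP): {r_raw:.3f}, Pearson r (log GDP): {r_log:.3f}")
```

A higher correlation with the log-transformed variable than with the raw one is what a "logarithmic" relationship would look like in practice.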

Related work

While giants like Gapminder and the IHME’s GBD Compare offer comprehensive data on health and wealth, they function more like digital encyclopedias than narrative tools. The connection between wealth and preventable death is a story hidden in plain sight. But for most people, uncovering that story requires a tedious trek across platforms that treat human lives like static rows of data. The data is "there," but it isn't always "alive."

Our project builds on the high-quality data published by the WHO and World Bank Open Data. We’ve stripped away the academic density of these archives to investigate a singular mystery: The Wealth Paradox. Why do some nations with high GDPs see their citizens die years earlier than those in countries with far fewer resources? By focusing on the 'exceptions', nations such as The Bahamas, we look past the spreadsheets to uncover the cultural habits, dietary shifts, and hidden inequalities that determine who actually gets to grow old.

Visually, we were inspired by the clean, interactive aesthetics of The Pudding. By bringing GDP, NCD mortality, and gendered longevity into one animated interface, we transform complex public health statistics into an interactive journey.

(Note: The datasets utilized in this project have not been explored by our team in any previous ML, ADA, or semester projects).


Milestone 2 (18th April, 5pm)

10% of the final grade

The Longevity Equation

Project Report

Our comprehensive Milestone 2 report contains our detailed project goals, visualization sketches, technical tool mapping to the COM-480 syllabus, and our implementation roadmap.

Functional Prototype

The initial website skeleton and functional prototype are now live. This version demonstrates our paginated narrative structure and the layout for our upcoming D3.js visualizations.

Current Progress and Technical Implementation

For this milestone, we have focused on building a robust foundation for our data story:

  • Web Skeleton: We developed a navigation system using HTML, CSS, and JavaScript. The site supports vertical transitions between major topics and horizontal navigation for detailed rankings.
  • Narrative Flow: The investigative journey is fully drafted, moving from global demographic trends (The Gender Divide) to specific case studies (The Wealth Paradox).
  • Visualization Containers: We have implemented responsive SVG containers for our D3.js widgets.
  • D3.js Preparation: Our unified dataset from the WHO and World Bank has been pre-processed and is ready for the implementation of the Butterfly Chart, Racing Bar Chart, and the normalized Radar Chart.

Core MVP Goals

  • Deliver a fully navigable website with structured data storytelling.
  • Provide a functional interactive World Map with a manual year timeline slider.
  • Provide a normalized Radar Chart for individual country health profiles.
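One possible way to prepare values for a normalized radar chart is min-max scaling, which puts every indicator on a common 0 to 1 axis. The README does not say which normalization is used, so the method, the `min_max_normalise` helper, the indicator names, and the values below are all assumptions for illustration.

```python
def min_max_normalise(values):
    """Scale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: place them at the midpoint to avoid dividing by zero.
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Placeholder indicator values for three hypothetical countries.
indicators = {
    "life_expectancy": [60.0, 70.0, 80.0],
    "log_gdp_per_capita": [6.0, 9.0, 10.0],
    "ncd_mortality": [25.0, 20.0, 10.0],
}

# Each indicator is scaled independently so that all radar axes share one range.
normalised = {name: min_max_normalise(vals) for name, vals in indicators.items()}
```

For an indicator where lower is better, such as NCD mortality, the scaled value could be flipped with `1 - x` so that a larger radar area consistently means a better profile; whether to do so is a design choice.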

Creative Extras

  • Audio Sonification: Heartbeat sound effects that scale with data trends.
  • Personalized Marker: User-driven data input for statistical comparison.

Milestone 3 (30th May, 5pm)

80% of the final grade

Late policy

  • < 24h: 80% of the grade for the milestone
  • < 48h: 70% of the grade for the milestone

About

Initial group project repository for the data visualization course at EPFL
