U.S. NUCLEAR ENERGY

Image source: stock.adobe.com

How reliable is nuclear energy?

In short: Extremely. Measured by capacity factor, nuclear energy is by far the most reliable energy source in the United States. This visualization compares the reliability of nuclear power plants in the United States. Each dot represents a nuclear power plant or group of plants, and its size represents the estimated electricity generated.


Research

Although this step of problem solving doesn't have the appeal of a pretty visualization, a great deal of time was spent in the research phase. 

I found myself returning again and again to the electricity and nuclear power generation data that the United States government publishes as public datasets. Checking and double-checking my results against those sites revealed small inconsistencies between the two major datasets used in this visualization.

The electricity generation dataset includes reactor-generated electricity from all 95 operating nuclear reactors up to 2021, while the reactor-specific dataset only contains information on 91 of the regulated reactors and terminates in 2020.

The strange thing is that there is NRC data for 2021 and 2022, but it has not been used to update the annual summarized operating performance dataset.

This project will need to be updated once the NRC dataset is updated. Alternatively, I will return to this project to create my own summarized dataset following many of the same steps and procedures below.

Data Sources & Notebooks


Capacity Factor by Energy Source Notebook

Nuclear Reactor Annual Performance Notebook

Capacity Factor by Energy Source

Load & Tidy the Datasets

After reading in the two datasets for fossil fuel and non-fossil fuel sources, the goal is to extract only the capacity factor for each fuel type and create a new combined, cleaned dataset for export.
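A minimal sketch of the tidy step, assuming the raw table has one row per period and that the capacity factor columns can be recognized by their header text. The frame `raw_fossil`, its column names, and every number in it are made up for illustration and are not taken from the actual dataset.

```python
import pandas as pd

# Hypothetical raw table: one row per period, several measures per fuel type.
# All values are placeholders, not real figures.
raw_fossil = pd.DataFrame({
    "Period": [2019, 2020],
    "Coal Capacity Factor": [40.0, 35.0],
    "Coal Net Generation": [1000.0, 800.0],
    "Natural Gas Capacity Factor": [55.0, 56.0],
    "Natural Gas Net Generation": [1500.0, 1600.0],
})

# Keep the period plus only the columns whose header mentions the capacity factor.
cf_columns = [c for c in raw_fossil.columns if "Capacity Factor" in c]
clean_fossil = raw_fossil[["Period"] + cf_columns].copy()

# Shorten the headers so each column is named after its fuel type.
clean_fossil.columns = [c.replace(" Capacity Factor", "") for c in clean_fossil.columns]
```

The same pattern would apply to the non-fossil table, provided its headers follow a similar naming scheme.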

Merge the Datasets

The two raw datasets below were cleaned and then combined into the final dataset used in the data model.

Raw Fossil Fuels Dataset

Clean Fossil Fuels Dataset

Raw Non-Fossil Fuels Dataset

Clean Non-Fossil Fuels Dataset

Combined Dataset of Fossil Fuels & Non-Fossil Fuels Capacity Factors
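The combination of the two cleaned tables can be sketched with `DataFrame.merge`. This assumes both tables share a `Period` key column with one row per period. The key name, the fuel columns, and the values are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical cleaned tables; all values are placeholders.
clean_fossil = pd.DataFrame({
    "Period": [2019, 2020],
    "Coal": [40.0, 35.0],
    "Natural Gas": [55.0, 56.0],
})
clean_non_fossil = pd.DataFrame({
    "Period": [2019, 2020],
    "Nuclear": [93.0, 92.0],
    "Wind": [34.0, 35.0],
})

# Join on the shared period column. validate="one_to_one" raises an error
# if either table has duplicate periods, which would silently inflate rows.
combined = clean_fossil.merge(
    clean_non_fossil, on="Period", how="inner", validate="one_to_one"
)
```

An inner join keeps only periods present in both tables, so a mismatch in coverage between the two sources shows up as missing rows rather than as empty cells.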

Melt the Dataset

After cleaning, the last step was to prepare the data for the visual and export.

The dataset was unpivoted with the pandas melt method to allow for temporal visualization.
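As a small illustration of the unpivot, here is a sketch using `DataFrame.melt` on a wide table with one column per energy source. The column names `Period`, `Energy Source`, and `Capacity Factor` are assumptions, and the numbers are placeholders.

```python
import pandas as pd

# Hypothetical wide table: one column per energy source (placeholder values).
combined = pd.DataFrame({
    "Period": [2019, 2020],
    "Coal": [40.0, 35.0],
    "Nuclear": [93.0, 92.0],
})

# Unpivot: every (period, source) pair becomes its own row.
long_df = combined.melt(
    id_vars="Period",
    var_name="Energy Source",
    value_name="Capacity Factor",
)
```

The long format gives one row per period and source, which is the shape most charting tools expect for a time axis with a category per series.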

Nuclear Reactor Annual Performance

Data Preparation

After loading the dataset, I can see that it has 91 reactors (rows) and 44 columns.

Immediately, I notice that 'Unit Number' has a leading space. One is enough to justify stripping whitespace from all the columns, for consistency's sake.
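A minimal sketch of the whitespace cleanup, assuming the stray space sits in the column header itself. The one-row frame is a hypothetical stand-in for the reactor dataset.

```python
import pandas as pd

# Hypothetical frame with a leading space in one header.
reactors = pd.DataFrame({"Plant Name": ["Example Plant"], " Unit Number": [1]})

# Strip leading and trailing whitespace from every column name at once.
reactors.columns = reactors.columns.str.strip()
```

If the stray spaces were in the cell values instead, the same `.str.strip()` call could be applied to each text column.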

Slice & Melt the Dataset

I can also see that the years columns have valuable information that will need to be unpivoted, or melted.

First I need to take a slice of only the annual capacity factor data. Next, I need to melt the sliced data.

I take only years from the unpivoted dataset.

Rename the column.

Sort by years, and reindex the dataframe.

Now I merge the years back onto the dataset as its own column.

Lastly, I merge the entire melted data with the original reactor dataset.
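The steps above can be sketched roughly as follows. This is a condensed version under stated assumptions, not the notebook's exact code: the header pattern `"<year> Capacity Factor"`, the key column `Plant Name`, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical reactor table: descriptive columns plus one column per year.
reactors = pd.DataFrame({
    "Plant Name": ["Alpha 1", "Beta 2"],
    "Capacity MWe": [1000, 900],
    "2019 Capacity Factor": [95.0, 88.0],
    "2020 Capacity Factor": [91.0, 97.0],
})

# Slice: keep the identifier plus the annual capacity factor columns only.
year_cols = [c for c in reactors.columns if c.endswith("Capacity Factor")]
sliced = reactors[["Plant Name"] + year_cols]

# Melt: one row per reactor and year.
melted = sliced.melt(
    id_vars="Plant Name", var_name="Year", value_name="Capacity Factor"
)

# Keep only the four-digit year from the old header text.
melted["Year"] = melted["Year"].str.extract(r"(\d{4})", expand=False).astype(int)

# Sort by year and rebuild a clean 0..n-1 index.
melted = melted.sort_values(["Year", "Plant Name"]).reset_index(drop=True)

# Merge the melted values back onto the descriptive reactor columns.
descriptive = reactors.drop(columns=year_cols)
tidy = melted.merge(descriptive, on="Plant Name", how="left")
```

Extracting the year in place avoids building a separate year table and merging it back, at the cost of assuming every header contains exactly one four-digit year.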

Tidy the Dataset

Split 'Plant Name' to extract the plant names.

Split 'Location' to extract state abbreviations.

Illinois is not 'Il' but 'IL', and 'PA' has a county name in front of it. These were corrected.
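One possible way to do these splits and corrections with pandas string methods is sketched below. The value formats (a trailing unit number in 'Plant Name', a comma before the state in 'Location') are assumptions, the new column names `Plant` and `State` are hypothetical, and the place names are invented.

```python
import pandas as pd

# Hypothetical rows that mimic the two problems described above.
reactors = pd.DataFrame({
    "Plant Name": ["Alpha Station 1", "Beta 2", "Gamma 1"],
    "Location": [
        "Sometown, Il",
        "Othertown, Example County PA",
        "Thirdtown, NY",
    ],
})

# Plant: drop the trailing unit number by splitting once from the right.
reactors["Plant"] = reactors["Plant Name"].str.rsplit(" ", n=1).str[0]

# State: take the text after the last comma, keep its last word
# (this drops a county name in front), and upper-case it ('Il' -> 'IL').
after_comma = reactors["Location"].str.rsplit(",", n=1).str[-1].str.strip()
reactors["State"] = after_comma.str.split().str[-1].str.upper()
```

Taking the last word and upper-casing handles both corrections in one rule, though a manual fix of the two affected rows would work just as well on a dataset this small.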

Feature Engineering

Calculated a new column, estimated generation, on the assumption that the average capacity factor can be multiplied by the MWe figure given by the manufacturer. While this measure is not exact, it allows for a relative comparison of electrical power generation.
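A minimal sketch of that calculation, assuming the capacity factor is stored as a percentage (0 to 100) and the rated capacity is in a column named `Capacity MWe`. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical tidy rows (placeholder values).
tidy = pd.DataFrame({
    "Plant": ["Alpha", "Beta"],
    "Capacity MWe": [1000, 800],
    "Capacity Factor": [90.0, 50.0],  # assumed to be a percentage
})

# Scale the rated capacity by the fraction of time it was effectively used.
tidy["Estimated Generation"] = tidy["Capacity MWe"] * tidy["Capacity Factor"] / 100
```

The result is an average output in MWe rather than an energy total, which is enough for comparing reactors against each other.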

Exploratory Data Analysis 

NRC Reactor Region 

Capacity MWe by Estimated Generation

Estimated Generation by Capacity Factor 

Estimated Generation by Year

Reactor Containment Type

Capacity MWe by Estimated Generation

Estimated Generation by Capacity Factor 

Estimated Generation by Year