Intelligent Tornado Prediction Engine   

We are developing deep learning models to detect and predict tornadoes in real time.
A photo shows radar returns of a thunderstorm, with black lines marking tornado paths.
The Intelligent Tornado Prediction Engine utilizes a massive open-source dataset and deep learning models to detect and predict tornadoes. A case example shows probabilistic detection of a tornado in Alabama on April 28, 2014.

Tornadoes in the Southeastern United States pose a unique threat to life and property. This region experiences a combination of storm types (e.g., supercells, quasi-linear convective systems [QLCSs], and tropical cyclones) that can form by day or night, and it has a high density of manufactured homes and other vulnerable residential structures. At the same time, tornadoes continue to pose a significant threat throughout the United States, and an ever-changing climate is altering their location, intensity, and frequency. Tornado-warning lead times have stagnated, especially amid attempts to reduce high false-alarm rates. The problem is most acute for QLCS and tropical cyclone tornadoes, for which warning lead times have historically been significantly shorter than for supercellular tornadoes. New paradigms must therefore be explored for improving lead times using the wealth of data now available to National Weather Service forecasters from rapid-update satellite observations, radar, and numerical weather prediction models.

One of the most rapidly advancing approaches to many problems in the atmospheric sciences is deep learning, a form of artificial intelligence that is popular in image processing for extracting high-level features from extremely large datasets. Deep learning can combine many very large datasets to “learn” trends and features from historical data, making it an ideal candidate for finding combined precursors across several different data sources. When a large number of tornadic cases are combined with a large number of null cases (from similar-looking storms), deep learning has the potential to distinguish tornadic from non-tornadic precursors. Newly discovered combinations of precursors could be used to train forecasters, identify new physical mechanisms for tornadogenesis, and even feed future probabilistic prediction techniques.

We have curated and released an open-source benchmark dataset of radar data for every tornado and false alarm, plus many similar-looking severe thunderstorms, across the United States from 2013 to 2022. Benchmark datasets are popular in various artificial intelligence fields and are becoming increasingly available in the atmospheric sciences. This dataset (accessible at https://github.com/mit-ll/tornet) allows anyone in the world to start with over 200,000 images already collected and organized, and it serves as a baseline against which research teams can compare their results. We have also developed and open-sourced baseline models for tornado detection with this dataset, along with code, notebooks, and examples for anyone to use. Our models outperform other tornado detection techniques currently in the literature, and we hope that the research community will continue to improve on our results.
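
As one illustration of how a team might compare its own detector against a published baseline, the sketch below thresholds probabilistic outputs and computes three standard forecast-verification scores: probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI). The choice of metrics, the 0.5 threshold, the function names, and the toy data are all assumptions made for illustration; none of them is taken from the dataset's own code or evaluation protocol.

```python
from dataclasses import dataclass


@dataclass
class ContingencyTable:
    """Outcome counts for a yes/no detector (hypothetical helper type)."""
    hits: int
    misses: int
    false_alarms: int
    correct_nulls: int


def contingency_table(probabilities, labels, threshold=0.5):
    """Threshold probabilistic detections and count each outcome type."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have equal length")
    hits = misses = false_alarms = correct_nulls = 0
    for prob, label in zip(probabilities, labels):
        predicted = prob >= threshold
        if predicted and label:
            hits += 1
        elif predicted and not label:
            false_alarms += 1
        elif label:
            misses += 1
        else:
            correct_nulls += 1
    return ContingencyTable(hits, misses, false_alarms, correct_nulls)


def verification_scores(table):
    """Return POD, FAR, and CSI; a score is NaN when its denominator is zero."""
    nan = float("nan")
    observed = table.hits + table.misses        # all tornadic samples
    predicted = table.hits + table.false_alarms  # all positive predictions
    events = table.hits + table.misses + table.false_alarms
    return {
        "pod": table.hits / observed if observed else nan,
        "far": table.false_alarms / predicted if predicted else nan,
        "csi": table.hits / events if events else nan,
    }


# Toy data (invented for this example): model probabilities and true labels,
# where 1 marks a tornadic sample and 0 a null sample.
toy_probabilities = [0.9, 0.2, 0.7, 0.4, 0.1, 0.8]
toy_labels = [1, 0, 0, 1, 0, 1]

toy_table = contingency_table(toy_probabilities, toy_labels, threshold=0.5)
toy_scores = verification_scores(toy_table)
print(toy_table)
print(toy_scores)
```

On the toy data, two of the three tornadic samples are detected and one of the three positive predictions is a false alarm. Lowering the threshold generally raises POD at the cost of more false alarms, so any comparison between models should state the threshold used.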

Ongoing work includes expanding our dataset backward in time in order to design baseline tornado prediction models, as well as adding more data modalities. These modalities include satellite, lightning, and numerical weather prediction data to aid in the design of tornado detection and prediction models. Once completed, additional baselines and datasets will be released to the public. We also anticipate future work to run these models in real time in a web-based interface.