Data Scientist Explores Effective COVID-19 Data-Driven Model

For the Hewlett Packard Enterprise Data Science Institute Seminar Series, Marco Sampaio, Ph.D., delivered a presentation on models for analyzing the spread of disease.


COVID-19 Model

In connection with the HPE Data Science Institute, Marco Sampaio, Ph.D., presented “COVID-19: A Surprisingly Effective Data-Driven Model.”

Marco Sampaio, a senior research data scientist at Feedzai, discussed several epidemiological models and data visualization tools used to analyze disease spread, with a focus on COVID-19. These models help estimate the parameters that characterize a population and a disease.

Predictions from epidemiological models can assist public health authorities with decision-making during early disease outbreaks and help them plan long-term vaccination strategies. Sampaio discussed modeling approaches and presented an analysis, based on public datasets, of how daily mortality rates evolved.

“There are basically three groups of modeling approaches and they are not mutually exclusive,” Sampaio said.

Of the different modeling approaches, the three best known are the phenomenological, metapopulation and network approaches. Phenomenological approaches can be useful for analyzing the early growth of an outbreak.
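To make the idea of a phenomenological approach concrete, here is a minimal Python sketch that fits a simple exponential growth curve to early daily counts. It is an illustration only, not the model Sampaio presented: the choice of an exponential curve, the function names fit_exponential_growth and doubling_time, and the synthetic data are all assumptions made for this example.

```python
import math


def fit_exponential_growth(counts):
    """Fit counts[t] ~ c0 * exp(r * t) by least squares on log(counts).

    Hypothetical helper for illustration. Returns (r, c0), where r is the
    estimated daily growth rate and c0 the estimated count on day 0.
    """
    if len(counts) < 2:
        raise ValueError("need at least two observations")
    if any(c <= 0 for c in counts):
        raise ValueError("counts must be positive to take logarithms")

    n = len(counts)
    days = range(n)
    logs = [math.log(c) for c in counts]

    # Ordinary least squares for the line log(count) = log(c0) + r * day.
    mean_t = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, logs))

    r = sxy / sxx
    c0 = math.exp(mean_y - r * mean_t)
    return r, c0


def doubling_time(r):
    """Days needed for the count to double at growth rate r (requires r > 0)."""
    if r <= 0:
        raise ValueError("doubling time is only defined for positive growth")
    return math.log(2) / r


# Synthetic data (not real case counts): exact exponential growth at 0.2 per day.
synthetic = [10 * math.exp(0.2 * t) for t in range(10)]
rate, initial = fit_exponential_growth(synthetic)
print(f"growth rate per day: {rate:.3f}, doubling time: {doubling_time(rate):.2f} days")
```

A fit like this describes only the shape of the early curve. It says nothing about how the disease is transmitted, which is why such approaches are usually limited to the first stage of an outbreak.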

While presenting an overview of the field, Sampaio referred to a few counterintuitive examples in the realm of data science and disease. In one, the prevalence of infection in the population after a few decades can be higher at intermediate vaccination coverage than at zero coverage. In another, prioritizing vaccination only for high-risk groups may lead to high rates of infection after a few decades if the disease develops resistance.

Epidemics will always plague our community. Open-source tools such as Python can be used to build data-driven models that help find solutions for these outbreaks.