marialma.github.io


Hi, I’m Maria Ma. I got my MPH from the UC Berkeley School of Public Health, with a focus in Infectious Disease and Vaccinology. I took a bunch of classes in the Epidemiology department. My undergraduate degree was in Microbiology.

In my past life, I was a synthetic biologist working on filamentous fungi and yeast, making enzymes.

I am based in the San Francisco Bay Area, and am currently working as a contract Epidemiologist for LA County Department of Public Health. Previously, I worked for a digital global health nonprofit as a Data Scientist. Email me at i.am.maria.ma at gmail.

What I Do

I move numbers around, deduplicate records, set up automation. I write reproducible statistical analysis code (R), turn data into figures, then write about those figures in the most accessible way I can. Sometimes, I turn these into packages for internal use. Most of what I do is data cleaning. A lot of the data in the sector I’m currently employed in is not ready for big time data science tools; it’s a big enough problem to understand data quality, let alone do machine learning.

I also do analyses to drive product improvements, and create trainings to help the organization become more data-driven.

What I’m Good At

I love designing elegant studies, using both observational data and setting up data collection. This mostly means I’m great at thinking about causal inference. I’m also pretty good at R (tidyverse and ggplot), and data visualization. I approach all of my work with failure mode analysis always in the back of my head. Healthcare is my domain; I see it as my duty to make sure we don’t harm people.

I really like Sankey diagrams.

I speak conversational Mandarin Chinese, I understand Shanghainese, and I’m slowly learning Swahili and French.

Interests

I’m interested in improving quality of healthcare, and I believe in the promise of technology to achieve this.

Projects

  • bakeR
    • A function to help scale up or scale down recipes
  • COVID-19 Outcomes by Vaccination Status
    • A dashboard I built for LA County.
  • Synthetic Health Data with Agent Based Models
    • Agent-based models, or “state machines,” seem to me to be the most reasonable and reliable way to generate synthetic health data. Existing machine learning approaches can generate flawed data that doesn’t make sense because of a time- and state-dependent component, e.g., a woman giving birth shortly after having a miscarriage. Existing simulation models work off of robust data sources, which are also generally tailored for a Global North context. The data we often see in rural settings in the Global South is much less rich and often has significant gaps. This project is in progress; my goal is to generate a synthetic dataset that also mimics the missingness of the data. A synthetic dataset such as this could help people build and test tools (and maybe even algorithms) without potentially exposing real people’s health data.
  • My comprehensive Master’s paper
    • In this project, I took a novel approach to modeling antibiotic resistance by relating it to how much antibiotic is used, as well as to mutation rates. Ultimately, I wanted to model antibiotic-resistant diseases and estimate the economic impact that they might have. This is primarily theoretical work, as there weren’t accessible databases that I could use to estimate parameters.
  • Hospital closures and rural access project - Write up here
    • This project uses Python and R to look at access to emergency care across the US. Rural hospital closures impact health access for an already disadvantaged population. I was interested in seeing by how much. A few months after I finished this project, Pew came out with a similar analysis that used survey data.
  • An effort to better understand the 2017 Cholera epidemic in Yemen - Write up here
    • A project using R to clean up and visualize data related to the ongoing cholera epidemic in Yemen. This crisis was not getting quite the attention it deserved.
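To make the state machine idea from the synthetic health data project a bit more concrete, here is a minimal sketch in Python. It is not the project’s actual code. The state names, transition probabilities, and the `simulate_patient` function are all hypothetical, chosen only to illustrate two points: impossible transitions (such as a birth recorded directly after a miscarriage) never appear because they are simply not in the transition table, and missingness is layered on top of the true history rather than baked into it.

```python
import random

# Hypothetical states and transition probabilities, for illustration only.
# A transition that is absent from this table can never be generated.
TRANSITIONS = {
    "not_pregnant": [("not_pregnant", 0.9), ("pregnant", 0.1)],
    "pregnant": [("pregnant", 0.8), ("miscarriage", 0.05), ("birth", 0.15)],
    "miscarriage": [("not_pregnant", 1.0)],
    "birth": [("not_pregnant", 1.0)],
}


def simulate_patient(n_visits, missing_rate, rng):
    """Simulate one patient over n_visits abstract time steps.

    Returns (truth, observed): truth is the full state history, and
    observed is the same history with some visits replaced by None
    to mimic records that were never captured.
    """
    state = "not_pregnant"
    truth, observed = [], []
    for _ in range(n_visits):
        next_states, weights = zip(*TRANSITIONS[state])
        state = rng.choices(next_states, weights=weights, k=1)[0]
        truth.append(state)
        # The underlying state always advances; only the record is dropped.
        observed.append(None if rng.random() < missing_rate else state)
    return truth, observed


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so the run is repeatable
    truth, observed = simulate_patient(n_visits=12, missing_rate=0.3, rng=rng)
    print(truth)
    print(observed)
```

Each step here is an abstract visit with no real duration. A more realistic model would also enforce minimum times between events, and would let the missingness rate depend on things like the clinic or the type of visit.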

I’m compiling a list of robust, publicly available data sources. You can find that here.

Where else to find me

I run a weekly newsletter on public health and global development. Check out my archives! (Currently on hiatus.)

My LinkedIn is here.

My Medium account is here, where I try to write things about public health and R.

My Tableau Public account is here, where you can browse some of the other things I’ve worked on.

Extra bonus!

I wrote a piece of public health fan fiction that I am inordinately proud of. It utilizes epidemiological concepts and takes inspiration from real-world events… as well as a particular pun involving the father of epidemiology and a Game of Thrones character.