Research data science + public health

Biosurveillance and disease forecasting

The juxtaposition of big data analytics with rigorous disease modeling has driven research into scalable health analytics and predictive modeling.

We explore methods for detecting and predicting threats to public health that fuse "traditional" sources of health information (CDC flu reports, electronic health care reimbursement claims) with "non-traditional" sources of information (social media) to obtain the best of both worlds: near-real-time analytics with robust disease modeling.


Ciliary motion analysis

Motile cilia lining the nasal and bronchial passages beat synchronously to clear mucus and foreign matter from the respiratory tract. This mucociliary defense mechanism is essential for pulmonary health, because respiratory ciliary motion defects, such as those in patients with primary ciliary dyskinesia or congenital heart disease, can cause severe sinopulmonary disease necessitating organ transplant.

The visual examination of nasal or bronchial biopsies is critical for the diagnosis of ciliary motion defects, but these analyses are highly subjective and error-prone. For this and many other reasons, we seek to develop an unbiased, computational method for analyzing ciliary motion.


Bioimaging for disease phenotyping

We are only beginning to understand how infections affect their hosts. Studying spatiotemporal changes in cellular structure as a cell responds to viral or bacterial infection is a crucial first step toward developing therapies.

In this project, we aim to develop a general and scalable software framework for 4D tracking of the spatiotemporal evolution of tagged organelles in fluorescence microscopy images.


Computational olfaction

Of all the senses, olfaction is the least understood in terms of an explicit mapping from the input to perceptual space. Prior work has revealed a low-dimensional embedding of odorants in the perceptual space, but a mapping from physicochemical properties to that space is still under development.

Our work focuses primarily on developing scalable methods for generating perceptual hypotheses for odorants given physicochemical properties, and incorporating prior information regarding odorant similarity to categorize small molecules.


BigNeuron and neuron tracing

A core component of the BigNeuron project is to create scalable neuroimaging tools from the Vaa3D framework that can generate neuron tracings for the complete cerebral cortex. The ultimate goal is to piece together the connectome at single-neuron resolution, providing a complete foundation for insights into the brain.

The challenge of bringing traditional neuron tracing algorithms to high-throughput, scalable settings is nontrivial.


Distributed numerical analysis

Efficient linear solvers are a cornerstone of modern data science and analytics. Unfortunately, solving large linear systems is computationally expensive, and finding the eigenvectors and eigenvalues of the corresponding system is not cheap either.

We are investigating how hierarchical multigrid solvers can be applied in the big data space. This effort encompasses both engineering considerations (data representation, algorithmic heuristics) and theoretical proofs and guarantees of performance.
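
To give a flavor of what "hierarchical multigrid" means, here is a minimal single-machine sketch of a recursive V-cycle. The model problem (1D Poisson, -u'' = f with zero boundary values), the weighted Jacobi smoother, and all function names are assumptions chosen for illustration; this is not the group's solver and it does not address the distributed setting.

```python
import math

def residual(u, f, h):
    """r = f - A u, where A = tridiag(-1, 2, -1) / h^2 (zero boundary values)."""
    n = len(u)
    r = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r[i] = f[i] - (2.0 * u[i] - left - right) / (h * h)
    return r

def smooth(u, f, h, sweeps=2, omega=2.0 / 3.0):
    """Weighted Jacobi: damps the high-frequency part of the error."""
    for _ in range(sweeps):
        r = residual(u, f, h)
        u = [ui + omega * 0.5 * h * h * ri for ui, ri in zip(u, r)]
    return u

def restrict(r):
    """Full-weighting restriction from 2*nc + 1 fine points to nc coarse points."""
    nc = (len(r) - 1) // 2
    return [0.25 * (r[2 * j] + 2.0 * r[2 * j + 1] + r[2 * j + 2]) for j in range(nc)]

def prolong(ec, n):
    """Linear interpolation of a coarse correction back to n fine points."""
    e = [0.0] * n
    for j, val in enumerate(ec):
        e[2 * j + 1] += val
        e[2 * j] += 0.5 * val
        e[2 * j + 2] += 0.5 * val
    return e

def v_cycle(u, f, h):
    """One V-cycle: smooth, correct from the coarser grid, smooth again."""
    n = len(u)
    if n == 1:
        return [f[0] * h * h / 2.0]  # coarsest grid: solve the 1x1 system exactly
    u = smooth(u, f, h)
    rc = restrict(residual(u, f, h))
    ec = v_cycle([0.0] * len(rc), rc, 2.0 * h)  # coarse grid has spacing 2h
    u = [ui + ei for ui, ei in zip(u, prolong(ec, n))]
    return smooth(u, f, h)

# Model problem with known solution u(x) = sin(pi x); n must be 2^k - 1.
n = 63
h = 1.0 / (n + 1)
x = [(i + 1) * h for i in range(n)]
f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]

u = [0.0] * n
r0 = max(abs(v) for v in residual(u, f, h))
for _ in range(10):
    u = v_cycle(u, f, h)
r_final = max(abs(v) for v in residual(u, f, h))
max_error = max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))
print(f"residual reduced from {r0:.3e} to {r_final:.3e}; max error {max_error:.3e}")
```

Each level does work proportional to its number of unknowns, and each coarser level has about half as many, so a V-cycle costs a small constant multiple of one fine-grid sweep. That property is what makes the hierarchy attractive at scale; how to represent and partition the levels across machines is exactly the kind of engineering question mentioned above.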


Open source development

Underlying all our work is a commitment to Open Science. We place heavy emphasis on communicating our results not only through clear and cogent scientific writing and speaking, but also through making our data and tools publicly available under open source licenses. We encourage everyone in the group to contribute to other open source projects.

In addition to research materials, we also release course and teaching materials.

Contributors to: