The Data Science Institute bridges the gap between research software and domain science by working with research teams at the forefront of data-driven discovery. The Institute focuses on emerging computational technologies that support a wide array of scientific disciplines. Our team of data scientists has expertise in these tools and techniques, so research teams can draw on the right skills for their most demanding problems.
Currently, we are focused on the following areas:
Natural Language Processing (NLP)
Many research areas involve large quantities of text, such as transcribed speeches, corpora of works, and internet-scale text (e.g., tweets). Natural Language Processing enables us to apply computational linguistics techniques, including sentiment analysis, topic modeling, and word embeddings, to extract and summarize properties of text that are otherwise difficult to quantify, such as concepts, meaning, and intent.
Machine Learning (ML)
Machine Learning allows researchers to classify data, extract patterns and features, draw inferences, and predict future outcomes. For example, machine learning can be used to identify features in images, develop predictive models of health outcomes, and infer changes in local climate.
Large-Scale Data Visualization
Complex, multidimensional data often cannot be represented adequately in a simple bar chart. Researchers often require advanced, high-performance, interactive data visualizations to make sense of these datasets. Large-Scale Visualization (Viz) allows researchers to visualize and interact with multidimensional data without compromising the underlying dimensionality of the dataset. Importantly, it also enables researchers to manipulate the data through zoom-and-filter applications, revealing complex relationships across data types and helping them derive meaning about the underlying processes.
Image Informatics
Imaging technologies allow researchers to generate high-resolution 2D/3D/4D/5D image stacks. These datasets quickly grow to sizes that are impossible to curate manually. Image informatics provides a suite of advanced algorithms to extract features, make measurements, and classify patterns across tens of thousands of images.