There is an arms race to perform increasingly sophisticated data
analysis on ever more varied types of data (text, audio, video,
OCR, sensor data, etc.). Current data processing systems typically
assume that the data have rigid, precise semantics, which these new
data sources do not possess. On the other hand, many of the
state-of-the-art approaches, both for coping with variations in the
structure of data and for analyzing data deeply, are statistical. The Hazy
project is exploring how to integrate statistical processing techniques
into data processing systems, with the goal of making such systems
easier to build, deploy, and maintain.
The key technical hypothesis behind Hazy is that a large fraction of
the processing performed by applications that use and analyze these
new sources of data can be captured using a small handful of
primitives. Identifying these primitives is one of Hazy's
chief goals.
To understand this hypothesis, Professor Re's group is building
several applications, including a system to read large amounts of text
and answer sophisticated questions (see WiscI and GeoDeepDive) and a system
to help find neutrinos (see IceCube),
along with building general primitives for data analytics.
Support
Hazy is generously supported by the Air Force Research Laboratory (AFRL) under prime contracts No. FA8750-09-C-0181, No. FA8750-13-2-0039, and No. FA9550-13-1-0138, the National Science Foundation CAREER Award under No. IIS-1054009 and EAGER Award under No. EAR-1242902, the Office of Naval Research under awards No. N000141210041 and No. N000141310129, the University of Wisconsin-Madison, and gifts, research awards, or contracts from American Family Insurance, Google, Greenplum, Johnson Controls, LogicBlox, Microsoft, Oracle, Raytheon, and the CHTC. Any opinions, findings, conclusions, or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of any of the above sponsors, including DARPA, AFRL, ONR, or the US government.
Media
Visit our YouTube channel, or check out some project overviews right here: