Project Tuffy

Meet Tuffy (Ver 0.3 released May 1, 2011!)

"We balance probabilities and choose the most likely. It is the scientific use of the imagination." -- The Master

Tuffy is an open-source Markov Logic Network inference engine, and part of Felix.

Check out our new demos built with Tuffy/Felix!

Markov Logic Networks (MLNs) are a powerful framework that combines statistical and logical reasoning; they have been applied to many data-intensive problems, including information extraction, entity resolution, text mining, and natural language processing. Built on principled data management techniques, Tuffy is an MLN inference engine that achieves scalability and orders-of-magnitude speedups over prior-art implementations. It is written in Java and relies on PostgreSQL. For a brief introduction to MLNs and the technical details of Tuffy, please see our upcoming paper or the technical report.

When designing and developing the user interface of Tuffy, we used Alchemy as a reference system. Thus, users who have experience with Alchemy should be able to pick up Tuffy easily.

The current version (0.3) of Tuffy is capable of the following MLN tasks:

  • MRF partitioning, a technique that can dramatically improve result quality (see our paper);
  • MAP inference, where we want to find the most likely possible world;
  • Marginal inference, where we want to estimate marginal probabilities;
  • Weight learning, where we want to learn the weights of MLN rules given training data.
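
To make the difference between MAP and marginal inference concrete, here is a minimal Python sketch on a tiny, hypothetical "smokers" MLN. It is not Tuffy code and does not use Tuffy's input format; the atoms, rules, and weights are assumptions chosen for illustration. It also enumerates every possible world by brute force, which only works for toy programs; a real engine has to rely on search and sampling instead.

```python
import itertools
import math

# Toy example (hypothetical, not from the Tuffy distribution).
# Evidence: Smokes(anna) and Friends(anna, bob) are true.
# The remaining (query) atoms have unknown truth values:
QUERY_ATOMS = ["Smokes(bob)", "Cancer(anna)", "Cancer(bob)"]

# Ground clauses left after plugging in the evidence, as
# (weight, test over a world). A world maps each query atom to a bool.
# Groundings made trivially true by the evidence are omitted.
GROUND_CLAUSES = [
    # 1.5  Smokes(anna) => Cancer(anna)   (Smokes(anna) is true)
    (1.5, lambda w: w["Cancer(anna)"]),
    # 1.5  Smokes(bob) => Cancer(bob)
    (1.5, lambda w: (not w["Smokes(bob)"]) or w["Cancer(bob)"]),
    # 1.1  Friends(anna, bob) ^ Smokes(anna) => Smokes(bob)
    (1.1, lambda w: w["Smokes(bob)"]),
]


def worlds():
    """Enumerate every truth assignment to the query atoms."""
    for values in itertools.product([False, True], repeat=len(QUERY_ATOMS)):
        yield dict(zip(QUERY_ATOMS, values))


def score(world):
    """Sum of the weights of the ground clauses satisfied in `world`."""
    return sum(weight for weight, clause in GROUND_CLAUSES if clause(world))


def map_world():
    """MAP inference: the single world with the highest score."""
    return max(worlds(), key=score)


def marginals():
    """Marginal inference: P(atom is true), with P(world) proportional to exp(score)."""
    z = 0.0
    mass = {atom: 0.0 for atom in QUERY_ATOMS}
    for w in worlds():
        p = math.exp(score(w))
        z += p
        for atom in QUERY_ATOMS:
            if w[atom]:
                mass[atom] += p
    return {atom: m / z for atom, m in mass.items()}


if __name__ == "__main__":
    print("MAP world:", map_world())
    print("Marginals:", marginals())
```

MAP inference returns one complete assignment, while marginal inference returns a probability for each atom separately; the two answers need not agree in general, which is why they are listed as distinct tasks.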

Furthermore, Tuffy provides the following features beyond the realm of MLNs:

  • Datalog: In addition to MLN rules, you can also execute Datalog rules in Tuffy.
  • Functions: Tuffy comes with a library of common numeric/string/boolean functions, which can be used inside an MLN rule. In particular, you can perform arithmetic manipulation and comparison in MLN rules.
  • Predicate scoping: Sometimes even grounding the atoms of one predicate would blow up your RAM. On the other hand, it's often the case that you only care about a particular subset of the exhaustive set of ground atoms. This feature allows you to explicitly specify the atoms you are interested in so that your program becomes runnable again.
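
The motivation for predicate scoping can be sketched in a few lines of Python. This is not Tuffy's scoping syntax or implementation; the predicate name, the domain size, and the `scope` list are hypothetical, and the helper functions exist only to compare how many ground atoms each approach produces.

```python
import itertools


def exhaustive_atoms(predicate, *domains):
    """Yield every ground atom over the cross product of the argument domains."""
    for args in itertools.product(*domains):
        yield (predicate, args)


def scoped_atoms(predicate, scope):
    """Yield only the ground atoms whose argument tuples are listed in `scope`."""
    for args in scope:
        yield (predicate, tuple(args))


# Hypothetical domain of 300 people for a binary predicate Friends(person, person).
people = [f"p{i}" for i in range(300)]

# Hypothetical scope: we only care about each person paired with the next one.
scope = [(people[i], people[i + 1]) for i in range(len(people) - 1)]

n_exhaustive = sum(1 for _ in exhaustive_atoms("Friends", people, people))
n_scoped = sum(1 for _ in scoped_atoms("Friends", scope))

print(n_exhaustive, n_scoped)  # prints "90000 299"
```

The exhaustive count grows with the product of the domain sizes, while the scoped count grows only with the number of tuples you list, which is what keeps a large program within memory.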

For more technical details about Tuffy, please refer to our VLDB 2011 paper or the documentation page.

Please also check out our new project Felix, which is intended to be the successor to Tuffy. Felix is conceptually very different from Tuffy: instead of adhering to the monolithic approach to MLN inference (as Tuffy does), Felix explores the rich opportunities of relational optimization, including task decomposition and data partitioning.

Tuffy is released under the GPL v3 license. You can download the source code, together with its Javadoc, from the download page.

As part of the ongoing DARPA Machine Reading project, Tuffy is generously supported by the Air Force Research Laboratory (AFRL) under prime contract no. FA8750-09-C-0181, and by gifts or research awards from Microsoft, Google, LogicBlox, and Johnson Controls, Inc. Any opinions, findings, and conclusions or recommendations expressed in this work are those of the authors and do not necessarily reflect the views of any of the above sponsors, including DARPA, AFRL, or the US government.