derek ruths


Data Science

Flex is a framework that rethinks how data science is done. Think make meets Mathematica. It elegantly solves the problems of self-documenting research code, keeping track of data, and making work easily reproducible.

Zen is a Python library for working with networks. Its aim is to be the fastest and easiest-to-use network library available. Currently, its core routines outperform all other available open-source network libraries (regardless of platform and language). Furthermore, it is able to work with networks tens of times larger than NetworkX can handle. It exposes two different interfaces, so the developer or scientist can write code for either simplicity or performance.

Programming Utilities

Arghandler is a Python library that extends argparse, making it easy to configure logging levels and develop command-line tools that support subcommands.
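For context, here is a rough sketch of the plain-argparse boilerplate that this kind of library aims to reduce: a logging-level option plus subcommand dispatch. It uses only the standard library and does not show Arghandler's own API; the program name, the `greet` and `add` subcommands, and the `--log-level` flag are all hypothetical names chosen for illustration.

```python
import argparse
import logging


def build_parser():
    """Build a parser with a logging option and two illustrative subcommands."""
    parser = argparse.ArgumentParser(prog="tool")  # "tool" is a placeholder name
    parser.add_argument(
        "--log-level",
        default="WARNING",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
        help="logging verbosity",
    )
    # Each subcommand gets its own sub-parser and a handler stored in `func`.
    subparsers = parser.add_subparsers(dest="command", required=True)

    greet = subparsers.add_parser("greet", help="return a greeting")
    greet.add_argument("name")
    greet.set_defaults(func=lambda args: f"hello, {args.name}")

    add = subparsers.add_parser("add", help="add two integers")
    add.add_argument("x", type=int)
    add.add_argument("y", type=int)
    add.set_defaults(func=lambda args: args.x + args.y)
    return parser


def main(argv=None):
    """Parse arguments, configure logging, and dispatch to the subcommand."""
    args = build_parser().parse_args(argv)
    logging.basicConfig(level=getattr(logging, args.log_level))
    logging.debug("running subcommand %s", args.command)
    return args.func(args)
```

Calling `main(["--log-level", "DEBUG", "add", "2", "3"])` dispatches to the `add` handler; passing `None` makes argparse read the real command line instead.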

Autocache is a Python library that provides dictionaries, lists, and sets (standard containers in Python) that persist on disk: they store and load their structure to and from disk in such a way that their memory and disk representations are always in sync. Unlike pickling, their contents are written to disk incrementally, so (1) there is no serious performance overhead to using them and (2) there is never a need to explicitly save the structure. Even if your code crashes, the autocache structures will hold exactly the contents they had when you last modified them.
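The general idea of a container that writes each modification to disk as it happens can be sketched as follows. This is not Autocache's API or storage format; it is a minimal stand-in that assumes SQLite as the backing store, JSON-serializable values, and string keys. The class name `DiskDict` and the file name `cache.db` are hypothetical.

```python
import json
import os
import sqlite3
import tempfile
from collections.abc import MutableMapping


class DiskDict(MutableMapping):
    """Hypothetical dict-like container that persists every change to SQLite."""

    def __init__(self, path):
        # isolation_level=None selects autocommit mode, so each statement
        # reaches the database file without an explicit save or commit.
        self._db = sqlite3.connect(path, isolation_level=None)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS items (key TEXT PRIMARY KEY, value TEXT)"
        )

    def __getitem__(self, key):
        row = self._db.execute(
            "SELECT value FROM items WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])

    def __setitem__(self, key, value):
        # Only the changed entry is written, not the whole structure.
        self._db.execute(
            "INSERT OR REPLACE INTO items (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )

    def __delitem__(self, key):
        cur = self._db.execute("DELETE FROM items WHERE key = ?", (key,))
        if cur.rowcount == 0:
            raise KeyError(key)

    def __iter__(self):
        rows = self._db.execute("SELECT key FROM items").fetchall()
        return iter([key for (key,) in rows])

    def __len__(self):
        return self._db.execute("SELECT COUNT(*) FROM items").fetchone()[0]


# Write through one instance, then reopen the file with a fresh instance.
path = os.path.join(tempfile.mkdtemp(), "cache.db")
cache = DiskDict(path)
cache["runs"] = [1, 2, 3]
del cache  # no explicit save step
reopened = DiskDict(path)
```

Because each assignment is committed on its own, `reopened` sees the entry written through the first instance even though nothing was saved explicitly. The contrast with pickling is that a pickle rewrites the whole structure on every save.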

Retired Projects

TweetCoder is a lightweight application that enables rapid and customizable human coding of Twitter data. The researcher uses a simple scripting language to specify the coding schema and data. TweetCoder reads these scripts and presents the coder with an interface that allows them to use the mouse and/or keyboard to code the data.

Topp is a software tool for automating the calculation of citation metrics using search engine results. Raw search engine results can severely overestimate citation metrics due to name ambiguity – this is the central issue addressed by this tool.

Monarch is a tool that allows biologists to build predictive models of signaling network dynamics using limited amounts of qualitative data.

PathwayOracle is a tool that allows rapid exploration of both static and dynamic properties of biological signaling networks. The tool includes an implementation of the signaling Petri net model and simulation algorithm.