Mnemo

Multi-Muse Flows

Many anticipated use cases will flow through multiple Muses, as illustrated below.

Trinity → Refactoring → TypeWriter → Hypothesis
The Test View Muse collects input/output examples from a Python program. These examples are passed to the Trinity Muse, which synthesizes an implementation in an appropriate DSL. The implementation flows back to the Argot Core Muse, which converts the DSL to Python for insertion into the program. The Refactoring Muse then wraps this Python snippet in a function, the TypeWriter Muse synthesizes a type for the new function, and the Type View Muse inserts the type into the file. Finally, the type flows to the Hypothesis Muse, which inserts a new test into the file and performs property-based testing of the function.
→ Use case video.
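The synthesis step in this flow can be illustrated with a toy example. The sketch below is not Trinity's algorithm or DSL, which this page does not describe. It is a minimal enumerative search over a hypothetical four-primitive DSL (`inc`, `double`, `square`, `neg`), followed by a hypothetical `to_python` rendering step that stands in for the DSL-to-Python conversion.

```python
from itertools import product

# Hypothetical toy DSL: each primitive is a unary function on integers.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
    "neg": lambda x: -x,
}

# How each primitive is rendered as a Python expression (also hypothetical).
TEMPLATES = {
    "inc": "({0} + 1)",
    "double": "({0} * 2)",
    "square": "({0} * {0})",
    "neg": "(-{0})",
}


def run(program, value):
    """Apply the primitives named in `program` from left to right."""
    for name in program:
        value = PRIMITIVES[name](value)
    return value


def synthesize(examples, max_len=3):
    """Return the shortest primitive sequence matching every (input, output) pair."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None  # nothing within max_len is consistent with the examples


def to_python(program, arg="x"):
    """Render a DSL program as a Python expression string."""
    expr = arg
    for name in program:
        expr = TEMPLATES[name].format(expr)
    return expr
```

With the examples `[(1, 4), (2, 6), (3, 8)]`, `synthesize` returns `("inc", "double")`, which `to_python` renders as `((x + 1) * 2)`. A later step in the flow would wrap that expression in a function.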
TypeWriter → Hypothesis → GenPatcher
The TypeWriter Muse collects types for a function in a Python program. These types are then used by the Hypothesis property-based testing tool to generate test inputs for the function. The GenPatcher automated program repair tool is then invoked with these tests to evolve a version of the function that executes successfully against all valid test inputs.
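The hand-off from types to test inputs to a repair objective can be sketched with the standard library alone. Everything below is an assumption made for illustration: the `GENERATORS` table is a crude stand-in for Hypothesis strategies, `fitness` is one plausible way to score a candidate against "executes successfully on all inputs", and the two `safe_div_*` functions are made-up examples. None of it reflects the actual TypeWriter, Hypothesis, or GenPatcher interfaces.

```python
import random
from typing import get_type_hints

# Hypothetical per-type generators; a stand-in for Hypothesis strategies.
GENERATORS = {
    int: lambda rng: rng.randint(-100, 100),
    bool: lambda rng: rng.random() < 0.5,
    str: lambda rng: "".join(rng.choice("abc") for _ in range(rng.randint(0, 5))),
}


def generate_inputs(func, n=50, seed=0):
    """Draw n argument tuples for func based on its parameter annotations."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters need generated values
    rng = random.Random(seed)
    return [tuple(GENERATORS[t](rng) for t in hints.values()) for _ in range(n)]


def fitness(candidate, inputs):
    """Fraction of inputs on which the candidate runs without raising."""
    passed = 0
    for args in inputs:
        try:
            candidate(*args)
            passed += 1
        except Exception:
            pass
    return passed / len(inputs)


def safe_div_buggy(a: int, b: int) -> float:
    return a / b  # raises ZeroDivisionError when b == 0


def safe_div_fixed(a: int, b: int) -> float:
    return a / b if b != 0 else 0.0
```

A repair tool in this style would prefer `safe_div_fixed` over `safe_div_buggy` whenever the generated inputs include `b == 0`, because only the fixed variant reaches a fitness of 1.0.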
Autocomplete ↔ GenPatcher ↔ Refactorings
GenPatcher is invoked because of a failing test for a Python function. While it searches for a repair, GenPatcher can leverage other Muses. In effect, GenPatcher operates much like a developer guiding a team of Muses toward a new version of the source code. In particular, GenPatcher might use the Refactorings mutations to take large structured steps through the repair space, modifying disparate source code, and might use the Autocomplete Muse to take large unstructured steps through the repair space, modifying localized source code.
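The idea of a repair search that delegates its mutation steps to other tools can be sketched as a small greedy loop. This is an assumption-laden illustration, not GenPatcher's actual search: `refactoring_step` and `completion_step` are hypothetical stand-ins for a Refactorings mutation (a structured edit touching several places) and an autocomplete suggestion (a local rewrite of one expression), and the function under repair is assumed to be named `f`.

```python
import re


def passes(source, tests):
    """Count the tests passed by the function `f` defined in `source`."""
    namespace = {}
    try:
        exec(source, namespace)
        f = namespace["f"]
    except Exception:
        return 0
    count = 0
    for args, expected in tests:
        try:
            if f(*args) == expected:
                count += 1
        except Exception:
            pass
    return count


def refactoring_step(source):
    """Structured, non-local edit: swap the names `a` and `b` everywhere."""
    swap = lambda m: "b" if m.group() == "a" else "a"
    return [re.sub(r"\b[ab]\b", swap, source)]


def completion_step(source):
    """Unstructured, local edit: try several replacement return expressions."""
    head = source.rsplit("return", 1)[0]
    return [head + "return " + expr + "\n" for expr in ("a + b", "a * b", "b - a")]


def repair(source, tests, operators, max_rounds=5):
    """Greedy search: each round, move to the best-scoring mutated candidate."""
    best, best_score = source, passes(source, tests)
    for _ in range(max_rounds):
        if best_score == len(tests):
            break
        candidates = [c for op in operators for c in op(best)]
        score, candidate = max(
            ((passes(c, tests), c) for c in candidates), key=lambda pair: pair[0]
        )
        if score <= best_score:
            break  # no operator improved on the current best
        best, best_score = candidate, score
    return best, best_score
```

Calling `repair("def f(a, b):\n    return a - b\n", [((1, 2), 3), ((5, 5), 10), ((0, 7), 7)], [refactoring_step, completion_step])` settles on the `a + b` variant with all three tests passing. A real tool would use a richer search, such as an evolutionary one, rather than this greedy loop.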
Function Generator → Hypothesis
The Function Generator autocomplete is invoked, populating a function body using OpenAI's Codex model. When the suggested completion is accepted, Hypothesis is invoked to insert a new property-based test for the function, guarding against weaknesses that may be present in the model's result.
→ Use case video.
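To make the safeguard concrete, here is a dependency-free sketch of a property check catching a plausible-looking but flawed completion. In practice Hypothesis would express this with `@given` and a strategy; the hand-rolled loop below only imitates that idea. The function `generated_dedupe`, the property, and the input distribution are all invented for illustration and are not actual model output.

```python
import random


def generated_dedupe(items):
    """Stand-in for a model-suggested body: removes duplicates but sorts the result."""
    return sorted(set(items))


def keeps_first_occurrence_order(items, result):
    """Property: duplicates are removed and first-occurrence order is preserved."""
    return result == list(dict.fromkeys(items))


def find_counterexample(func, prop, trials=200, seed=0):
    """Try random small integer lists; return the first input violating prop, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        items = [rng.randint(0, 5) for _ in range(rng.randint(0, 6))]
        if not prop(items, func(list(items))):
            return items
    return None
```

Running `find_counterexample(generated_dedupe, keeps_first_occurrence_order)` returns a list whose first-occurrence order is not ascending, exposing the weakness before the completion is relied on.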