Regolith

Summary

Regolith is the layer of soil and large boulders between the solid Bedrock and the Topsoil.

In the architecture of the next generation of AboutUs software, Regolith is the low-level framework for distributed object persistence and user actions. It is the part of a Bedrock server that allows for the cool features of Topsoil.

The boulders are objects that are created once and then never modified again. The crushed rock and dirt that packs around the boulders is the meta-information, which changes continually as new boulders are created. For example, an index that keeps track of all the incoming links to a particular boulder changes quite independently of the boulder itself.
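The incoming-link index mentioned above could look roughly like the following sketch. The names `incoming_links` and `record_boulder` are hypothetical, and the boulder names are readable placeholders rather than real hashes. The point is only that the index is updated when a new boulder arrives, while the boulder being linked to is never touched.

```python
from collections import defaultdict

# Meta-information ("crushed rock and dirt"): for each boulder, the set of
# boulders that reference it. This changes over time; boulders do not.
incoming_links = defaultdict(set)

def record_boulder(name, referenced_names):
    """Update the index when a new boulder that references others is created."""
    for target in referenced_names:
        incoming_links[target].add(name)

# Two new boulders both link to the same existing boulder.
record_boulder("page-1", ["author-1"])
record_boulder("page-2", ["author-1"])

print(sorted(incoming_links["author-1"]))
```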

Key Considerations

  • The contents of a boulder should be verifiable without loading any additional boulders
  • Simple objects like strings are themselves boulders
  • Compositional objects that include references to other objects fracture into a directed graph of boulders when stored
  • Currently there is no support for loopy objects (objects whose references form a cycle)
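The first consideration can be illustrated with a small sketch. It assumes that a boulder's name is the SHA-1 hex digest of its serialized bytes, as the sections below describe; `verify_boulder` and the sample contents are hypothetical. Because a compositional boulder holds only the names of its parts, checking it needs nothing but its own bytes.

```python
import hashlib

def verify_boulder(name, serialized):
    """Check a boulder against its name using only the boulder's own bytes."""
    return hashlib.sha1(serialized).hexdigest() == name

# A compositional boulder that refers to two other boulders by (placeholder)
# name. Those boulders are never loaded during verification.
contents = b'{"array":["part-name-1","part-name-2"]}'
name = hashlib.sha1(contents).hexdigest()

is_valid = verify_boulder(name, contents)
is_tampered_valid = verify_boulder(name, contents + b" ")
```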

BoulderName = Collapsed Object

A BoulderName is really just a collapsed object: it is the hash of the serialization of an object that has been fractured.
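As a minimal sketch of this definition, the function below hashes a serialization with SHA-1 (the hash named in the One-to-one section). The use of JSON with sorted keys as the serialization is an assumption made for illustration; the actual Regolith serialization format is not specified here, and `boulder_name` is a hypothetical name.

```python
import hashlib
import json

def boulder_name(fractured_obj):
    """Return the SHA-1 hex digest of a deterministic serialization."""
    # Sorted keys and fixed separators ensure that equal contents always
    # produce byte-identical serializations.
    serialized = json.dumps(fractured_obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest()

# The same contents collapse to the same name regardless of key order.
name_a = boulder_name({"title": "Regolith", "kind": "page"})
name_b = boulder_name({"kind": "page", "title": "Regolith"})
```

The serialization has to be deterministic, otherwise two copies of the same object could collapse to different names.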

Fracturing = Decomposition

Fracturing an object means replacing all of its references to other objects, which are specific to a programming language and environment, with the BoulderName of each of those objects. When a compositional object is put (stored), it is recursively fractured and each piece is put in turn, so that the entire structure ends up stored as a collection of Boulders linked together by BoulderNames.

Decomposition of objects allows you to have many compositional objects that share pieces. This is both a storage savings and a mechanism to compute back references. An essential feature of fracturing is that the same expanded object always maps to the same BoulderName.

Boulderized Containers

Some objects, such as Arrays and Hashes, are themselves containers for other objects. During fracturing, the objects they contain must be fractured and put first. Then a "Boulderized" version of the original container object is stored. The "Boulderized" version replaces each top-level object in the container with a "Boulderized" version of that object.
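The recursive fracturing of containers described above can be sketched as follows. This is an illustration under stated assumptions, not the Regolith implementation: Python lists and dicts stand in for Arrays and Hashes, JSON is assumed as the serialization, `store` is a hypothetical in-memory dictionary, and each contained object is replaced by its BoulderName in the Boulderized container.

```python
import hashlib
import json

store = {}  # hypothetical in-memory store: BoulderName -> serialized boulder

def save(boulderized):
    """Serialize a Boulderized value, store it, and return its BoulderName."""
    serialized = json.dumps(boulderized, sort_keys=True, separators=(",", ":"))
    name = hashlib.sha1(serialized.encode("utf-8")).hexdigest()
    store[name] = serialized
    return name

def put(obj):
    """Recursively fracture obj and store every piece as a boulder."""
    if isinstance(obj, list):
        # Each element is put first; the container keeps only their names.
        return save({"array": [put(item) for item in obj]})
    if isinstance(obj, dict):
        # Keys are assumed to be strings so that JSON can serialize them.
        return save({"hash": {key: put(value) for key, value in obj.items()}})
    return save({"leaf": obj})  # simple objects are boulders themselves

# Two compositional objects that share a piece.
author = {"name": "example author"}
first_page = put({"title": "One", "author": author})
second_page = put({"title": "Two", "author": author})
author_name = put(author)
```

After both pages are put, the store holds six boulders rather than eight, because the shared author and its name string are stored once. Both page boulders contain `author_name`, which is the information a back-reference index would need.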

One-to-one

It is essential that a particular set of object contents always collapses to the same BoulderName, and that no other object with different contents collapses to that same BoulderName. This is the one-to-one property. It makes it possible to determine when an object is part of several different compositional objects. We use a SHA-1 hash to accomplish this, so the second half of the property holds only as long as no hash collision occurs.

Loops

When a data structure has loops, there is a decompositional solution that maintains the one-to-one property, but that solution is effectively uncomputable.

N-squared Loop Decomposition

For any loopy data structure having N nodes, the storage cost grows as N-squared. Basically you pick up the entire structure by each node in turn and write that node out while watching for revisits of a node that you have already written. Whenever you revisit a node that you have already written, you break the cycle with a self reference. Each of the N write-outs can cover up to N nodes, which is where the N-squared comes from.

The collapse of the graph depends upon the order of traversal, so selecting a deterministic order is important.
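One possible reading of this procedure is sketched below. It is an interpretation, not the author's implementation: the "self reference" is assumed to be the position at which the revisited node was first written during the current traversal, JSON is assumed as the serialization, and `collapse_from`, `children` and `labels` are hypothetical names. Children are walked in list order, which is one way to fix a deterministic traversal order.

```python
import hashlib
import json

def collapse_from(root, children, labels):
    """Write out everything reachable from root; return (name, serialization)."""
    visit_order = {}  # node id -> position at which this traversal first wrote it

    def write(node):
        if node in visit_order:
            # Already written during this traversal: break the cycle with a
            # reference relative to the traversal instead of recursing again.
            return {"revisit": visit_order[node]}
        visit_order[node] = len(visit_order)
        # Children are walked in list order, which fixes the traversal order.
        return {"label": labels[node],
                "children": [write(child) for child in children[node]]}

    serialized = json.dumps(write(root), sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(serialized.encode("utf-8")).hexdigest(), serialized

# A three-node loop: a -> b -> c -> a. The ids are local and never serialized.
children = {"a": ["b"], "b": ["c"], "c": ["a"]}
labels = {"a": "A", "b": "B", "c": "C"}

# Pick the structure up by each node in turn.
collapsed = {node: collapse_from(node, children, labels) for node in children}
records_written = sum(text.count('"label"') for _, text in collapsed.values())
```

For this loop of N = 3 nodes, each of the three write-outs contains all three nodes, so `records_written` is 9. Because only labels and visit positions are serialized, a structure with the same shape and labels but different local ids collapses to the same names.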

Next Actions

  • Refactor to use BoulderNames that are computed after each object has been Boulderized rather than before
  • Refactor to use the terminology of the metaphor
  • Add AttachTag and DetachTag actions
  • Add Tag querying capabilities
  • Create a Bedrock server that uses WEBrick or something similar as the front end
  • Write a JavaScript version of the functionality so far

