Jump to content

Memorypotent kb

From Meta, a Wikimedia project coordination wiki

This page is particular to emergent optimization: a rough outline of the algorithm.

Each page is considered a “node”, or, more specifically, a finite state automaton. All processing is done at the node-level; the problem is attacked in this way; cellularly.

The node has 3 “parts”, respective of finite state automaton:

Input: the pages(nodes) that link to the current page(node). These are called the “arrival ports”. Each port is identified with a number, when a code is received through a port, it generates an “arrival code”. This is just the number of the port that the code arrived at.
State: The internal state of the node.
Output: the pages(nodes) that link from the current page(node). These are called the “departure ports”. Each port is identified with a number.

When a link is transversed, a code is computed from the internal state, and sent through the departure port of the current node to the arrival port of the new node. This code represents, in an abstract way, the current “topic”; it is the “route memory” carried along with the user. It is called the “route code”.


Thus, when a page(node) is visited,

(1)an arrival code and a route code(the received code) is stored in the internal state of the node.

(2)At this stage, the composite code(arrival+route) can be looked up in the “general probability table” for the node, which is stored in the node’s internal state space. This table predicts the most likely departure ports for the user, based on all previous visits to the node by previous users.

(3)When a user selects a departure port (i.e. goes to another page), a departure code is generated.

(4)The “general probability table” is updated accordingly (using a time-averaging (logarithmic decay) algorithm).

(5)Each departure port has it’s own independent probability table, called the “specific probability table”. It stores the conditional probability that, given that a visit leaves at the given departure port, the composite arrival-route code is X.

(6)This is the “fun” part; this is where it all happens -- where the output route code is generated from the composite arrival-route-departure code.

We’ve already translated the departure code by selecting the respective specific probability table.

Now, our task is to “renormalize” the route code. That is, make it so that, in the limit(on average), all output codes from this port have the same probability. (this maximizes the information stored in the code, since, for full entropy compression, the length of the code should be respective to it’s probability, thus, since all codes have the same length, they should have approximately the same probability.)

But since our arrival code was the most recent information, and this remains the most significant, we want to make sure that we encode this information in the route code without confusion. So we first divide up the available codes among the arrival codes, proportionally to their probability of arrival at the selected departure port. Thus, each arrival-departure code has an outgoing code-range allotted to it.

Now, for each arrival-departure code, we have a table selected, which has the following:

  1. a route code(incoming)
  2. a probability for each incoming arrival-route code, conditional to the departure port
  3. A set of available route codes(outgoing) to be distributed among the incoming route codes, such that each outgoing route code has the same probability.

(#3) implies the final translation step, and the condition that it must satisfy. This condition is easily met. There are an arbitrary number of ways to divide the codes up evenly, the most straight-forward being a sequential assignment. However, we want to find the method of distribution that minimizes “topic drift”; that is, that retains the most significant information in our route code.

This implies that there is some way to identify the most significant information. We introduce this through the process that retains this – we choose the most significant bits of the code to represent the most significant information.

To distribute the codes in a way that best maintains the most significant information, we distribute the codes from the most significant bit first, descending, respective to number of occurrences of those bits(in the route codes) among the arrival codes, and then we distribute the resulting sets among the route codes respective to probability in the same manner.

Disambiguating: If a incoming route code ends up with multiple outgoing route codes assigned to it, two different strategies can be used to select among them when one is demanded: random selection, or lowest numerical value.

Thus, the incoming route code is translated to an outgoing route code, and sent to the following node’s arrival port.

(7)The “specific probability table” is updated accordingly, in the same manner as the “general probability table”.

(8)The code distribution is adjusted to incorporate the new information, as implied by (#6).


-For dataspace considerations, the tables can be “partial” tables; they need not have room to include all the possible codes. Thus they would be “hash tables”, where the codes are possible staticly or dynamicaly truncated. I defer this discussion is elsewhere.