Thursday, December 16, 2004

Collective Intelligence Quanitifed[sic] for Computer-Mediated Group Problem Solving

(http://arxiv.org/abs/cs/0412064)

I think this is a good premise for a paper, but a bad execution; too little data. I hypothesize that they had trouble writing the code for the experiment. If they had had many tasks, and many ways of combining people, then they could have had a much better chance of getting some positive results. As it is, the simplicity of the 8-puzzle combined with the lack of positive results (okay, the group solutions were slightly better than the individual solutions for the "hardest" problems; but mostly no positive results) greatly detracts from the interest of the paper.

I recommend that people do similar experiments, and write papers about them.

Tuesday, December 14, 2004

Gyroscopically Stabilized Robot: Balance and Tracking

(http://arxiv.org/abs/cs/0412050)
There's a fun experience (that I hope everyone who might be reading this has had at one time or another) - you think of a good idea, and then you realize "That's a really good idea! Someone else must have thought of it first!", and then you go to google and type in keywords related to your good idea, and there it is! Someone else has thought of your idea, and developed it, and published it, and now you can read all about it, without doing the work.

Gyrover was such an experience for me, in 2000. Reading this paper (which I really don't understand - I just think the premise of the robot is cool.)

Possibly the model of Gyrover's dynamics in this paper (and the 1996 paper by Xu, which is the one I read when I first thought of it) is sufficient to write a reasonable computer game.

I recommend Gyrover's premise, and I have no idea whether the nonholonomic control research is useful or interesting.

Thursday, December 09, 2004

Limits and the system of near-numbers

(http://arxiv.org/abs/math/0412157)
I am in ecstasies.

This is how first-semester calculus ought to be taught.

My one complaint is that the list of references is incomplete. It doesn't reference, for example, Abraham Robinson's infinitesimals. I think there are intuitionists who work with intervals in formalizing a rigorous and constructive theory of calculus and analysis, but I can't find a good reference in a few google searches.

I highly recommend this paper.

Wednesday, December 08, 2004

Quantum logic as motivated by quantum computing

(http://arxiv.org/abs/math/0412144)

The title looked interesting, and I noticed Dunn was one of the co-authors. I recognize Dunn from his work on relevant logic, and I'm told he's a smart guy. Also, I like titles with the word "motivate" in them.

The paper explained what quantum logic is better than anything I've read so far. It didn't motivate quantum logic, though, or explain what it has to do with quantum computing.

If you're reading this, you're probably familiar with what a qubit is- sortof a complex number (actually a point on the Bloch sphere, but okay). If {0, 1}^n is the space of , then, C^n is (sortof) the space of n-qubit registers. As far as I can tell, the paper says quantum logic relates to quantum computing in that both have C^n in them. (Quantum logic does have something to do with quantum computing; but it models measurement, rather than quantum logic gates. Most published quantum algorithms use quantum logic gates cleverly and often, and measurement once, and at the end.)

I don't think you can build the meet, join, and complement operations of quantum logic out of quantum logic gates (which are actually (theoretically) used in quantum computing).

I recommend this paper if you're interested in orthomodular lattices, not if you're interested in quantum computing.

Tuesday, December 07, 2004

Pattern Discovery in Time Series

It's an algorithm that will take your time series (discrete time, nominal values - meaning things like sequences of bits or letters), and return a best fit markov model for it. NB: A special kind of markov model, which in the paper they call "deterministic". "Deterministic" doesn't mean that if you know the state of the model, you will be able to predict the model's output. It means that if you know the state of the model and one output symbol you will be able to predict the next state of the model.

It's better than the hidden markov model training algorithms because the number of states is not fixed. Also, the code is GPLed, and available, and easy to compile, which is very nice. I didn't give it any tricky data, but on the periodic sequences I tried it on, it did fine, with a pleasant dot.

I think that some people have worked hard on DFA-learning, which is similar to this, though of course they're talking about really deterministic automata.

Cons:
The sentence "Given data produced by a process, how can one extract meaningful, predictive patterns from it, without fixing in advance the kind of patterns one looks for?" You can't. This algorithm looks for patterns expressible by a small \epsilon-machine.
Overusing the word "causal" and the phrase "computational mechanics".
Not enough examples of \epsilon-machines (though the draft of part II, apparently abandoned, has a satisfactory number.)
The best answer for "how do you know the correct number of states for a HMM" that I know of is "use the minimum message length principle". This doesn't seem to be mentioned at all. On the other hand, I can't find a MML-based algorithm to download.

Overall, I recommend this paper.

Monday, December 06, 2004

Computational Mechanics of Cellular Automata: An Example

(http://www.santafe.edu/projects/CompMech/papers/ECA54TitlePage.html)

I wish I were smart like Cosma Shalizi. He's cool. (He works on CAs, among other things.).

I take back what I said about recommending "Computational Mechanics of Cellular Automata: an Example". I mean, I understand it's pioneering and breaking new ground (published 1995), but the obsession with "synchronization" is really bad. Saying that a finite automaton "effectively builds a finite queue" is really bad. And what's up with those 4 and 8 "unrecognized defects"? It's unprofessional to have unrecognized defects- couldn't you have mentioned them earlier, like in the list of particle interactions

(Of course, it does have the delightful terms "space-time machine" and "gamma gun"- as I said before, it has plenty of merit.)

I would like to see the XOR of a solid domain with a space-time diagram. There's probably a group structure of some sort on the set of possible "phases" of the regular domain (a checkerboard domain has two phases, and at the locations where one phase has white squares the other has black squares and vice versa). (translations of the plane quotiented by the translations that leave the regular domain identical?) The different particles each are associated to some element of this group, describing how the phase changes as you walk from one side to another. This constrains the possible particle interactions:

Like, hypothetically, an interaction might be: alpha + alpha + gamma -> beta + beta.
The product of the group elements associated with the the particles on the left would have to equal the product of the group elements associated with the particles on the right. If g is the function from particles to group elements:

g(alpha)*g(alpha)*g(gamma)=g(beta)*g(beta)

This is exciting. I think I could go off and compute these right now! But I'm not going to.

They call their process a "particle filter", which is mildly amusing, to me. Once I thought "particle filter" means "it filters out the particles" (like Brita or something). Then recently I read about "particle filter" meaning "it's a signal processor that uses (virtual) particles". This is a "particle filter" that adds particles.

My idea for how to build a particle filter would be to grow equivalence relations- equivalence meaning "in the same domain". You could use Tarjan's union-find algorithm, and eventually you'd get a partition of the space-time diagram, and the big equivalence classes would be the domains. I think I know how to do it. Maybe I don't know what I'm talking about, or maybe it'd be inefficient (I mean, finite automata are linear time, and that's hard to beat!).

I recommend "Classifying Cellular Automata Automatically". It has a much better explanation of LMC complexity ("medium spikyness of histograms") than the LMC complexity paper (that I talked about previously), even though it doesn't mention LMC complexity.

Sunday, December 05, 2004

Complexity of Two-dimensional patterns

I haven't posted in the last couple of days, because I didn't see anything in the recent arXiv updates that caught my fancy.
Rather than let this go on, I'm posting about something that was written in 1997, but does catch my fancy.

One of the coolest and most effective techniques for studying one-dimensional cellular automata is the regular language method. I tried to write something about it for a class, and the professor put the class papers up on the web, which is embarrassing.

The idea is that, in most cellular automata, there are "gardens of eden" - patterns that never arise from running the rule. As a simple example, consider a rule where "every live cell with at least one live neighbor dies". After one step, the pattern will consist entirely of isolated live cells.

So if you're a CA researcher, and you want to know facts about a rule like "After one step, the pattern will consist entirely of isolated live cells". To translate that into jargon, you say "The finite time set at t=1 is the golden mean shift". ("finite time set" = "the set of patterns possible at a certain time" , and "golden mean shift" = "the set of strings of 0s and 1s that do not contain two adjacent 0s").

So the "regular language method" is just a result of Wolfram's that the finite time set of any 1-dimensional automaton, at any t, is a regular language.

This is cool, and there are various applications (for example, in filtering the space-time diagram of a CA to get particle dynamics - I haven't read "Computational Mechanics of Cellular Automata: an example" carefully yet, but I recommend it.). Still, one wonders why such a simple and reasonable technique has not been extended to the two-dimensional case.

"Complexity of Two-Dimensional Patterns" explains clearly (if you have some theory of computation background knowledge) why it is tricky to extend to the two-dimensional case.

You probably know that deterministic finite automata are equivalent to nondeterministic finite automata. It's easy to believe (though I don't remember hearing about it in class), that deterministic finite automata with the ability to move backward as well as forward on the input, are equivalent to ordinary DFAs. There are several more definitions of "regular language", all equivalent. Each of these definitions has a reasonable generalization to two dimensions. Unfortunately, they are no longer equivalent.

I recommend this paper.

Wednesday, December 01, 2004

Duplication-divergence model of protein interaction network

(http://arxiv.org/abs/q-bio/0411052)
I like this paper a lot.

I like genetic algorithms (not as optimizers, but as models of evolution) but the usual version is Mendelian, in that there are a fixed number of alleles, and each allele has a fixed meaning. This is a genetic algorithm that is very simple, with general increase in complexity over time.

This reminds me of Koza's "Reverse Engineering and Automatic Synthesis of Metabolic Pathways from Observed Data Using Genetic Programming".

Koza's networks are metabolic, (proteins interacting with substrates, like sugars), while the duplication-divergence networks are of "protein-protein interactions" (I'm not sure what those consist of.)

The duplication-divergence paper is much more lightweight than Koza's, to it's credit. Koza has an agenda of demonstrating that genetic programming can solve practical problems, while the duplication-divergence paper is entirely about modeling real-life evolution.

I recommend this paper.

Detecting synchronization in spatially extended discrete systems by complexity measurements

(http://arxiv.org/abs/nlin/0411063)

I feel ambigous about this paper. On the one hand, it's about cellular automata, and I like cellular automata a lot. It has a short and very specific definition of the things he's studying:

Two L-cell cellular automata with the same evolution rule are started from different random initial conditions for each automaton. Then at each time step, the dynamics of the coupled CA is goverened by the successive application of two evolution operators; the independent evolution of each CA according to its corresponding rule and the application of a stochastic operator that compares the states sigma^{1}_{i} and sigma^{2}_{i} of all the cells i, ..., L in each automaton. If sigma^{1}_{i} = sigma^{2}_i, both states are kept invariant. If sigma^{1}_{i} != sigma^{2}_{i}, they are left unchanged with probability 1-p, but both states are updated to either to sigma^{1}_{i} or to sigma ^{2}_{i} with equal probability.

There are pretty pictures in the back, both of coupled cellular automata, and of the complexity peaking at medium values of p.

On the other hand, the author clearly likes this "LMC" complexity measure (which seems pretty ad-hoc to me) a lot (he is the L), and the task of detecting when two CAs synchronize is pretty easy - when the difference automaton goes solid, they're synchronized. He doesn't give any alternative measures of complexity.

About "LMC complexity":

This is a function from a histogram to a number. On the one hand, you can compute the Shannon entropy of a histogram - as the histogram goes from spiky to flat, the entropy will go from low to high. On the other hand, you can consider the histogram to be a point in n-dimensional space, and directly measure the euclidean distance to a flat histogram - as the histogram goes from spiky to flat, this distance will go from high to low.

If you multiply these two measures together, you get LMC complexity, which goes from low to high to low. It is high for "medium spiky" histograms.

I recommend browsing this paper, because it's only 4 pages, 2 of which are figures.