Dim Red Glow

A blog about data mining, games, stocks and adventures.

more genetic code thoughts...

So I've been thinking about deep neural networks, genetic algorithms, and b-trees. First, let me say that I made some simplifications in the tree export (both in concept and in genes) and got the exported size down some. It should be in the neighborhood of 1/3 the original size. I say "should" because I only exported a 4-depth, 4-stack tree, and that isn't anywhere near as big as a 6-depth, 16-stack tree. The whole exercise was, I think, academic at this point.

At the time I was still hopeful I could have the genetic program optimize it. It turns out that tree-based data mining, while systematic in approach, isn't very efficient. There are almost always far better ways to get the scores you are looking for, and the genetic programs tend to find them and throw out your tree entirely. The reason the tree is used is really a case of methodology: it's a generic way to produce a systematic narrowing of results to statistically model the data. The genetic mechanism tends to find mathy ways to model the data, and those could take any form, tree-based or otherwise.

This leads me to some serious thoughts on what is going on in deep neural networks. They tend to have a number of layers; each layer has access to previous layers or to the original data... possibly both (depending on the nature of the convolution). It's a strong strategy for figuring out things that require a group of features to work together and be evaluated.

It turns out this is kind of what the introduction of channels is doing for me... it's also (in one way of looking at it) what stacking results does in GBM. Each channel or tree has its own concern. This made me realize that by limiting the channels to a fixed number, I was trying to shoehorn whatever it actually needs to describe the data into two ideas that get merged. Because of the strong adaptability of the channels this can work, but it isn't ideal. Ideally you let it go as wide in channels as it needs to; in fact, you really should let channels stack too.
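To make that concrete, here's a rough Python sketch of what I mean by a variable-width cell, with the final output as a plain sum of the channels. The names (Cell, Channel, evaluate) are made up for illustration; this isn't my actual code.

```python
class Channel:
    """One channel: some gene-built function of the features."""

    def __init__(self, fn):
        self.fn = fn

    def evaluate(self, features):
        return self.fn(features)


class Cell:
    """A cell holds as many or as few channels as evolution wants."""

    def __init__(self, channels):
        self.channels = channels

    def evaluate(self, features):
        # the final output is currently just the sum of channel outputs
        return sum(c.evaluate(features) for c in self.channels)
```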

I implemented the idea of random channel creation (and removal) and reworked the way genes are merged/created with that in mind. The results have not disappointed. It hasn't existed long, so I can't tell you how far it will go, but it does tend to get to a result faster than before.
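Something in this flavor, sketched against the toy Cell above. The mutation rates and the make_channel factory are hypothetical knobs for illustration, not values from my real implementation.

```python
import random

def mutate_channels(cell, make_channel, add_rate=0.1, remove_rate=0.1):
    # Returns a child cell whose channel list may have randomly grown
    # or shrunk, which is what keeps the width free to vary.
    channels = list(cell.channels)
    if random.random() < add_rate:
        channels.append(make_channel())  # widen the cell
    if random.random() < remove_rate and len(channels) > 1:
        channels.pop(random.randrange(len(channels)))  # narrow the cell
    return Cell(channels)
```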

I think there are 3 more major improvements to make. Right now, I'm still just taking the sum of the channels to get my final output. I think this can be improved by doing a least-squares fit of each channel's output against the expected result to find a coefficient for each channel. This isn't needed per se, but it will help get me where I'm going faster.
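Here's roughly what I mean, sketched with numpy. The channel outputs and expected results are made-up numbers just to show the fit; the weighted sum replaces the plain sum.

```python
import numpy as np

# C is an (n_samples x n_channels) matrix of channel outputs,
# y is the expected result for each sample (both invented here).
C = np.array([[0.2, 1.1],
              [0.4, 0.9],
              [0.6, 1.3]])
y = np.array([1.0, 1.2, 1.7])

# least squares finds one coefficient per channel
coeffs, *_ = np.linalg.lstsq(C, y, rcond=None)

# the final output becomes a weighted sum instead of a plain sum
prediction = C @ coeffs
```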

The 2nd improvement is to make it so there can be layers... layers get access to the previous channels in addition to the features and whatnot. Layers could be randomly added or removed, just like channels. If a layer references a previous channel that doesn't exist due to mutation, I would just return 0.
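A sketch of how that lookup-with-fallback could work; all names are illustrative, and each channel here is just a function of the features plus a lookup into the previous layer.

```python
def evaluate_layers(layers, features):
    # layers is a list of lists of channel functions; each channel takes
    # (features, prev), where prev(i) looks up channel i of the previous
    # layer and returns 0 if that channel no longer exists.
    prev_outputs = []
    for layer in layers:
        def prev(i, _prev=prev_outputs):
            return _prev[i] if 0 <= i < len(_prev) else 0
        prev_outputs = [channel(features, prev) for channel in layer]
    return sum(prev_outputs)

# tiny usage example: the second layer references channel 5, which
# doesn't exist, so that reference quietly evaluates to 0
layer1 = [lambda f, prev: f[0] * 2, lambda f, prev: f[1] + 1]
layer2 = [lambda f, prev: prev(0) + prev(5)]
print(evaluate_layers([layer1, layer2], [3.0, 4.0]))  # prints 6.0
```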

The 3rd improvement is to add some system of reinforcement. Right now I do breed better scorers more often, but I think that isn't enough. I think some system needs to be devised that eliminates underperforming genes when breeding. This is really tricky, of course, because who can say there isn't some other impact? Essentially, which genes are the good ones? I think some sort of heuristic needs to be added to each gene to track how well it has worked: basically a weight and a score. If a gene is copied unmodified, its weight goes up by 1 and the average score of the whole cell is folded into its score. If a copy is made and some change happens to the gene, or if the gene is new, its data is set to the score and weight of just that particular cell (no average history). When deciding which gene to take when breeding two cells, the odds would reflect the two average scores, or possibly just the current scores. I don't know how well this will really work in practice, but if not this... something.
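If I were to sketch the bookkeeping, it might look something like this. The running-average update and the score-ratio odds are my best reading of the idea, not settled code, and all names are hypothetical.

```python
import random

class Gene:
    def __init__(self, cell_score):
        # a brand-new or just-modified gene starts from only the
        # current cell's score, with no average history
        self.weight = 1
        self.score = cell_score

    def copied_unmodified(self, cell_score):
        # fold the new cell's average score into the running average
        self.score = (self.score * self.weight + cell_score) / (self.weight + 1)
        self.weight += 1

def pick_gene(a, b):
    # when breeding, favor the gene with the better average score
    # (assumes higher scores are better and scores are non-negative)
    total = a.score + b.score
    if total <= 0:
        return random.choice([a, b])
    return a if random.random() < a.score / total else b
```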