Dim Red Glow

A blog about data mining, games, stocks and adventures.

i just can't stop working on genetic algorithm stuff

Hello hello hello. I don't even know where to begin. I guess with some short stories: the improvements to the genetic algorithm have been steady and successful. I do still struggle some with bias. I also struggle with speed. Now a little more on each.

For bias, I'm pretty certain my strategy of scalable scoring (or whatever I'm calling it) is the way to go. That is, regardless of the size of the sample, the sample scores the same. The more samples of varied size you score, the more accurate the score is (and the more certain you can be of it). Basically, you use the worst score from the group of samples. You should always include a full 100% sample as well as the various sub-samples, but I run that one last, and only if the sub-samples indicate the result still qualifies. In fact, I quit any time the score drops below a cutoff. This actually makes it faster on the whole too.
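Here's a rough sketch of that idea in Python, just to make it concrete. It is not my actual code: the function name `scalable_score`, the sub-sample fractions, the cutoff value and the "higher score is better" convention are all assumptions made up for the example.

```python
import random

def scalable_score(rows, score_fn, fractions=(0.25, 0.5, 0.75), cutoff=0.5, seed=0):
    """Worst score across sub-samples plus the full sample, or None if it fails.

    Assumes higher scores are better. `fractions` and `cutoff` are
    illustrative values, not tuned ones.
    """
    rng = random.Random(seed)
    worst = float("inf")
    for frac in fractions:
        k = max(1, int(len(rows) * frac))
        sample = rng.sample(rows, k)      # random sub-sample without replacement
        worst = min(worst, score_fn(sample))
        if worst < cutoff:
            return None                   # quit early: no longer qualifying
    # The full 100% sample runs last, and only if every sub-sample qualified.
    worst = min(worst, score_fn(rows))
    return worst if worst >= cutoff else None
```

A candidate that fails on the first small sub-sample never touches the larger ones, which is where the speed-up comes from.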

For speed, I've managed to get some sample GPU code to work, which is great! But alas, I haven't found time to write the client/server code and implement a distributed version of the genetic stuff. I will, I just need more weeks/weekends. This will hopefully give me something like a 50-1000x boost in processing power.

All this work has been on https://www.kaggle.com/c/nomad2018-predict-transparent-conductors , which is rather ideal for my purposes here. You can read the in-depth goings-on here: https://www.kaggle.com/c/nomad2018-predict-transparent-conductors/discussion/46239 . I'm really hoping I end up in the top 10 before it's all over.

There also happens to be a new contest, https://www.kaggle.com/c/data-science-bowl-2018 , which needs image analysis to be completed... however! I think this might also be a contender for the genetic algorithm, though maybe a different version. I could certainly load the images into the database and let the genetic algorithm figure out what is what... but I think there might be a better way, or at least a more fun one: design a creature (yes, creature) that can move over the board and adjust its shape/size, and when it thinks it's found a cell, it sends the mask back for a yes/no. After, I dunno, 1000 tries it is scored, and we make a new creature that does the same thing... breed winners, etc. etc. Or we could go the reinforcement route: whenever it sends back a mistake we tell it "bad", and when it sends back a success we tell it "good". That way there would be only 1 organism, and the learning would happen at a logic level inside of it instead of through new versions of itself over and over again. I haven't decided which I'll do, but I think it's probably something I'd get a kick out of writing.
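To give a feel for the "breed winners" version, here's a toy sketch in Python. Everything in it is made up for illustration: the board is a small grid of 0/1 pixels, a creature is just a `(step, size)` pair that wanders randomly and proposes square masks of a fixed size (a real one would adjust its shape as it goes), and `judge` is a stand-in yes/no oracle that says yes when at least 80% of the mask is cell pixels. The reinforcement route isn't shown.

```python
import random

def judge(board, x, y, size):
    """Hypothetical yes/no oracle: True if >= 80% of the proposed square is cell."""
    h, w = len(board), len(board[0])
    cells = [board[j][i]
             for j in range(y, min(y + size, h))
             for i in range(x, min(x + size, w))]
    return bool(cells) and sum(cells) / len(cells) >= 0.8

def run_creature(genome, board, tries, rng):
    """Let one creature wander and propose masks; score = number of 'yes' answers."""
    step, size = genome
    h, w = len(board), len(board[0])
    x, y, hits = 0, 0, 0
    for _ in range(tries):
        x = (x + rng.randint(-step, step)) % w   # wrap around the board edges
        y = (y + rng.randint(-step, step)) % h
        hits += judge(board, x, y, size)
    return hits

def breed(a, b, rng):
    """Child takes each gene from one parent, then mutates by at most 1 (min 1)."""
    return tuple(max(1, rng.choice(pair) + rng.randint(-1, 1)) for pair in zip(a, b))

def evolve(board, generations=10, population=20, tries=1000, seed=0):
    """Score every creature, keep the top half, refill with their children."""
    rng = random.Random(seed)
    pop = [(rng.randint(1, 5), rng.randint(1, 5)) for _ in range(population)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda g: run_creature(g, board, tries, rng),
                        reverse=True)
        winners = ranked[: population // 2]
        children = [breed(rng.choice(winners), rng.choice(winners), rng)
                    for _ in range(population - len(winners))]
        pop = winners + children
    return pop[0]   # top scorer from the last generation that was evaluated
```

Scoring here only counts "yes" answers, so nothing stops a creature from finding the same cell over and over. A real version would need to reward distinct cells.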