Final Position 100!
6/16/2012
Guess what? :) (Don't play games. It's too early on a Saturday to play games, it's barely past 11:30. In fact, you know what, screw this, I'm going to lie on the couch.) Er... what? When they scored the final results on the full set of data I moved up in the ranks, like way up! :) My final position was 100 (of 700 teams); before the final scoring I was around 170-180... still good, but wow, what a jump! My score (and others') improved a ton. The scores we were previously ranked on represent 1/4 of the final scores... when we are scored against everything... well, what an increase in performance. The final winning score was 0.37356 and my final score was 0.38932... not too shabby! Not as good as winning, but not bad either! I have an overall rank now too! 1100 :D. You can go have a look here: https://www.kaggle.com/users/27377/j-scheibel#profile-results.
One other quick thing: if you play my game Tumulus of Fen, either the light or normal version, there is a patch out. Apparently there was an issue with random people having the game crash when it started. Not sure why that would happen all of a sudden. So I rebuilt it with the latest MonoTouch libraries and set the required OS back to 3.0. Hopefully that fixes things, since I've never been able to reproduce the problem. I really need to get crackin' on a new game. Did I ever mention how a day job gets in the way of so many things?
You think this over? We are only just beginning
6/15/2012
I take it by the header you didn't win... Nope (loser!), technically it's not over till 7 CST tonight, and I'll do one more submission, but I won't be changing much so it's not gonna win. It's more to see how the algorithm performs in a certain scenario, but... (here come the excuses) I know my next step! (too late!) My next step is to implement probability-based estimates at the leaves of my tree instead of simple positive or negative indicators, and then to apply probability scaling both to the tree and to the result. I have no idea how much better that will make the forest, but the few things I've seen indicate that it is the next logical improvement. I also don't know if my performance is all that good, considering my final algorithm is little more than a really good plain vanilla random forest. I mean, I improved on the splitter, found a good way to use all the training data during each classifier build, and found a good way to get a relatively accurate weight (better than standard OOB, it seems). I mean, it seems good... .44108 compared to a base of around .46182... and it should work that way out of the box every time on any data set. And I made my program lightning fast. So to me that seems like something. But I dunno, I was a long, long way from being number 1 (that's the spirit).
So, PET (probability estimation tree) and scaling? Yep! :) First one, then the other. I'll pick a new contest to start working on this weekend so I can get a real-world idea of how well it's going, and then start trying to implement the stuff. I might actually take a day off from in front of the computer too! (crazy talk!)
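For the curious, here is a minimal sketch of the difference between a hard positive/negative leaf and a probability-estimating leaf. It assumes Laplace smoothing, which is one common way to build a PET and not necessarily what I'll end up using; the function names and the `smoothing` parameter are made up for illustration.

```python
def hard_leaf(positives, total):
    """Plain classification leaf: answers 1 or 0 by majority, nothing in between."""
    return 1 if positives * 2 > total else 0


def leaf_probability(positives, total, smoothing=1.0):
    """Probability-estimating leaf (assumed Laplace smoothing).

    Returns (positives + s) / (total + 2s), so a leaf that saw only a
    handful of rows never claims to be 100% or 0% sure.
    """
    return (positives + smoothing) / (total + 2.0 * smoothing)
```

With this sketch, a leaf that saw 3 positive rows out of 3 answers 0.8 instead of a flat 1, and an empty leaf answers 0.5.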
I could think of things I never thunk before. And then I'd sit and think some more....
6/14/2012
We join our narrator already in the process of hitting his head on a wall, one hit for each syllable. (play along at home, kids) (don't) MUST ... FIND ... BET ... TER ... WAY ... TO ... CLASS ... IF ... Y ... CRAP.
Ya... no improvements. (only 1.5 days left! Get crackin'!) (that's 3 more submissions for those playing the home game) What really gets me is the amount of room for improvement and the ever so small separation between the top competitors and me. In terms of score, that is. (but Jamie, it seems like a 10% difference) Ya, well, don't let the numbers fool you. It's not. In some ways it's much more, in others it's much less. (less, you say?) Yes. I can't remember if I explained this before, but I will explain real quick to be clear on the small difference between one score and the next.
The scoring system is very simple. Each prediction for a row is between 0 and 1. There is of course a correct answer (0 or 1) that goes with that row. So to score it, the prediction is multiplied by the correct answer, and that is added to the inverse of the prediction multiplied by the inverse of the answer. Huh? Like so: Y = Answer * prediction + (1-Answer) * (1-prediction). The effect is that if you guess 0 and the answer is 0, the result is 1. If you guess 1 and the answer is 1, the result is also 1. If you guess 0.5, the result is 0.5 either way. Basically, it is built to adjust for which side the answer is on and give the same result for a positive response as it would for a negative response. If you guess .25 and the answer is 0 you get .75; if the answer is 1 it would be .25.
So why is it called log loss? Well, that result is only part of it. You take the computed result there, get the (natural) logarithm of it, and then negate it. You negate it because they want positive results, and the log of any value between 0 and 1 is negative. So if you are right and the result is 1... -log of 1 is 0, so the score is zero (the best score you can have). If you are completely wrong and you get a result of 0... -log of 0 is infinity. So 1 wrong guess destroys your results entirely? Not really, they just use a large number instead of infinity, but technically yes, it should. If my program ever predicts a 1 or a 0, just to be safe I move it to .9999 or .0001, which, if wrong, scores around 9.2... (a heck of a lot better than infinity) All of the results are averaged and that is your final score. That's it!
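Here is a small sketch of that scoring in Python. It is my own illustration of the formula as described, not the contest's actual scoring code; the function names and the `eps` parameter are made up, and the clipping to .0001/.9999 mirrors the safety margin mentioned above.

```python
import math


def row_result(answer, prediction):
    """Y = Answer * prediction + (1 - Answer) * (1 - prediction)."""
    return answer * prediction + (1 - answer) * (1 - prediction)


def log_loss(answers, predictions, eps=0.0001):
    """Average of -ln(Y) over all rows.

    Predictions are clipped to [eps, 1 - eps] first so that a single
    completely wrong 0 or 1 cannot push the score to infinity.
    """
    total = 0.0
    for answer, prediction in zip(answers, predictions):
        clipped = min(max(prediction, eps), 1.0 - eps)
        total += -math.log(row_result(answer, clipped))
    return total / len(answers)
```

For example, `row_result(0, 0.25)` gives 0.75, and `log_loss([1], [0.5])` gives about 0.6931.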
So the current best score is near 0.39 and you have a score near 0.44; is that a lot? Yes and no. No because, if you assume every result scored the same, 0.39 works backwards to an accuracy of 67.7% on each result and 0.44 to 64.4%... So, not much difference there. And tons of room for improvement on both! The problem is, that's not the case! The wrong answers are what get you. That's the real problem. A few wrong answers can turn a great set of results into crap, which is why it's so hard to improve. You could have 95% accuracy overall, but if the 5% you got wrong are confident misses (a 0.0001 where the answer was 1 scores about 9.2), those alone put your overall score around 0.46! So, ya... it's better to guess 0.5 (which scores 0.6931) (69!) if you don't have a clue what the answer is; guessing wrong is penalized very heavily. All that being said, it's time for me to go back to staring at the screen and wondering how I can improve my results overall by 5% :) (sounds easy!) (hah!)
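The arithmetic in that paragraph can be checked with a few lines of Python. This is only a sketch: the function names are mine, and it assumes natural logs and that every confident guess is made at .9999/.0001.

```python
import math


def equivalent_probability(score):
    """Work backwards from a log-loss score, assuming every row scored the same."""
    return math.exp(-score)


def score_with_confident_misses(miss_rate, miss_prob=0.0001, hit_prob=0.9999):
    """Overall log loss when a fraction of rows are confidently wrong.

    miss_prob is the probability given to the correct answer on a wrong
    row, hit_prob the probability given to it on a right row.
    """
    miss_penalty = -math.log(miss_prob)
    hit_penalty = -math.log(hit_prob)
    return miss_rate * miss_penalty + (1.0 - miss_rate) * hit_penalty
```

`equivalent_probability(0.39)` comes out near 0.677 and `equivalent_probability(0.44)` near 0.644, while `score_with_confident_misses(0.05)` lands around 0.46. In other words, under these assumptions, being confidently wrong on only 5% of rows already costs more than my whole current score.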
Wait I just, just need... almost got it solved ... I just need to ........... I got nothing.
6/11/2012
Nearing the home stretch... Just over 3 days left and I think I'm out of mundane refinements that I could try on my program. (That only took 4 weeks :)) I managed to squeeze a little more score out of my program by basically rigorously testing every variation of any idea I could come up with. Things like how best to calculate percentage so-and-so. And what if I split this particular data in half... how about thirds... what if I make the program do a half pike somersault over the fire and land in the dried up lake bed... etc. (that sounds like it should work!) Needless to say, all I've done is refine and refine. The ideas about filtering by types of groups and subsets were more or less a bust. Well, except for one variation that I do use. Using a filter based on training data, I can predict accuracy slightly better than with an OOB test set. That is where I squeezed out the few extra points, but it wasn't a groundbreaking discovery, it was just a slight improvement.
So just a few more points on the ladder, nothing else worth noting? Well, one other thing I did today: I made the program way... I mean WAY faster. Turns out creating tens of thousands of objects is really, really slow. And I mean more than the overhead of using objects in the first place. Something odd is going on in the underpinnings of the computer. I'm not sure what CPU overhead is involved, but the program was starting to bottleneck at around 50-60% CPU usage. That is, all 8 cores sitting there not willing to go any faster no matter how many threads I threw at them. Just to test that it was in fact the objects, I made a class that had no data or methods. I then commented out the entire program other than a single loop in each thread that did nothing but create an object over and over again, to see the problem first hand. If I removed the object and, say, created an array instead, I'd peg all of the CPUs. If I used the object, it would float at around 50-60%. So clearly the CPU is trying to do something with each object at creation, and whatever it is counts as idle time. Needless to say, I went on a tear (rarr!) and rewrote all my objects to be structures, and any function that got called over and over and needed objects was made to use a global that never gets recreated. The program is night and day faster. A test that took 9 hours to run can now be run in about 15 minutes. Granted, a lot of that is from the use of primitives over objects, but some of it is from the fact that once again I can use all my CPU.
So that's it for now. Hopefully something will change in the next 2 days that will be awesome news! If not, there is always the next contest *grin*. (no comment)
Apparently Ernie was right about rearranging the cookies and still having the same number of cookies.
6/8/2012
Data mining, where do we stand!? 1 week left! I've been trying rather diligently to filter and slice and dice the data so that a voter in the random forest can only vote on entries that voter knows anything about. This still seems like the most logical improvement to me right now. (have you considered hiring monkeys?) But it has not gone well. I haven't made any improvements in some time now. Generally, I approach my best but never pass it. I did manage to get my old grouping method to work using an OOB mechanism, but while the accuracy was decent (75%) the log-loss was terrrrrrrrible (.60). Basically, unusable.
I was hoping I could use one method with the other, since they work in fundamentally different ways. Basically, using one as validation for the other and modifying accuracy accordingly. But it hasn't worked out well, at least not as a direct comparison of two predictions on the same data; the results are always poorer. I expect that has something to do with the log-loss score of my grouping method. I did manage to get a small "improvement", better than what I had been doing but not quite better than when I split the data down both sides of the tree, by using a filter based on the classifier tree to help determine whether the tree had a valid opinion/vote. I also found a way to merge two statistics using said filters to give a statistic better than the two children (most of the time anyway), much like voting does but using those filters in tandem. These ideas, or variants thereof, I think have the most potential for improvement overall.
Basically, both ideas hinge on the same thing. One method weights an individual result, the other a group of results. In layman's terms, what I'm doing is asking the person voting (he just turned a classifier tree into a person!) how good they expect their vote to be based on what they learned, NOT based on the OOB data. Though that is figured in when they do vote. I just need to get better at doing it. I'm thinking I might need to introduce a multivariate normal curve to do that properly. That is definitely something I'm really looking forward to. (read: lot of work, not sure it will pan out when it's done) The weekend lies ahead. The last weekend before this contest ends. So one way or the other, this level of focus will end soon!
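As a rough picture of "only voters who know something get a say", here is a tiny weighted-vote sketch. It is not my actual forest code: the `(probability, weight)` pairs, the function name and the 0.5 fallback are all assumptions made for the example, and how each tree comes up with its weight is exactly the hard part it leaves out.

```python
def weighted_vote(votes, default=0.5):
    """Combine per-tree votes into one prediction.

    votes is a list of (probability, weight) pairs, where weight is how
    much a tree trusts its own vote for this row. A weight of 0 means
    the tree abstains. If nobody has an opinion, fall back to `default`.
    """
    total_weight = sum(weight for _, weight in votes)
    if total_weight == 0:
        return default
    return sum(prob * weight for prob, weight in votes) / total_weight
```

For instance, `weighted_vote([(0.9, 2.0), (0.1, 0.0), (0.6, 1.0)])` ignores the middle tree and comes out at 0.8.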
smoke 'em if ya got 'em
6/5/2012
Two blogs back to back?! Crazy!! Heh, so today something real quick. The contest that has been motivating me to a near frenzy of focus and programming ends in a week and a few days. I imagine at the end of that contest I'll be more human and talk about other things... (we all know this isn't true, hehe) But in the meantime, work continues. Also, I need to say that the super simple classifier I was working on had a bug in it, the worst kind of bug... the kind that means it was using all the data to train on :( (it happens) (never happens to me) (liar!) (nope, perfect code, every time) By training on OOB data I was building on data I shouldn't have known. As soon as I removed the extra data, the performance dropped radically, down to like 55%. Basically that idea is a bust and I need to go back to the drawing board for a better classifier.
Hmmm... On that note, it's interesting how thoughts change. I was just looking back on my comments on bias and accuracy from 3 weeks back. I tried to explain that they are different. It's worth saying that when it comes to out-of-bag data I think they are exactly the same, or very close. But that's not necessarily the case in general, as far as I can tell. I just wanted to throw that out there first; I'll come back to it in a moment.
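For anyone wondering what "training on OOB data" means, here is a minimal sketch of a bootstrap split. The function name is made up and this is not my program's code, just the standard idea: each tree trains on rows drawn with replacement, and the rows it never drew are its out-of-bag set, which is only honest test data if the tree never sees it during training.

```python
import random


def bootstrap_split(n_rows, rng):
    """Return (in_bag, out_of_bag) row indices for one tree.

    in_bag is drawn with replacement, so it has n_rows entries with
    repeats; out_of_bag is every row index that was never drawn.
    """
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_rows) if i not in drawn]
    return in_bag, out_of_bag


in_bag, out_of_bag = bootstrap_split(100, random.Random(0))
# Train only on in_bag; score only on out_of_bag. Mixing them was the bug.
```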
So for tonight's development? I have 2 new ideas to try tonight. One involves eliminating attributes/variables/features (call them what you will, they are all the same thing to me) that do not contribute to the result, either because they act as noise or because they never vary. There has been a lot of work on this and it seems like a good source of improvement. That is, IF (here we go) my improved splitter and attribute selection mechanism isn't more or less already doing this (6 of one, half dozen of the other). It isn't directly; I wrote it with other thoughts in mind, but it has a weighting system for picking the appropriate attribute to use, so it may in effect already be cutting those bad attributes out. I think attributes with a lot of noise are possibly still in there, so I'll be interested to see if those can be filtered out to improve the result.
And the other idea? The other idea involves a new technique for identifying the effectiveness of a particular tree. I was talking about that yesterday with regards to the simple classifier, but the ideas I tried last night just didn't work. Hopefully this new method will be better at it. If it does work, the idea is to use it as a filter, then measure accuracy on the data that gets through the filter. If the filter works, it improves the accuracy of what is measured. And here is the bias thing I mentioned earlier. The bias is still there, but now a filter is in essence removing some of the cases that don't fit the model's bias (if not all of them!) (dream on), and the net effect is increased accuracy! The bias of any given tree will always remain the same; you just need to find the right cases to use it on. Incidentally, if I can remove attributes that are noisy, that should actually help both bias and accuracy.
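The "never vary" half of that is the easy part, and a sketch of it fits in a few lines. This assumes the data is a plain list of rows (lists of values); the function names are made up, and it does nothing about noisy attributes, which are the harder and more interesting case.

```python
def constant_columns(rows):
    """Indices of attributes that have the same value in every row."""
    if not rows:
        return []
    n_columns = len(rows[0])
    return [c for c in range(n_columns)
            if len({row[c] for row in rows}) == 1]


def drop_columns(rows, to_drop):
    """Copy of rows without the given attribute indices."""
    drop = set(to_drop)
    return [[value for c, value in enumerate(row) if c not in drop]
            for row in rows]
```

On `[[1, 5, 0], [2, 5, 0], [3, 5, 1]]` the middle attribute is always 5, so `constant_columns` reports index 1 and `drop_columns` leaves two attributes per row.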
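To make the filter idea concrete, here is a small sketch of measuring accuracy only on the rows a filter lets through. The names and the list-of-booleans filter are assumptions for illustration; how a tree decides which rows pass is the part I still have to figure out.

```python
def filtered_accuracy(predictions, answers, passes_filter):
    """Accuracy on rows that pass the filter, plus the share of rows kept.

    predictions and answers are lists of 0/1 labels; passes_filter is a
    list of booleans saying whether the tree claims an opinion on a row.
    Returns (accuracy, coverage); accuracy is None if nothing passes.
    """
    kept = [(p, a) for p, a, keep in zip(predictions, answers, passes_filter)
            if keep]
    if not kept:
        return None, 0.0
    correct = sum(1 for p, a in kept if p == a)
    return correct / len(kept), len(kept) / len(answers)
```

A tree that is right on 3 of 4 rows overall looks perfect if the filter happens to drop its one wrong row, which is the whole point, but the coverage number shows how many rows it gave up to get there.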