queenlua: (Default)
i have no idea how widely this is being covered by traditional news sources, since my news feed has, uh, a very tech-heavy bias, but!

if you haven't seen it yet, google's alphago program just won the first two games of a five-game match, playing Go against Lee Sedol, one of the top Go players in the world.

this is actually a very cool, very big deal and y'all should take a peek at a bit of it while it's going on. it's totally the deep blue vs kasparov of our time.

i think the thing that's striking to me is how quickly this has happened. when i was studying AI in college, around 2011 or so, the impression my fellow students and i generally had was: Go is a really, really hard game for computers, for a number of reasons, and most of us weren't expecting it to be cracked anytime soon.

in particular, we can compare against chess. chess is a hard game, and deep blue was a huge accomplishment for its time, but:
  • chess has a definite win condition a computer can aim for (put the king in checkmate); go is scored by however many points each player has earned by the end of the game, and something like 80% of those points are "up in the air" until near the end most of the time (you "capture" stones as the game progresses, but most stones don't get captured til near the end)
  • chess boards are small, 8x8, and thus have a smaller number of possible game states; go boards are 19x19, which puts an insane multiplier on the number of possible board states
  • chess pieces start in a fixed position every game; go pieces are placed on the board one at a time as play progresses
  • there are nice heuristics for evaluating how strong/weak someone's position on a chess board is at any given time (i.e. count up the value of each player's pieces), whereas it can be really hard to tell who's winning a go game at any point
and so on.
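to put rough numbers on the board-size point: even the crudest possible upper bound (every point is empty, black, or white, legality be damned) shows the gap. quick back-of-the-envelope in python; the chess figure is a commonly cited order-of-magnitude estimate (the state-space estimates i've seen run around 10^43 to 10^47, depending who's counting), not something i computed:

```python
# crude upper bounds on board states: treat each point as empty/black/white.
# most of these aren't legal positions, but the scale of the gap is the point.
go_9x9_bound = 3 ** (9 * 9)       # same bound on a small 9x9 board
go_19x19_bound = 3 ** (19 * 19)   # the full-size board

print(len(str(go_9x9_bound)))     # 39 digits
print(len(str(go_19x19_bound)))   # 173 digits, vs ~47 digits for chess
```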

there were efforts at go AI while i was in college, of course, but most of them were only effective on much smaller boards, or against weaker opponents (players in the european go associations are noticeably weaker than asian go association players on average, etc), and leaned on a then-newfangled technique called monte carlo tree search. i haven't had a chance to read up on alphago yet, but the gist i've gotten is that it relies heavily on neural networks (an area of AI research that's exploded at a crazy rate in the past few years) layered on top of that same monte carlo tree search. i also understand that the techniques the computer used to learn go are surprisingly general; that is, it wasn't super-duper-hardcoded to work with only Go, and thus could theoretically be applied to a wide range of AI problems.
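for the curious: those "newfangled search trees" are monte carlo tree search, and the core loop is compact enough to sketch. here's plain UCT with random rollouts (no neural networks) on a toy take-1-to-3-stones game; this is just my illustration of the general technique, nothing resembling alphago's actual implementation:

```python
import math, random

# toy game: one pile of stones, players alternate taking 1-3 stones,
# and whoever takes the last stone wins.

class Node:
    def __init__(self, pile, parent=None, move=None):
        self.pile, self.parent, self.move = pile, parent, move
        self.children = []
        self.visits = 0
        self.wins = 0  # from the perspective of the player who just moved

def legal(pile):
    return [m for m in (1, 2, 3) if m <= pile]

def uct_child(node):
    # pick the child maximizing exploitation + exploration
    return max(node.children, key=lambda c: c.wins / c.visits
               + math.sqrt(2 * math.log(node.visits) / c.visits))

def rollout(pile):
    # random playout; True if the player to move from `pile` wins
    mover = True
    while pile > 0:
        pile -= random.choice(legal(pile))
        if pile == 0:
            return mover
        mover = not mover

def best_move(pile, iters=5000):
    root = Node(pile)
    for _ in range(iters):
        node = root
        # 1. selection: walk down fully-expanded nodes via UCT
        while node.pile > 0 and len(node.children) == len(legal(node.pile)):
            node = uct_child(node)
        # 2. expansion: add one untried move
        if node.pile > 0:
            tried = {c.move for c in node.children}
            m = random.choice([x for x in legal(node.pile) if x not in tried])
            child = Node(node.pile - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. simulation from the new node's state
        mover_wins = node.pile > 0 and rollout(node.pile)
        # 4. backpropagation, flipping perspective at each level up
        win = not mover_wins  # a win for the player who just moved into `node`
        while node is not None:
            node.visits += 1
            node.wins += win
            win = not win
            node = node.parent
    return max(root.children, key=lambda c: c.visits).move

random.seed(0)
print(best_move(10))  # optimal play here is to take 2, leaving a multiple of 4
```

alphago's twist, as i understand it, is replacing the random rollout/move-selection parts with trained neural networks; the tree machinery stays recognizably this.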

it's been a long time since i was especially interested in AI, but this is super-cool research; i'm planning to read the papers and such about it and maybe do a writeup of it later. in the meantime, yeah, everyone go check out some alphago matches; it'll be a good time.
According to I Write Like...
  • "Wings Dancing in the Darkness" reads like Margaret Atwood
  • "Every Little Thing" reads like Chuck Palahniuk
  • "Delicately, Madly" reads like Charles Dickens
  • "White Like Bone" reads like Anne Rice
  • "Pyre" reads like Raymond Chandler
  • "Dog in the Vineyard" reads like Dan Brown (...yuck)
  • "Crush" reads like Chuck Palahniuk
  • annnnd "Remnants of Restoration" reads like Kurt Vonnegut
Conclusion: either my writing is wildly inconsistent or the website's algorithm is, and I strongly suspected the latter...

...but then I discovered the source code for IWL is available online (eee) so I decided to poke at its innards for a bit and see what's what

Lua sets up a local instance and installs shit: the liveblog! (terribly boring do not read)

Once I had a local instance running, I decided to do some experiments for teh lulz (and perhaps tangentially teh science).

I cleaned out the authors included with the IWL download and used some fanfic authors instead: arbitrarily I chose myself, [personal profile] amielleon, and [personal profile] mark_asphodel (hello, unwitting volunteers! :D;;; ). I used the three latest fics by these three authors for training data, then took a few of the other works by each author to see how accurately IWL could guess the true author of a work:

Data!

...okay wow, based on that data, IWL seems to suck. Badly. As in, a-random-number-generator-could-do-a-better-job-for-anyone-not-named-Mark[1].

Time to look at the code and see what the methodology at play is...
  • Analysis seems to be based on both "tokens" and "readability"

  • The readability metric is just the Flesch Reading Ease score, which has been discussed here before as being a somewhat problematic and inconsistent metric

  • The tokens part is less clear to me on this quick skim, but I'm pretty sure what's going on is: they build a giant table of words appearing in the text plus their frequencies, and from that they calculate a "rating" based on how the relative probability of those words is distributed (i.e. if A and B both use the words "obnoxious" and "teetotaler" a lot, the algorithm will notice that and judge A and B to be more similar)
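The readability half is easy to sketch: the standard Flesch Reading Ease formula, here with a crude count-the-vowel-runs syllable heuristic (IWL's actual counting may well differ; this is just the textbook formula):

```python
import re

def count_syllables(word):
    # crude heuristic: each run of consecutive vowels counts as one syllable
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    total_words = max(1, len(words))
    total_syllables = sum(count_syllables(w) for w in words)
    # higher score = easier to read; ~90+ is "very easy", ~30 is "college"
    return (206.835
            - 1.015 * (total_words / sentences)
            - 84.6 * (total_syllables / total_words))

print(flesch_reading_ease("The cat sat on the mat."))  # ≈ 116.1, very easy
```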
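And the token half, as I understand it, amounts to something like this. To be clear, this is emphatically not IWL's actual code, just the shape of the idea: per-author word-frequency tables plus naive-bayes-style scoring (with add-one smoothing so unseen words don't zero everything out). All the sample texts and names here are made up:

```python
from collections import Counter
import math

def profile(texts):
    # word-frequency table for one author's training texts
    counts = Counter(w for t in texts for w in t.lower().split())
    return counts, sum(counts.values())

def score(counts, total, text, vocab_size):
    # summed log-probability of the text under this author's table,
    # with add-one smoothing for words the author never used
    return sum(math.log((counts[w] + 1) / (total + vocab_size))
               for w in text.lower().split())

authors = {
    "a": profile(["the obnoxious teetotaler scowled", "obnoxious weather again"]),
    "b": profile(["ships sailed the shining sea", "sea spray and salt"]),
}
vocab = len({w for counts, _ in authors.values() for w in counts})

def guess(text):
    return max(authors, key=lambda name: score(*authors[name], text, vocab))

print(guess("what an obnoxious teetotaler"))  # → a
print(guess("the sea was shining"))           # → b
```

If shared vocabulary is all the signal you've got, it's no surprise the thing mostly keys on distinctive pet words and not much else.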
...so yeah, while the metrics IWL uses are better than a random number generator, they're still pretty unrigorous/underwhelming (quite possibly by design—I know I've seen this website pop up in my friends' circles more than once, and it does make a fun little two-minute time-waster when you first stumble upon it—it doesn't really need to be The Greatest Algorithm Evar TM to accomplish that).

Footnote
