I Write Like
Jan. 15th, 2013 06:29 am
According to I Write Like...
- "Wings Dancing in the Darkness" reads like Margaret Atwood
- "Every Little Thing" reads like Chuck Palahniuk
- "Delicately, Madly" reads like Charles Dickens
- "White Like Bone" reads like Anne Rice
- "Pyre" reads like Raymond Chandler
- "Dog in the Vineyard" reads like Dan Brown (...yuck)
- "Crush" reads like Chuck Palahniuk
- annnnd "Remnants of Restoration" reads like Kurt Vonnegut
...but then I discovered the source code for IWL is available online (eee) so I decided to poke at its innards for a bit and see what's what
- in a sort of cute move they decided to write this in some hipster language i've barely heard of
- ...okay what kind of programming language does not have a simple "make install" command and instead gives me some bullshit GUI and forces me to manually set my path geez
- ...oh fuck i overwrote my path, welcome to n00b mistake of the night, oh fuck ls and vim are not working did i just break bash
- crisis averted (but that was the most terrifying handful of minutes in my life)
- uh okay evidently hitting "Analyze" on my local instance gets me a page that says "not found" that seems sort of useless
- mm i love the feeling of adding my first expletive to the code (404 errors are much more attractive as "the fuck")
- uh okay there's a bug somewhere in dispatch-rules what's that about
- bluuuh this is hard to fix without an actual debugger but the racket documentation's pretty vague about how i might use such a thing via the command line
- oh interesting, evidently there's a compatibility issue between Racket 5.3.1 (which I was trying to use) and Racket 5.1, which was causing my instance to Not Work TM. there's a known compatibility issue between 5.0 and 5.1 but nothing online about this issue; I'll file a bug report and maybe look into it in the morning
Once I had a local instance running, I decided to do some experiments for teh lulz (and perhaps tangentially teh science).
I cleaned out the authors included with the IWL download and used some fanfic authors instead: arbitrarily I chose myself, amielleon, and mark_asphodel (hello, unwitting volunteers! :D;;; ). I used the three latest fics by these three authors for training data, then took a few of the other works by each author to see how accurately IWL could guess the true author of a work:
- Lua's stuff: IWL incorrectly thinks that Mark wrote "White Like Bone," "Dog in the Vineyard," and chapter 1 of Remnants of Restoration. It correctly thinks I wrote "Pyre" and "Crush." (Accuracy: 2/5)
- Ammie's stuff: IWL incorrectly thinks I wrote "lucius listens to the rain," "Ghost Stories," and "a visitor at any hour." It thinks Mark wrote "Coin in Palm" and "New World." It correctly thinks Ammie wrote "In Questioning Ghosts." (Accuracy: 1/6)
- Mark's stuff: IWL correctly thinks that Mark wrote "Gold for Salt," "Blackout," "The Losing End," and "In Transition." It thinks I wrote "Without Vocation." (Accuracy: 4/5)
...okay wow, based on that data, IWL seems to suck. Badly. As in, a-random-number-generator-could-do-a-better-job-for-anyone-not-named-Mark¹.
Time to look at the code and see what methodology is at play...
- Analysis seems to be based on both "tokens" and "readability"
- The readability metric is just the Flesch Reading Ease score, which has been discussed here before as being a somewhat problematic and inconsistent metric
- The "tokens" part is less clear to me on this quick skim, but what I'm pretty sure is going on is: they're basically building a giant table of words appearing in the text plus their frequencies, and from that they calculate a "rating" based on how the relative probability of those words is distributed (i.e. if authors A and B both use the words "obnoxious" and "teetotaler" a lot, the algorithm will notice that and treat A and B as more similar); rough sketch of the general idea below
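For the curious, here's a rough Python sketch of the kind of scheme I think that adds up to. To be clear, this is my guess at the general shape, not IWL's actual Racket code: the tokenizer, the add-one smoothing, and the separation of the token score from the readability score are all assumptions on my part.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Crude lowercased word tokenizer; whatever IWL really does is presumably fancier.
    return re.findall(r"[a-z']+", text.lower())

def count_syllables(word):
    # Very rough: count runs of vowels, minimum of one syllable per word.
    return max(1, len(re.findall(r"[aeiouy]+", word)))

def flesch_reading_ease(text):
    # Standard formula: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words)
    words = tokenize(text)
    if not words:
        return 0.0
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))

def train(samples_by_author):
    # One big word-frequency table per author, built from that author's training fics.
    return {author: Counter(tokenize(" ".join(texts)))
            for author, texts in samples_by_author.items()}

def token_score(text, table, vocab_size):
    # Sum of add-one-smoothed log word probabilities under one author's table;
    # the author whose table makes the text most probable "wins."
    total = sum(table.values())
    return sum(math.log((table[word] + 1.0) / (total + vocab_size))
               for word in tokenize(text))

def guess_author(text, tables):
    vocab = set()
    for table in tables.values():
        vocab.update(table)
    return max(tables, key=lambda author: token_score(text, tables[author], len(vocab)))

# e.g. tables = train({"lua": [fic1, fic2, fic3], "ammie": [...], "mark": [...]})
#      guess_author(new_fic_text, tables)
# How the readability score actually gets folded in, I haven't worked out yet.
```

If it really is something in this neighborhood, three fics per author is a pretty small frequency table to be matching against, which would explain some of the noise.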
¹ It is probably worth noting that the fics used for Ammie's training set might've skewed her results; "Benefits" and "In the City" are perhaps not the most representative samples from her corpus. Whups.
no subject
Date: 2013-01-15 04:12 pm (UTC)
Okay but even if you had a good algorithm, I think my corpus (or at least, the way you've chosen it) may be inherently problematic. While I don't deny having a "general" voice and some very strong generalities in terms of theme, I'm fond of deliberately using slightly different voices in different pieces. A Terrycloth Mother is probably closest to what I consider some kind of "standard serious," though I haven't actually used the "standard serious" voice much aha. lucius, Coin in Palm, and New World I'd classify as "whimsical." And the rest are pretty much separate categories unto themselves, unless you dive into my fic not posted there at FFN. (visitor is exceptionally and markedly different.)
But, theoretically, if you were trying to build an algorithm that could identify the author of a piece even when the author were trying to consciously disguise it, I would be an excellent test.
Also, shouldn't accuracy also account for false positives? That Mark result looks much less impressive when you consider that it gave Mark 4 false positives. Before I realized that, I was tempted to chalk up Mark's unusually high accuracy to the fact that she uses a very similar writing style throughout her corpus... though granted, it still does better with her than with either of us.
Incidentally, if it's a matter of reading score and frequencies I suspect it might do okay with
tl;dr yeah it's just one of those "fun waste of time" things.
no subject
Date: 2013-01-15 06:05 pm (UTC)
But I was always curious as to how it worked. The "male/female" test that goes around every once in a while is at least upfront about it. (My writing usually comes up "masculine" -- and the "feminine" words are typically relationship-focused rather than environmental. Did not like that.)
But yeah. It's cool that you were able to rig that up like that! And the results are interesting, if not especially meaningful.
no subject
Date: 2013-01-15 07:08 pm (UTC)
also, that male/female test one made me super-happy because when I got curious about how it worked, not only was there a pretty clear methodology, but the dude posted his master's thesis which was related to the topic and then I spent the afternoon trapped in academic CS papers /dork
no subject
Date: 2013-01-15 07:04 pm (UTC)
incidentally & interestingly, identifying an author that's deliberately trying to disguise their writing style is a problem that's known to be pretty damn difficult (and here's a semi-related blog entry just because I find it interesting :P )
and yeah, I should've mentioned false positives for accuracy, herp
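for the record, here's the tally I should have done, as a quick Python sketch; the fic-to-guess mapping is just transcribed from the lists in the post, so nothing here comes out of IWL itself:

```python
from collections import Counter

# (true author, IWL's guess) for each test fic, copied from the lists above
results = {
    "White Like Bone":               ("Lua",   "Mark"),
    "Dog in the Vineyard":           ("Lua",   "Mark"),
    "Remnants of Restoration ch. 1": ("Lua",   "Mark"),
    "Pyre":                          ("Lua",   "Lua"),
    "Crush":                         ("Lua",   "Lua"),
    "lucius listens to the rain":    ("Ammie", "Lua"),
    "Ghost Stories":                 ("Ammie", "Lua"),
    "a visitor at any hour":         ("Ammie", "Lua"),
    "Coin in Palm":                  ("Ammie", "Mark"),
    "New World":                     ("Ammie", "Mark"),
    "In Questioning Ghosts":         ("Ammie", "Ammie"),
    "Gold for Salt":                 ("Mark",  "Mark"),
    "Blackout":                      ("Mark",  "Mark"),
    "The Losing End":                ("Mark",  "Mark"),
    "In Transition":                 ("Mark",  "Mark"),
    "Without Vocation":              ("Mark",  "Lua"),
}

hits = Counter()             # correct guesses, per author
false_positives = Counter()  # times an author got credited with someone else's fic
totals = Counter()           # fics each author actually wrote

for true_author, guess in results.values():
    totals[true_author] += 1
    if guess == true_author:
        hits[guess] += 1
    else:
        false_positives[guess] += 1

for author in sorted(totals):
    print("{}: {}/{} correct, {} false positives".format(
        author, hits[author], totals[author], false_positives[author]))
```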
(another slight consideration/flaw I noticed in the original program's training data that I failed to mention before: they had like 50 different authors, which is an awful lot of bins. say everything is scored from 1 to 100 and you have a separate bin for each number, but suppose even a consistent author tends to write in the 30-35 range—they're going to get wildly inconsistent results even though their scores tend to be clustering around the same value.)
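a toy illustration of that last point, with completely made-up numbers (this has nothing to do with IWL's real scoring, it's just the bin problem in miniature):

```python
import random

random.seed(1)

# Pretend every one-point-wide bin from 1 to 100 is its own "author."
def assign_bin(score):
    return int(score)  # e.g. anything in [32, 33) lands in bin 32

# A perfectly consistent writer whose scores still wobble within 30-35:
scores = [random.uniform(30, 35) for _ in range(10)]
print([round(s, 1) for s in scores])    # values clustered around the same place...
print([assign_bin(s) for s in scores])  # ...but scattered across half a dozen bins
```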
no subject
Date: 2013-01-15 07:31 pm (UTC)
Also will you be up for tinkering with this later? It seems like such a cute toy once it's customizable.
Hell if it's open source it'd be a lot of fun to throw up a customizable mirror.
no subject
Date: 2013-01-15 08:47 pm (UTC)
letting users input their own training data would be a bit more obnoxious, esp. since I'm not familiar with the language or framework used in this project, but I'd be curious to try doing it anyway, esp. if there's interest in it
no subject
Date: 2013-01-16 11:54 pm (UTC)
I'd go with Ammie that my recent body of work is pretty much of a type (same 'verse, mostly the same characters, mostly low-key drama) and the consistent results would make sense... except for the false positives. Wow.
no subject
Date: 2013-01-17 12:01 pm (UTC)