[personal profile] queenlua
whenever i'm editing a piece that i'm being somewhat-to-very tryhard about, i usually make an effort to read the piece aloud to myself. ideally i'd read the whole thing, slowly, audiobook-style, but more often i'm doing some mix of that + "just muttering passages quietly to myself." it's pretty good for catching the sorts of errors that the brain's too good at "filtering out" while reading (e.g. repeating a word, an awkward dialogue tag, etc).

but, i got curious the other night about the state of text-to-speech software, because hey, that's one of the few domains where "just throw more GPUs at it" does seem pretty useful, and i ran out of podcasts for this week's commute, and yeah i'm absolutely vain enough to make a computer audiobookify my own shit haha.

so, lo, here's the random software i decided to play with after a google search. cursory observations:

* these voices are pretty good. like, you're not going to mistake them for a human reader (in particular there's a weirdly "clipped" quality to the way they finish a lot of their sentences, and a sort of monotony/regularity to the way in which they do end-of-sentence/end-of-paragraph-type pauses that sounds distinctly unnatural when you listen for longer periods of time), and i'd certainly rather pay money for the human-read audiobook version of any narrative i actually wanted to enjoy (the lack of any attempt to create different "voices" for each character is a huge drawback), BUT, this is leagues better than the standard-accessibility-suite robo-voice i remember from 00s-era mac osx lol. reasonably pleasant, not too grating, totally works for "being forced to hear my own writing" purposes

* this particular software absolutely cannot handle italics, rip. admittedly this ends up serving as a good reminder that i should be using italics less anyway, but, y'know, sometimes i do need that extra emphasis!!!

* the "audiobook" is excellent for forcing me to notice "stupid" errors (repeated words etc), and i think it miiiight give me a better sense of pacing fuckups? in the sense that, if i've been staring at a wall of text for a while, it's hard for me to get a sense of where a reader might lose interest, whereas if these words are washing over me while i'm in some standstill traffic on the f$@&*ing bridge again, i'm getting a decent intuitive sense of "ok how long is this part going on for & is it actually interesting"

* (unfortunately if i'm listening while in some standstill traffic on the f$@&*ing bridge again, i can't exactly, uh, stop to take notes or fix the manuscript right there, so i'm relying on "just remembering what sounded off," but eh a little mental exercise is good for you)

* i certainly wouldn't want to use this as the only source of reading-aloud-ness since the computer-voice-guy makes some repeated "flow" choices that i just think are WRONG lol. for instance, i think the voice guy gives literally every comma about equal weight, which makes any standalone super-short demarked-by-a-comma phrase, like this one, sound REALLY awkward in a way that i think any ordinary human reading a passage will not find awkward.

* tellius inside baseball observations: this tool pronounces tibarn as "TYE-barn", pronounces reyson as "rey-SON", and pronounces naesala as "nae-SA-la," all of which i deem the WRONG way to pronounce those respective names lol. (i'm aware FE Heroes disagrees with me re: naesala, but that just means heroes is wrong too sorry!!!) also nikolias gets pronounced "neh-COE-li-as" which is ALSO wrong. and i made that one up so i'm objectively right here for sure lmao

further observations to be reported when/if they prove interesting

Date: 2024-08-13 08:39 pm (UTC)
From: [personal profile] kradeelav
somehow i knew this was going to be about Birb Manuscript before even clicking the readmore.

super fascinating!!! having used dragon naturallyspeaking once or twice back in the 00's as you mentioned, it's awesome hearing that it's improved since then. (tbh automatic captions/translations are another area where i've seen a marked improvement over the decades - i recently participated in a meeting with only mexican-spanish speakers and understood 95% of it right off of pure captioning, it was awesome, 10/10 would be a fly on the wall again.)

cheering for u in the home stretch there!
