playing around with text-to-speech
whenever i'm editing a piece that i'm being somewhat-to-very tryhard about, i usually make an effort to read the piece aloud to myself. ideally i'd read the whole thing, slowly, audiobook-style, but more often i'm doing some mix of that + "just muttering passages quietly to myself." it's pretty good for catching the sorts of errors that the brain's too good at "filtering out" while reading (e.g. repeating a word, an awkward dialogue tag, etc).
but, i got curious the other night about the state of text-to-speech software, because hey, that's one of the few domains where "just throw more GPUs at it" does seem pretty useful, and i ran out of podcasts for this week's commute, and yeah i'm absolutely vain enough to make a computer audiobookify my own shit haha.
so, lo, here's the random software i decided to play with after a google search. cursory observations:
* these voices are pretty good. like, you're not going to mistake them for a human reader (in particular there's a weirdly "clipped" quality to the way they finish a lot of their sentences, and a sort of monotony/regularity to the way in which they do end-of-sentence/end-of-paragraph-type pauses that sounds distinctly unnatural when you listen for longer periods of time), and i'd certainly rather pay money for the human-read audiobook version of any narrative i actually wanted to enjoy (the lack of any attempt to create different "voices" for each character is a huge drawback), BUT, this is leagues better than the standard-accessibility-suite robo-voice i remember from 00s-era mac osx lol. reasonably pleasant, not too grating, totally works for "being forced to hear my own writing" purposes
* this particular software absolutely cannot handle italics, rip. admittedly this ends up serving as a good reminder that i should be using italics less anyway, but, y'know, sometimes i do need that extra emphasis!!!
* the "audiobook" is excellent for forcing me to notice "stupid" errors (repeated words etc), and i think it miiiight give me a better sense of pacing fuckups? in the sense that, if i've been staring at a wall of text for a while, it's hard for me to get a sense of where a reader might lose interest, whereas if these words are washing over me while i'm in some standstill traffic on the f$@&*ing bridge again, i'm getting a decent intuitive sense of "ok how long is this part going on for & is it actually interesting"
* (unfortunately if i'm listening while in some standstill traffic on the f$@&*ing bridge again, i can't exactly, uh, stop to take notes or fix the manuscript right there, so i'm relying on "just remembering what sounded off," but eh a little mental exercise is good for you)
* i certainly wouldn't want to use this as the only source of reading-aloud-ness since the computer-voice-guy makes some repeated "flow" choices that i just think are WRONG lol. for instance, i think the voice guy gives literally every comma about equal weight, which makes any standalone super-short demarked-by-a-comma phrase, like this one, sound REALLY awkward in a way that i think any ordinary human reading a passage will not find awkward.
* tellius inside baseball observations: this tool pronounces tibarn as "TYE-barn", pronounces reyson as "rey-SON", and pronounces naesala as "nae-SA-la," all of which i deem the WRONG way to pronounce those respective names lol. (i'm aware FE Heroes disagrees with me re: naesala, but that just means heroes is wrong too sorry!!!) also nikolias gets pronounced "neh-COE-li-as" which is ALSO wrong. and i made that one up so i'm objectively right here for sure lmao
further observations to be reported when/if they prove interesting
no subject
super fascinating!!! having used dragon naturallyspeaking once or twice back in the 00's as you mentioned, it's awesome hearing that it's improved since then. (tbh automatic captions/translations are another area where i've seen a marked improvement over the decades - i recently participated in a meeting with only mexican-spanish speakers and understood 95% of it right off of pure captioning, it was awesome, 10/10 would be a fly on the wall again.)
cheering for u in the home stretch there!
no subject
I found it good for the little errors and some flow things, and it's good for giving me a little more distance from the work. And some parts of it feel very passive to me in a way that reading doesn't! The other good thing for me is that there are a lot of things that I might pick out if I were reading my own writing, but it turns out that when I listen to them read aloud, they don't bother me. So it's curbing the perfectionism a bit for me.
Also, much easier to edit while petting cat.
no subject
there are a lot of things that I might pick out if I were reading my own writing, but it turns out that when I listen to them read aloud, they don't bother me
oh, yes, i noticed this as well!!! had a whole paragraph i'd been agonizing over b/c i was like "this is garbo in some subtle way, ugh, i'm gonna have to rewrite this," but when i listened to it i was like... wait this is totally fine. why would i waste my time rewriting this, it gets the job done. GREAT VICTORY
no subject
RAY-son
NAE-sa-la (putting the emphasis on the first syllable gives it a very different feel imo!)
ni-ko-LIE-us (the "ni" as in the "kni" of "knick"; the "ko" as in the "co" of "covert")
i can bust out an IPA chart if need be but hopefully that paints the picture :P
(am curious for your disagreement/agreement with these objectively correct things!)
no subject
and I agree all around! I do pronounce naesala differently, but I know that the thing I'm doing in my head with his name is wrong and sinful. (it goes sorta like... NEI-SHA-la, with a sort of a double stress)
(I also read "nikolias" a bit more NI-ko-lee-as, but, well, I think that's just the more lusophonic reading as opposed to your anglophonic reading)
no subject
I bet that technology has improved a lot in the past 8 years too.