• moosetwin@lemmy.dbzer0.com
    15 hours ago

    I don’t mind the idea, but I’m curious where the training data comes from. You can’t just train it on users’ (unsubtitled) videos, because you need subtitles to know whether the output is right or wrong. I checked their Twitter post, but it didn’t seem to help.

    • Warl0k3@lemmy.world
      14 hours ago

      I hope they’re using OpenSubtitles, or one of the many academic speech-to-text datasets that exist.