• Etterra@lemmy.world
      English
      arrow-up
      1
      ·
      5 months ago

      Microsoft: I know this will only be used for evil, but I’ll be damned if I’m gonna pass up on the hype-boost to my market share.

      Every other big corp: same!

    • MysticKetchup@lemmy.world
      English
      arrow-up
      7
      arrow-down
      1
      ·
      5 months ago

      Like, what even is a legitimate use case for these? It just seems tailor-made for either misinformation or pointless memes, neither of which seems like a good sales pitch.

      • Deceptichum@sh.itjust.works
        English
        arrow-up
        4
        ·
        5 months ago

        I could see a few uses, but the biggest would probably be advertising. Tailored ads that look like they’re coming from a real person.

        Imagine Jake from State Farm addressing you personally about your insurance in an ad.

        Not that I endorse advertising, I’d like to see it all banned.

        I think it could be useful to humanise some things though, and talking to a “person” AI in a video call might be more comfortable for some people wanting to do tasks such as, say, navigating my mobile phone carrier’s shitty AI help system.

        Really any sort of AI assistant device could benefit from a human imprint.

      • chiisana@lemmy.chiisana.net
        English
        arrow-up
        4
        arrow-down
        1
        ·
        5 months ago

        Say you’re a movie studio director making the next big movie with some big-name celebs. Filming is in progress, and one of the actors dies in the most on-brand way possible. Everyone decides that the film must be finished to honor the actor’s legacy, but how can you film someone who is dead? This technology would enable you to create footage the VFX team can lay over a stand-in actor’s face, providing a better experience for your audience.

        I’m sure there are other uses, but this one pops to mind as a very legitimate use case that could’ve benefited from the technology.

        • Pheonixdown@lemm.ee
          English
          arrow-up
          2
          ·
          5 months ago

          Gotta crank up that dystopia meter.

          This is slowly moving toward having Content On Demand. Imagine being able to prompt your content app for a movie/series you want to watch, and it just makes it and streams it to you.

        • catloaf@lemm.ee
          English
          arrow-up
          3
          arrow-down
          1
          ·
          5 months ago

          how can you film someone who is dead?

          Hot take: don’t? They’re dead, leave them dead. Rewrite and reshoot if you really have to.

          • chiisana@lemmy.chiisana.net
            English
            arrow-up
            2
            ·
            5 months ago

            Sure, that’s an entirely valid option, but not the one the producing team and the deceased’s family opted for… and they had a much larger say in it than you and me combined.

        • MysticKetchup@lemmy.world
          English
          arrow-up
          2
          ·
          5 months ago

          We’ve already recreated dead actors or older actors out of whole cloth with VFX. Plus, it still seems like a niche use case for something that can be done by VFX artists, who can also do way more.

          • chiisana@lemmy.chiisana.net
            English
            arrow-up
            1
            ·
            5 months ago

            Having done something before doesn’t mean they shouldn’t find ways to make it better though. The “deepfake”-esque techniques can provide much better quality replicas. Not to mention, as resolution demand increases, it would be harder to leverage older assets and techniques to meet the new demands.

            Another similar area is what LLMs are doing to/for developers. We already have developers, so why do we need AI to code? Well, LLMs can help by synthesizing simpler code, freeing up devs to focus on more complicated problems. They can also democratize the ability to develop solutions for non-developers, just like how the deepfake solutions could democratize content creation for non/less-skilled VFX specialists, helping the industry create better content for everyone.

      • Even_Adder@lemmy.dbzer0.com
        English
        arrow-up
        4
        arrow-down
        1
        ·
        5 months ago

        I think you’re falling for the overblown fearmongering headline, and making pointless memes is a great reason to make things.

  • MeekerThanBeaker@lemmy.world
    English
    arrow-up
    15
    arrow-down
    1
    ·
    5 months ago

    This is why I don’t post my picture online and I never talk to anyone ever, while hiding my head inside a nylon stocking (unrelated).

  • dhork@lemmy.world
    English
    arrow-up
    12
    ·
    5 months ago

    Vasa? Like, the Swedish ship that sank 10 minutes after it was launched? Who named that project?

  • Ms. ArmoredThirteen@lemmy.ml
    English
    arrow-up
    11
    ·
    5 months ago

    These vids are just off enough that I think doing a bunch of mushrooms and watching them would be a deeply haunting experience

  • ReallyActuallyFrankenstein@lemmynsfw.com
    English
    arrow-up
    8
    ·
    5 months ago

    I mean, I know it’s scary, but I’ll admit it is impressive, even when I watched it with jaded “every day is another AI breakthrough” exhaustion.

    The subtle face movements, eyebrow expression, everything seems to correctly infer how the face would articulate those specific words. When you think of how many decades something like this would have stayed in the uncanny valley even with a team of trained people hand-tweaking the image and video, and this is doing it better in nearly every way, automatically, with just an image? Insane.

  • slaacaa@lemmy.world
    English
    arrow-up
    6
    ·
    5 months ago

    “At long last, we have created the Torment Nexus from classic sci-fi novel Don’t Create The Torment Nexus”

  • BetaDoggo_@lemmy.world
    English
    arrow-up
    4
    ·
    5 months ago

    The “why would they make this” people don’t understand how important this type of research is. It’s important to show what’s possible so that we can be ready for it. There are many bad actors already pursuing similar tools if they don’t have them already. The worst case is being blindsided by something not seen before.

    • spiderman@ani.social
      English
      arrow-up
      1
      ·
      5 months ago

      how important this type of research

      I hope they also figure out a way to find the bad actors who might use their tools for harmful purposes. You can’t just create something like this for “research” purposes and not find a way to stop bad actors from using it to cause harm.

  • AutoTL;DR@lemmings.world
    English
    arrow-up
    2
    ·
    5 months ago

    This is the best summary I could come up with:


    On Tuesday, Microsoft Research Asia unveiled VASA-1, an AI model that can create a synchronized animated video of a person talking or singing from a single photo and an existing audio track.

    In the future, it could power virtual avatars that render locally and don’t require video feeds—or allow anyone with similar tools to take a photo of a person found online and make them appear to say whatever they want.

    To show off the model, Microsoft created a VASA-1 research page featuring many sample videos of the tool in action, including people singing and speaking in sync with pre-recorded audio tracks.

    The examples also include some more fanciful generations, such as Mona Lisa rapping to an audio track of Anne Hathaway performing a “Paparazzi” song on Conan O’Brien.

    While the Microsoft researchers tout potential positive applications like enhancing educational equity, improving accessibility, and providing therapeutic companionship, the technology could also easily be misused.

    “We are opposed to any behavior to create misleading or harmful contents of real persons, and are interested in applying our technique for advancing forgery detection,” write the researchers.


    The original article contains 797 words, the summary contains 183 words. Saved 77%. I’m a bot and I’m open source!

  • Dasus@lemmy.world
    English
    arrow-up
    2
    ·
    5 months ago

    One use of this I’m in favour of is recreating Majel Barrett’s voice as an AI for computer systems.

    • kromem@lemmy.world
      English
      arrow-up
      1
      ·
      5 months ago

      This project doesn’t recreate or simulate voices at all.

      It takes a still photograph and creates a lip-synced video of that person saying the paired full audio clip.

      There are other projects that simulate voices.

  • werefreeatlast@lemmy.world
    English
    arrow-up
    2
    ·
    5 months ago

    Freddie, this is your mom. Look, all I want for my birthday is for you to please start using the new Teams. It’s so much better than Teams classic. I alread… Microsoft already installed it for you. Okay honey? And could you also start using a microsoft.com account so you can get financially hooked like all the Gmail users? It’s pretty smart. Don’t you want to be smart like Jonny? Tata!

  • antlion@lemmy.dbzer0.com
    English
    arrow-up
    1
    ·
    5 months ago

    Since it’s trained on celebrities, can it do ugly people or would it try to make them prettier in animation?

    The teeth change sizes, which is kinda weird, but probably fixable.

    It’s not too hard to notice for an up close face shot, but if it was farther away it might be hard - the intonation and facial expressions are spot on. They should use this to re-do all the digital faces in Star Wars.