• Bear@sh.itjust.works
    arrow-up
    27
    arrow-down
    1
    ·
    11 months ago

    Of course this is possible. Is it practical? Nope. There is already so much data harvested by the likes of Google and Facebook that they can tell what you like, what videos or articles you read, what you share, and in some cases who you talk to. Importing a shit ton of audio data is pointless; they already know what you like.

    • jard@sopuli.xyz
      arrow-up
      24
      arrow-down
      4
      ·
      edit-2
      11 months ago

      The sheer amount of audio data that would have to be processed by Google and Amazon every second for every Google Home/Amazon Echo/Facebook Whatever would be a technical and logistical nightmare. It’s far easier for them to wait until you voluntarily give them that data yourself, whether by clicking on their ads, searching for specific things on Google/Amazon, or through the far slimier methods they use to track your everyday likes and needs.
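
A rough back-of-envelope sketch of the "sheer amount of audio data" point above. Every constant here is an illustrative assumption (uncompressed 16 kHz, 16-bit mono audio and a hypothetical fleet of 100 million devices), not a figure from any vendor; real systems would compress audio heavily, so treat this as an upper-bound style estimate only.

```python
# Back-of-envelope estimate of raw audio volume for always-on streaming.
# All constants are illustrative assumptions, not vendor figures.

SAMPLE_RATE_HZ = 16_000        # assumed: a common sample rate for speech audio
BYTES_PER_SAMPLE = 2           # assumed: 16-bit mono PCM, uncompressed
SECONDS_PER_DAY = 24 * 60 * 60


def raw_audio_bytes_per_day(devices: int) -> int:
    """Uncompressed bytes produced per day if every device streamed nonstop."""
    per_device_per_second = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE
    return devices * per_device_per_second * SECONDS_PER_DAY


if __name__ == "__main__":
    devices = 100_000_000  # hypothetical fleet size, chosen for illustration
    total = raw_audio_bytes_per_day(devices)
    print(f"{total / 1e15:.1f} PB/day")
```

Under these assumptions a single device produces roughly 2.8 GB of raw audio per day, and the hypothetical fleet roughly 276 PB per day. Changing the sample rate or assuming compression shifts the total by a constant factor, not by orders of magnitude of device count.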

      • DavidGarcia@feddit.nl
        arrow-up
        21
        arrow-down
        3
        ·
        11 months ago

        you just need to process the audio on the device and then send keywords to Google etc. it’s technically trivial since most phones already have dedicated hardware for that. your phone listens for activation words all the time, unless you disable it. there is no reason why it can’t also forward anything else it hears as text

        • jard@sopuli.xyz
          arrow-up
          14
          arrow-down
          6
          ·
          edit-2
          11 months ago

          Do you have any evidence for this claim? Voice recognition and processing is very power and energy intensive, so things like power consumption and heat dissipation should be readily measurable; especially if an app like Google or Amazon is doing it on an effectively constant basis.

          If keywords are being sent to Google, have you sniffed this traffic with Wireshark and analyzed the packets being sent?

          Phones have dedicated hardware for voice processing, true, but that’s when you voluntarily enable it through voice dictation or train it with very specific and optimally chosen key phrases (“Okay Google,” “Hey Siri,” …). Apps that allegedly listen to voice audio all the time would need to be using this hardware continuously. Do you have any evidence that apps like Google continuously utilize this hardware, knowing that it is a power-intensive and heat-inducing process?

          I’m not trying to argue in bad faith. As an engineer, I’m having trouble architecting such a surveillance system in my head in a way that would not leave blatantly obvious evidence behind on the device for researchers to collect. These are all questions that naturally came up while I was thinking through the ramifications of your statement. I want to keep an open mind and consider the facts here.

          • krotti@sh.itjust.works
            arrow-up
            2
            ·
            11 months ago

            I would assume that you are right, considering how much garbage you would collect by listening.

            Now imagine recording those who have not given consent, or the device saving full scripts of movies.

            • jard@sopuli.xyz
              arrow-up
              5
              arrow-down
              4
              ·
              edit-2
              11 months ago

              Right. The legality of just recording everything in a room, without any consent, is already incredibly dubious at best, so companies aren’t going to risk it. At least with voice dictation or wakewords, you need to voluntarily say something or push a button which signifies your consent to the device recording you.

              Also, another problem with the idea of on-device conversion to keywords that are sent to Google or Amazon: with constant recording from millions of devices, even the text form of those keywords would still be an infeasible amount of data to process. Discord’s ~200 million active users send almost a billion text messages each day, yet Discord can’t use algorithmic AI to detect hate speech from Nazis or pedophiles approaching vulnerable children. It is simply far too much data to process in a timely manner.

              Amazon has sold 500 million Amazon Echos, and that’s just Amazon. From an infrastructure standpoint, how is Amazon supposed to deal with processing near-24/7 keyword spam from 500 million Echo devices every single day? Such a solution would also have to be, in theory, infinitely scalable, as the amount of traffic is directly proportional to the number of devices sold and actively used.

              It’s just technologically infeasible.
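
To put rough numbers on the keyword-traffic argument above, here is a minimal sketch. The 500 million device count and the "almost a billion" Discord messages per day are the figures quoted in the comment; the rate of 10 keyword messages per device per hour is purely a hypothetical assumption, and the result scales linearly with whatever rate you pick.

```python
# Rough comparison of hypothetical keyword traffic against the Discord figure
# quoted in the thread. KEYWORDS_PER_DEVICE_PER_HOUR is an assumption made
# for illustration only; it is not a measured or published number.

DEVICES = 500_000_000                     # Echo count quoted in the comment
DISCORD_MESSAGES_PER_DAY = 1_000_000_000  # "almost a billion", per the comment
KEYWORDS_PER_DEVICE_PER_HOUR = 10         # hypothetical assumption
SECONDS_PER_DAY = 24 * 60 * 60


def keyword_messages_per_day(devices: int, keywords_per_hour: int) -> int:
    """Total keyword messages per day across the whole fleet."""
    return devices * keywords_per_hour * 24


if __name__ == "__main__":
    per_day = keyword_messages_per_day(DEVICES, KEYWORDS_PER_DEVICE_PER_HOUR)
    ratio = per_day / DISCORD_MESSAGES_PER_DAY
    per_second = per_day / SECONDS_PER_DAY
    print(f"{per_day:,} keyword messages/day")
    print(f"{ratio:.0f}x the quoted Discord daily volume")
    print(f"about {per_second:,.0f} messages/second on average")
```

With the assumed rate this works out to 120 billion messages per day, about 120 times the quoted Discord volume. The sketch only sizes the traffic; whether that volume is actually infeasible to ingest is the point being argued in the thread, not something the arithmetic settles.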

              • Subverb@lemmy.world
                arrow-up
                7
                arrow-down
                1
                ·
                11 months ago

                Anecdotally, the odds are near zero that it is pure coincidence when my wife and I talk once about maybe buying some obscure thing like electric blinds and targeted ads for them suddenly pop up on our devices.

                This happens a lot.

                I think you’re being naive if you believe they don’t locally distill our discussions into key words and phrases and transmit those.

        • TORFdot0@lemmy.world
          English
          arrow-up
          7
          arrow-down
          1
          ·
          11 months ago

          Ok but third parties have no access to this in the background. My guess is they are buying marketing data from their listed “partners” and making wide claims about how they obtained it. Still a huge breach of privacy though!