• 3 Posts
  • 245 Comments
Joined 1 year ago
Cake day: January 24th, 2024


  • But what is meant by “integrity of the model, inputs and outputs”?

    I guess I don’t understand the attack vector, what’s the threat here? Someone messes with the model file or refines a model towards a specific malicious bias like inserting scam links where legit links would go and passes it off as the real deal?

    I’m more general cybersec than crypto, so idk, but isn’t that what hash sums are for?

    Surely if someone messed with my .ckpt or .safetensors it won’t be the same file anymore?

    And what does that have to do with validity of the inputs?
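    For what it’s worth, the hash check being asked about can be sketched in a few lines (a minimal Python sketch; nothing here is specific to .ckpt/.safetensors, and the chunk size is arbitrary):

```python
# Minimal integrity check: hash the model file and compare the result
# against a digest published by the model's distributor. Any byte-level
# tampering with the file changes the digest.
import hashlib

def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()
```

    This only tells you the file you got matches the file they published, though; it says nothing about whether the published weights were themselves trained with a malicious bias, which is presumably where the “integrity of inputs” part comes in.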



  • No, because that’s not how elections work you fucking goal post shifter as I already explained to one of you delulu mfers in the comments.

    But this is the whole thing isn’t it? You know that. You don’t have any allegiance to the truth or honesty or integrity or any actual values, you just happened to be on the side vaguely not actively against those things.

    You are exactly the reason why we’re in this fucking mess and there’s no difference between you and a trump supporter who voted because they thought 2 parts ivermectin per glass of water in the water supply is gonna cure his 5G.

    This level of intellectual dishonesty is no different, and just as profoundly anti-enlightenment.

    You can lead a horse to water but you can’t make it drink, and that’s why trump won, and will probably stay winning for the foreseeable future.











  • Yeah, I should know, but I’m too lazy haha. Didn’t lose anything completely irreplaceable, but my beautiful bind9 local DNS zone, written and annotated by hand, is gone.

    Plus I have basically nowhere to back up to.

    At least the first thing I did when reinstalling Debian was set up an rsync cron job to fetch /home, /etc and some other select dirs, but this backs up to a Raspberry Pi with a busted micro SD slot that runs off a rather dodgy USB-enclosure’d 120 GB mSATA SSD, one that had already failed once before and was originally transplanted from a busted MSI gaming laptop I sold for coke cash in the mid-2010s.
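    The cron-plus-rsync setup described above looks roughly like this (a sketch only; the paths, host name and schedule are placeholders, not my actual setup):

```python
# Assemble the rsync invocation for a nightly pull of selected dirs.
# -a preserves permissions/ownership/timestamps, --delete mirrors removals.
def build_backup_cmd(sources, dest):
    return ["rsync", "-a", "--delete", *sources, dest]

cmd = build_backup_cmd(["/home/", "/etc/"], "pi@backup-pi:/mnt/backup/")

# Scheduled via a crontab entry such as (3 AM daily):
#   0 3 * * * rsync -a --delete /home/ /etc/ pi@backup-pi:/mnt/backup/
```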

    Not ideal. That pi also periodically shits the bed. It’s exposed to the elements a bit because it’s also in use in 2 DIY iot projects.

    Is there a decent non-shit non-megacorp-empowering affordable way of doing off-site backups on a small scale?



  • Iktf, I had my 1060-6GB die and for a while I was gaming on a 750Ti lol.

    Recently my Crucial 1TB SATA SSD suddenly died, no errors, no SMART, no detection on any computer or via USB adapter, only coil whine, taking with it exactly all the things I never backed up because SSDs are supposed to be good :(




  • It’s complicated.

    I know Stable Diffusion best, so I’ll speak to that: they used the LAION-5B dataset, which is, in practice, freely available to download and use:

    https://www.kaggle.com/code/vitaliykinakh/guie-laion-5b-collect-and-download

    https://github.com/opendatalab/laion5b-downloader

    It’s also listed on HuggingFace, but that copy is unavailable:

    https://huggingface.co/datasets/danielz01/laion-5b

    But you can use this smaller newer version:

    https://huggingface.co/datasets/laion/relaion2B-en-research

    Whether it’s appropriately licensed is an unsolved question though.

    The dataset itself and the text portion of the text-image pairs needed for training are CC-BY-SA; the newer versions linked above are CC-BY-4.0. https://creativecommons.org/licenses/by/4.0/deed.en

    The images, however, are each under their own copyright, which in practice means any of the billions of images may or may not carry a licence that implicitly or explicitly forbids use for AI training, or forbids it only for commercial use.

    Whether such a license is legally binding is at present unknown though, since licenses primarily deal with reproductions, which the pro-AI folks argue isn’t the case, and that training of NNs is more akin to viewing an image and memorising the patterns and relationships within, like a person viewing it.

    That would make it non-infringing and therefore the model itself libre. In that case Mistral and LLaMA are also libre, as long as the model itself is open source, which in this case really means “open weights”, so not like GPT and anything by “““OpenAI”””.

    Weights are essentially the result of training a model. They’re the key bit that makes it or breaks it and determines how it works. Given the weights, and knowing the structure of the model and the framework used, you can refine, modify and distribute it.
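    As a toy illustration of what “open weights” buys you (hypothetical numbers, not any real framework’s API): with the weights in hand you can load them, nudge them with further training, and redistribute the result.

```python
# Toy one-neuron "model": the weights are just two numbers, w0 and w1.
def predict(weights, x):
    return weights[0] * x + weights[1]

# A single fine-tuning step: nudge the weights to shrink the error on one
# example. Real refinement (LoRA, full fine-tunes) is this idea at scale.
def finetune_step(weights, x, target, lr=0.1):
    error = predict(weights, x) - target
    return [weights[0] - lr * error * x, weights[1] - lr * error]

weights = [0.5, 0.0]                        # the "released" open weights
weights = finetune_step(weights, 2.0, 3.0)  # anyone can refine from them
```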

    Those against AI will say that training is more akin to file compression and that in one form or another it’s misuse. That would make the model an infringing derivative work, and as such not libre even if the model weights are open source.

    In a way though you could argue that me vaguely memorising the imagery of a dude dressed in white holding a laser sword is just a lossy compressed copy of the copyrighted work of Star Wars, and it’d be absurd to call that a violation; infringement only occurs if I commercially reproduce a work of substantial similarity from that memory.

    If I use Krita and draw a beautiful landscape informed and inspired at least in part by a movie I saw, is that copyright infringement or not? What if I use AI?

    Well, current laws don’t say. We measure infringement by substantial similarity; provenance of the information only comes in later (e.g. to rebut a claim of accidental similarity).

    That’s also my own personal stance on the legal side of things, so up to you how you see it.


  • He’s using windows.

    But while we’re on the subject, ~/.local/share is cancer and shouldn’t exist.

    The appropriate path is /usr/share.

    EDIT: Okay, to be clear: I mean that anything that could be global should go into /usr/share, which would massively save on space and effort when another user needs the same stuff.

    Anything that doesn’t need to be global doesn’t need to go into /usr/share, but somewhere else under ~/.

    The way it is now, my ~/.local is a massive dumping ground of crap: configs, static app resources that should go into /usr/share, entire applications via snap or flatpak (which is why I don’t use them), and random config files.

    It’s just a nasty mess on my home partition when it in most cases really doesn’t need to be.

    Users below rightfully pointed out many exceptions, like venvs. I still believe there should be a more correct place for those to go (e.g. ~/.venv, ~/.flatpak), but obviously they shouldn’t go into /usr/share willy-nilly.
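    For reference, the lookup apps are supposed to follow comes from the XDG Base Directory spec; roughly (a minimal Python sketch of the spec’s defaults):

```python
import os

def xdg_data_home():
    # $XDG_DATA_HOME wins if set; the spec's default is ~/.local/share,
    # which is why everything piles up there.
    return os.environ.get("XDG_DATA_HOME") or os.path.expanduser("~/.local/share")

def xdg_data_dirs():
    # System-wide fallbacks searched after the user dir; /usr/share lives
    # here, so shareable read-only data can already go system-wide today.
    return (os.environ.get("XDG_DATA_DIRS") or "/usr/local/share:/usr/share").split(":")
```

    So the spec already gives shareable static data a system-wide home in /usr/share; the per-user dir is only meant to be the override on top of that.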

    I have removed the sass below because I should’ve been more comprehensive in my criticism before getting ad hominem.