
Posts
12
Comments
130
Joined
3 yr. ago

  • Get llama.cpp and try Qwen3.6-35B-A3B. Just came out and looks good. You'll have to look into optimal settings, as it's a Mixture of Experts (MoE) model with only 3B parameters active. That means that the rest can stay in RAM for quick inference.

    You could also try the dense model (Qwen3.5-27B), but that will be significantly slower. Put these in a coding harness like Oh-My-Pi, OpenCode, etc. and see how it fares for your tasks. Should be ok for small tasks, but don't expect Opus / Sonnet 4.6 quality, more like better than Haiku.
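    A rough sketch of what keeping the experts in RAM can look like with llama.cpp (the model filename is made up, and the tensor-override regex is an assumption that targets the per-expert FFN tensors; check `llama-server --help` on your build, since flags change between releases):

```shell
# Offload as many layers as possible to the GPU, but pin the MoE expert
# tensors to CPU RAM so only the small active portion needs VRAM.
llama-server -m qwen3-moe-a3b.gguf \
  --n-gpu-layers 99 \
  --override-tensor ".ffn_.*_exps.=CPU" \
  --ctx-size 8192
```

    With only ~3B parameters active per token, the CPU-side expert lookups stay cheap while the GPU handles the shared layers.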

  • Proposed instance defederation for multiverse.soulism.net - an instance administrated by Grail aka Dronerights aka Hardlightcereal aka Exocrinous aka Drag

  • Although I don't have much sway in these votes, I'd still say no. If they're the only ones interacting on that instance, what stops them from making another instance in a month or two? Do we trigger another vote then to block that one as well? It could be a fun monthly activity, "Defederate from Grail's instance", but this doesn't really make sense. I vote to leave the clean-up to mods of the instances where this person creates problems. If they don't give up, all they need is yet another account/instance to troll communities here.

  • There must be something that ensures the response is legitimate. Otherwise, if it's client-side and fully offline, I can just spoof the app to return the response "Yes, over 18". If it's not the government doing the verification, it's Google or Apple, which will give them access to all the "adult" websites you visit. Also, another reason for the EU to push for strict device attestation, without any DIY stuff (i.e., no more GrapheneOS, LineageOS, etc).

    I couldn't find a desktop app on the EU's GitHub (another red flag, btw, using GitHub for this). All that seems to be available is code for the Android or iOS apps. Could you share it, if you can?

  • Even with the Zero Knowledge approach, you will still run an app on a phone (what if I don't have one) that will make some call to the government's servers, which will most likely know what website you're trying to access. We're moving the data mining from some third party to the government, which can be wrongly used later if some idiot comes into power. If it's not making a call to a government's servers, I would be surprised, since you could imagine someone just bypassing this to always return "Over 18".

    Even funnier (read "sad"), this initiative will probably rely on Google and Apple to keep it robust, and will likely have no availability on rooted phones or non-Google Play Services ones. It's premature at best to deploy this in a meaningfully safe way.

  • Oh, I remember this! Thanks for sharing, very nice find! Could be a worthwhile approach once we have the data :)

  • Apologies for the late reply! Busy days :D

    I agree with you. Crowd-sourcing this type of research would be a completely different goal than what the AI Horde was built for, and would probably not be sustainable with part-time / volunteer researchers. Perhaps it's best for us to just wait until others make more substantial progress.

    The goal would still have been inference for the Horde, but with sharing of feedback based on the model's outputs, to align it more with the original one. However, after considering this approach more, I am afraid that the maths behind it makes it impossible to "reconstruct" the original model's manifold, or at least capture the same behaviour in all use cases.

    I came here to propose this idea because, to the best of my knowledge, this is the only LLM community that actually pushes for sharing of resources. However, a few days ago I saw a post on the LocalLlama community advocating for sharing OpenCode sessions in order to crowd-source a fine-tuning dataset, so it seems that more people are having the same thoughts! :)

    I will keep an eye on other advancements, and if I actually end up having some time, perhaps I'll return with some contributions. I agree with you that such a project mostly relies on inference, in which case the AI Horde is not the only one that can provide that capability. What we would need is to deploy such a model on HuggingFace, and to create an API endpoint for sharing training data from people who are interested in contributing.

    Thanks a lot for offering your thoughts, and taking the time to write such lengthy responses to me! I hope you have a nice weekend!

  • I hope things get better for you! I would have recommended you get out of your country before they cancel your passport or something, but I'm not sure if it gets any better in other places. Even Europe seems to be speed-running fascism, and it's probably a matter of time before we follow in the US' steps...

    Stay strong, and if things get really hairy, consider living off-grid. Band together with other people going through this, and make an escape plan. With the advancements in solar panel technology, I believe living outside civilization is becoming more and more sustainable nowadays.

  • Indeed, the quantization described in the Microsoft paper (and even in this NanoQuant paper) severely messes up the behaviour of the model. Even in this newer paper, you'd still incur a ~2x performance loss in terms of perplexity (better than what the 1.68-bit paper reported, if true). However, as per the other paper I added to the edited post, it is possible to further align a quantized model with the original one. In the end, LLMs are just fancy math that seeks to maximize human preferences, and the bigger models were simply trained better at doing that. With this approach, all we would have to do is keep refining the LoRA weights until we match the behaviour of the unquantized model, which wouldn't be that expensive if all we have to fine-tune is a few million parameters. We might see worse performance than a 3B-parameter model at the beginning, but with more refinement we could unlock more of the original performance.
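    To make the alignment idea concrete, here is a toy numpy sketch (all shapes, the sign-based quantization, and the training setup are invented for illustration, not the actual pipeline): a frozen 1-bit base matrix plus trainable low-rank factors, trained to minimize the KL divergence to the full-precision model's output distribution.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mean_kl(p, q):
    # average KL(p || q) over the batch
    return float(np.mean(np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)), axis=-1)))

d, v, n, r = 16, 8, 64, 4                # hidden dim, vocab size, batch, LoRA rank

W = rng.normal(size=(d, v))              # stand-in for the original full-precision weights
W_q = np.sign(W)                         # crude 1-bit quantization; stays frozen
A = rng.normal(scale=0.1, size=(d, r))   # trainable low-rank factors (the "LoRA")
B = np.zeros((r, v))                     # standard LoRA init: B starts at zero

X = rng.normal(size=(n, d))
teacher = softmax(X @ W)                 # distributions we want the student to match

kl_before = mean_kl(teacher, softmax(X @ W_q))

lr = 0.1
for _ in range(3000):
    student = softmax(X @ (W_q + A @ B))
    g_logits = (student - teacher) / n   # gradient of mean KL w.r.t. the logits
    gW = X.T @ g_logits                  # gradient w.r.t. the effective weight matrix
    gA, gB = gW @ B.T, A.T @ gW          # chain rule onto the low-rank factors
    A -= lr * gA
    B -= lr * gB

kl_after = mean_kl(teacher, softmax(X @ (W_q + A @ B)))
# kl_after ends up well below kl_before: the low-rank correction
# recovers part of the behaviour the 1-bit quantization destroyed
```

    The same principle is what the full-scale version would rely on, just with real model weights, real prompts, and a proper optimizer instead of this hand-rolled gradient step.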

    Regarding the use of the Horde, I believe that behaviour alignment can't be done without actually using it. Just like corpo-AI are giving away their models so that they can further get data, we could have a similar, but much more compute-efficient, community-driven approach. Models by the people, for the people, if you will. Furthermore, as I mentioned, I think this would be the only community that has the compute and desire to push improvements on such an idea long-term, as it isn't profit-driven.

    Let's say that this whole experiment starts with an extreme case, the MiniMax M2.5 model, and we abstract away from any architectural fancy stuff. At ~230B parameters, we would have a 1-bit model size of ~28.75 GB, and, as per Table 2 of NanoQuant, ~23 GB if we were to prune 20% of the weights. This would be enough to fully fit it on a 24GB VRAM GPU. Following this, we could get a well-balanced list (i.e., easy, medium, hard) of reasoning tasks, and fine-tune the LoRA layer to match the output. Heck, we could even tailor this to specific tasks, such as role-playing, coding, etc. It will be a long-term experiment where we might serve two answers (depending on Horde availability), one generated by the quantized model + LoRA and another that is regularly deployed. The user could then choose the model they prefer, and use that information later for further training.
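    Sanity-checking the sizes from above (rounded figures from the comment, not measurements):

```python
params = 230e9                      # ~230B parameters for MiniMax M2.5
size_gb = params * 1 / 8 / 1e9      # 1 bit per weight, 8 bits per byte
pruned_gb = size_gb * (1 - 0.20)    # prune 20% of the weights (NanoQuant, Table 2)
# size_gb   -> 28.75 GB
# pruned_gb -> 23.0 GB, which fits a 24 GB VRAM budget
```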

    This would indeed be quite cumbersome to set up, and could very well be wasted time. Users might even opt out from this because it could take too much time to help. But hey, I still think it would be a cool experiment to see if consumers could actually use these larger models on regular hardware, and get close to the original performance without paying for all the compute that is needed.

  • The models themselves would indeed be costly to train if you were to go for the regular approach. You would have to "upscale" the weights from binary to fp32, which would make the models trainable only on the usual amount of GPUs. That is because the training process relies on back-propagation, which only makes sense if your operations are differentiable. Since the sign function used to binarize the weights has a zero gradient almost everywhere, the binary weights would only ever be updated by 0, so no change.
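    A tiny numerical check of why gradients vanish through binarization: the sign function that maps weights to ±1 is piecewise constant, so its derivative is zero almost everywhere, and back-propagation through it passes nothing to the underlying weight.

```python
def sign(x: float) -> float:
    # the binarization function: piecewise constant, flat away from 0
    return 1.0 if x >= 0 else -1.0

w = 0.7
eps = 1e-4
# central-difference estimate of d sign(w) / dw at a typical weight value
num_grad = (sign(w + eps) - sign(w - eps)) / (2 * eps)
# num_grad == 0.0: a gradient step would leave the binary weight unchanged,
# which is why 1-bit training schemes resort to tricks like the
# straight-through estimator
```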

    However, LoRA (16-bit) or QLoRA (4/8-bit) fine-tuning can be done on a single GPU, assuming you can fit the model on it. Everything is frozen except for a separate small network, which is updated during training. This can have BF16 or FP32 precision, and would be trained as you would a regular network.

    What I am suggesting is to actually leverage bigger models that come out, and attempt to compress them using the proposed algorithm (if it actually scales to bigger models). From there, we could employ some tricks to improve performance, think latent reasoning, community-driven RLHF only on the (Q)LoRA layers, etc. With time, we would be able to pool together a dataset and a pipeline that can be applied to any open-weight model that is released.

    But it does sound a bit easier than it would be in practice. This heavily relies on re-purposing the Horde to also store training data (with user consent, of course), user scores, and later introduce a training queue.

  • AI Horde @lemmy.dbzer0.com

    Community-driven efficient LLM development - Possible?

  • Some chap invested a lot of time into making the Skyrim experience nicer. I recommend you check out CHIM :)

    Quite a lovely project, but you will have to spend some time to set things up. For example, if you have a good GPU available, you can set up TTS for NPCs, STT for yourself, and then a decent LLM to handle the world interactions. The NPCs then can listen to you talk, follow you, do stuff you tell them (like attack someone, or pick something off the floor), etc. It's something quite revolutionary, if you can spend the time to get it to work. If you're looking for some LLM provider on the cheap, nano-gpt has an 8 dollar per month tier that gives you "fair-use unlimited" access to open source models. Worth a shot!

    Note: You won't be able to run all the models and the game on the same computer. The CHIM wiki has some suggestions on the amount of compute needed, and alternatives for the services so that you don't have to run everything locally.

  • Wero is being rolled out slowly in Western Europe. I believe it's already a thing in Germany, France, and Belgium, with the Netherlands to follow soon.

  • All great on paper, but why would EU leaders fight against Trump? They're all fragmented, trying to hold on to their countries' benefit. Given the most recent decision to cripple the 2035 ban on ICE cars due to pressure from Germany and Italy, I really doubt the leaders here are capable of punishing Trump.

    Even if von der Leyen pushed for this (which I doubt, see her behaviour in the first USA EU Deal), the EU would have to act unanimously to do it, and we already know that it is not possible. Given recent polls, it might be possible that Hungary will keep Orban as PM, and other countries will soon vote in their own MAGA-like leaders. See Romania, Bulgaria, and quite a few others in Eastern Europe where corruption is rampant. They'd sell their own mother if it got them a second villa.

    I wish to be as hopeful as the author, but the EU leaders keep proving the opposite. In order to survive, the EU should have focused on separating from the US during the Bush era, or even as late as Trump's first mandate. As it is, the EU will probably just wait for Trump's term to be over, and then return to business as usual. I am afraid we won't be seeing any big disturbances to the current world order for a while, unless the AI bubble bursts (which will probably be due to other factors, not EU intervention).

  • I have a friend who set up a Dreame L10s Ultra. I helped them solder the breakout board, and was there when they flashed the new firmware. Relatively straightforward! Just follow the guide on the website and you should be good.

    The robot is now accessible only on the local network, and they got it working in Home Assistant. The only feature missing now is the direct camera view the original firmware had: you could get a live feed of the robot's camera at any time. Looked fun, but it was not necessary.

  • This article just screams rage-bait. Not that I am against making people aware of this kind of privacy invasion, but the authors did not bother to do any fact checking.

    Firstly, they mention that the vacuum was "transmitting logs and telemetry that [the guy] had never consented to share". If you set up an app with the robot vacuum company, I'm pretty sure you'll get a rather long terms and services document that you just skip past, because who bothers reading that?

    Secondly, the ADB part is rather weird. The person probably tried to install Valetudo on it? Otherwise, I have no clue what they meant by "reprinting the devices’ circuit boards". I doubt this guy reverse engineered an entire circuit board, only to be surprised that ADB is enabled. ADB is precisely what makes some devices rather straightforward to flash with custom firmware that blocks all the cloud shenanigans, so I'm not sure why they're painting it as a horrifying thing. Of course you're broadcasting your map data to the manufacturer so that you can use their shitty app.

    The part saying that it had full root access and a kill-switch is a bit worse, but still... It doesn't have to be like this. Shout-out to the people working on the Valetudo project. If you're interested in getting a privacy-friendly robot vacuum, have a look at their website. It requires some know-how, but once it's done, you know for sure you don't need to worry about a 3rd party spying on you.

  • Deleted

    Permanently Deleted

  • That's pretty cool! Does anyone know if this allows one to play anti-cheat enabled games? Would be interesting to know if we can spoof HWID stuff with this to make it look like we're playing on an actual Windows device.

  • Right, but then rich people can no longer exploit other regions if everyone is considered equal! Think about the shareholders for a bit :/ (I'm sarcastic, in case it is not clear :D)

    The main problem here is that people flocking to positions of power are often the ones that do it for the wrong reasons. Until that part is sorted out, we will keep having leaders that will enforce things that are best for them and their closest ones. Some form of anarcho-communism would probably help this, but the current globalisation effort will make it very hard to implement. The best thing we can do as individuals is to just improve our social circle, and try to rely on as many local things as possible.

  • Even the comic-book bullies are better than this... The sad part is that the West will continue to lick the boot, hoping everyone will just forget. I really hope that that is not the case. What Greta did here is very impressive, and I hope that her spirit will inspire other young people to vote out these dumbfucks in government that try to do damage control in this situation.

  • Get a dog. I'm now forced to get up early to take it out, otherwise it will pee on my bed.

    (Do not actually get a pet if you cannot take care of them.)

  • Is the feather from a free-range seagull? I do not make deals when animals' welfare is at risk.

  • The propaganda machine goes brr on both sides. However, the side you're advocating for decided that the only way to resolve a diplomatic issue is to invade a country and murder its citizens. No matter how you spin it, it is still the case that the powerful are throwing away human lives for their own benefit. Isn't .ml supposed to side with the people usually, and not the rich people controlling the world currently?

  • Free and Open Source Software @beehaw.org

    Open Source Text-to-Speech and Speech-to-Text on Android?

  • Europe @feddit.org

    Advancing European Sovereignty in HPC with RISC-V

    eurohpc-ju.europa.eu/advancing-european-sovereignty-hpc-risc-v-2025-03-06_en
  • Europe @feddit.org

    EU Digital Sovereignty - Time to provide alternatives to US/Chinese big tech

    www.europarl.europa.eu/petitions/en/petition/content/0729%252F2024/html/Petition-No-0729%252F2024-by-N.-W.-%2528Austrian%2529-on-the-implementation-of-an-EU-Linux-operating-system-in-public-administrations-across-all-EU-countries
  • Europe @feddit.org

    We can all help Ukraine - UNITED24

    u24.gov.ua
  • Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ @lemmy.dbzer0.com

    Archiving papers using Zotero headless?

  • Technology @lemmy.world

    Redox OS 0.9.0 - Redox - Your Next(Gen) OS

    www.redox-os.org/news/release-0.9.0/
  • Linux @lemmy.ml

    Poll: GUI framework for widgets/apps in Wayland

  • Arch Linux @lemmy.ml

    Installing AUR packages after using archinstall

  • Modded Minecraft @sopuli.xyz

    Improving server performance for All the Mods 8

  • Linux @lemmy.ml

    Jump from Arch to NixOS?

  • Piracy: ꜱᴀɪʟ ᴛʜᴇ ʜɪɢʜ ꜱᴇᴀꜱ @lemmy.dbzer0.com

    Sites or Trackers for Exam Dumps