
Posts: 10 · Comments: 15 · Joined: 1 yr. ago

  • Such dumbasses, even if this were a good strategy, they're still banning one company and letting others (arguably more dangerous ones) go scot-free

  • Alright, I'm waiting on the YouTube playlist

  • Technically it supports fewer languages than Whisper: 40 vs. 99.

    The main problem isn't "bother", it's training data. You need hundreds of thousands of hours of high-quality transcripts to train models like these, and that just doesn't exist for, like, Zulu or whatever

  • LocalLLaMA @sh.itjust.works

    Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages

    arxiv.org/abs/2503.20212
  • I want to clarify something: "reranker" is a general term that can refer to any model used for reranking. It is independent of implementation.

    What you refer to:

    "because reranker models look at the two pieces of content simultaneously and can be fine-tuned to the domain in question. They shouldn't be used for the initial retrieval because the evaluation time is O(n²) as each combination of input"

    is a specific architecture known as a CrossEncoder, which is common for reranking models but not retrieval ones, for the reasons you described. But you can also use any other architecture
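    The retrieve-then-rerank split can be sketched in a few lines of Python. Both "models" below are toy bag-of-words stand-ins (hypothetical, for illustration only), not a real bi-encoder or CrossEncoder; what matters is the call pattern: retrieval embeds each document once and scores queries with cheap dot products, while the pairwise scorer must see (query, document) together, so it is reserved for reranking a short candidate list.

```python
# Toy contrast between bi-encoder retrieval and cross-encoder reranking.
# Both "models" are bag-of-words stand-ins, hypothetical for illustration;
# a real setup would use e.g. sentence-transformers models instead.

def embed(text):
    """Stand-in bi-encoder: one vector per text, computable in advance."""
    vec = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def dot(a, b):
    return sum(v * b.get(k, 0) for k, v in a.items())

def cross_score(query, doc):
    """Stand-in cross-encoder: must see both texts at once, so nothing can
    be precomputed - scoring all pairs of n texts is O(n^2) model calls."""
    return dot(embed(query), embed(doc))

docs = [
    "whisper supports 99 languages",
    "dolphin targets eastern languages",
    "reranker models score query-document pairs",
    "chain of draft reasons with fewer tokens",
]

# Retrieval: embed the corpus ONCE, then any query costs n cheap dot products.
index = [(d, embed(d)) for d in docs]

def retrieve(query, k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda item: -dot(q, item[1]))
    return [d for d, _ in ranked[:k]]

def rerank(query, candidates):
    # Expensive pairwise model applied only to the few retrieved candidates.
    return sorted(candidates, key=lambda d: -cross_score(query, d))

query = "which model scores query-document pairs"
top = rerank(query, retrieve(query))
```

    With real models the dot products stay cheap at corpus scale, while each `cross_score` call is a full forward pass, which is why the pairwise model only ever sees the top-k candidates.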

  • On god

  • LocalLLaMA @sh.itjust.works

    Sentence transformers v4

  • Thumbnail looks a little odd when small. You may want to go for a more digital llama aesthetic

  • LocalLLaMA @sh.itjust.works

    NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms

    electricalexis.github.io/notagen-demo/
  • Autotracers can't generate SVGs from text

  • Claude frequently draws SVGs to illustrate things for me (I'm guessing it's in the prompt), but even though it's better at it than all the other models, it still kinda sucks. It's just a fundamentally dumb task for a purely language model, similar to the ARC-AGI benchmark; it just makes more sense for a vision model, and trying to get an LLM to do it is a waste

  • LocalLLaMA @sh.itjust.works

    StarVector - a foundation model for generating svgs

    huggingface.co/starvector/starvector-1b-im2svg
  • What is the license? The link on HF just 404s

  • LocalLLaMA @sh.itjust.works

    EXAONE Deep ━ Setting a New Standard for Reasoning AI - LG AI Research News

    www.lgresearch.ai/news/view
  • Very similar to Chain of Draft but seems more thorough

  • LocalLLaMA @sh.itjust.works

    Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

    arxiv.org/abs/2503.05179
  • LocalLLaMA @sh.itjust.works

    Sorting-Free GPU Kernels for LLM Sampling

    flashinfer.ai/2025/03/10/sampling.html
  • LocalLLaMA @sh.itjust.works

    Reka Flash, an open-source 21B model comparable to QwQ 32B

  • It matches R1 in the given benchmarks. R1 has 671B params (36B activated) while this only has 32B

  • insane, absolutely insane

  • LocalLLaMA @sh.itjust.works

    Chain of Draft: Thinking Faster by Writing Less

    arxiv.org/abs/2502.18600
  • LocalLLaMA @sh.itjust.works

    Atom of Thoughts (AOT): lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1

    bsky.app/profile/sungkim.bsky.social/post/3ljgwfe3flk2h
  • Good luck trying to run a video model locally

    Unless you have top-tier hardware