
Posts: 10 · Comments: 15 · Joined: 1 yr. ago

  • Such dumbasses, even if this were a good strategy, they're still banning one company and letting others (arguably more dangerous ones) go scot-free

  • Alright, I'm waiting on the YouTube playlist

  • Technically it supports fewer languages than Whisper: 40 vs. 99.

    The main problem isn't "bother", it's training data. You need hundreds of thousands of hours of high-quality transcripts to train models like these, and that just doesn't exist for, like, Zulu or whatever

  • LocalLLaMA @sh.itjust.works

    Dolphin: A Large-Scale Automatic Speech Recognition Model for Eastern Languages

    arxiv.org/abs/2503.20212
  • I want to clarify something: "reranker" is a general term that can refer to any model used for reranking. It is independent of implementation.

    What you refer to:

    "because reranker models look at the two pieces of content simultaneously and can be fine-tuned to the domain in question. They shouldn't be used for the initial retrieval because the evaluation time is O(n²) as each combination of input"

    is a specific architecture known as a CrossEncoder, which is common for reranking models but not retrieval ones, for the reasons you described. But you can also use any other architecture
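    The retrieve-then-rerank split can be sketched in a few lines of Python. Both "models" below are toy bag-of-words stand-ins (hypothetical, for illustration only), not a real bi-encoder or CrossEncoder; what matters is the call pattern: retrieval embeds each document once and scores queries with cheap dot products, while the pairwise scorer must see (query, document) together, so it is reserved for reranking a short candidate list.

```python
# Toy contrast between bi-encoder retrieval and cross-encoder reranking.
# Both "models" are bag-of-words stand-ins, hypothetical for illustration;
# a real setup would use e.g. sentence-transformers models instead.

def embed(text):
    """Stand-in bi-encoder: one vector per text, computable in advance."""
    vec = {}
    for tok in text.lower().split():
        vec[tok] = vec.get(tok, 0) + 1
    return vec

def dot(a, b):
    return sum(v * b.get(k, 0) for k, v in a.items())

def cross_score(query, doc):
    """Stand-in cross-encoder: must see both texts at once, so nothing can
    be precomputed - scoring all pairs of n texts is O(n^2) model calls."""
    return dot(embed(query), embed(doc))

docs = [
    "whisper supports 99 languages",
    "dolphin targets eastern languages",
    "reranker models score query-document pairs",
    "chain of draft reasons with fewer tokens",
]

# Retrieval: embed the corpus ONCE, then any query costs n cheap dot products.
index = [(d, embed(d)) for d in docs]

def retrieve(query, k=2):
    q = embed(query)
    ranked = sorted(index, key=lambda item: -dot(q, item[1]))
    return [d for d, _ in ranked[:k]]

def rerank(query, candidates):
    # Expensive pairwise model applied only to the few retrieved candidates.
    return sorted(candidates, key=lambda d: -cross_score(query, d))

query = "which model scores query-document pairs"
top = rerank(query, retrieve(query))
```

    With real models the dot products stay cheap at corpus scale, while each `cross_score` call is a full forward pass, which is why the pairwise model only ever sees the top-k candidates.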

  • On god

  • LocalLLaMA @sh.itjust.works

    Sentence transformers v4

  • Thumbnail looks a little odd when small. You may want to go for a more digital llama aesthetic

  • LocalLLaMA @sh.itjust.works

    NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms

    electricalexis.github.io/notagen-demo/
  • Autotracers can't generate SVGs from text

  • Claude frequently draws SVGs to illustrate things for me (I'm guessing it's in the prompt), but even though it's better at it than all the other models, it still kinda sucks. It's just a fundamentally dumb task for a purely language model, similar to the ARC-AGI benchmark; it just makes more sense for a vision model, and trying to get an LLM to do it is a waste

  • LocalLLaMA @sh.itjust.works

    StarVector - a foundation model for generating svgs

    huggingface.co/starvector/starvector-1b-im2svg
  • What is the license? The link on HF just 404s

  • LocalLLaMA @sh.itjust.works

    EXAONE Deep ━ Setting a New Standard for Reasoning AI - LG AI Research News

    www.lgresearch.ai/news/view
  • Very similar to Chain of Draft but seems more thorough

  • LocalLLaMA @sh.itjust.works

    Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching

    arxiv.org/abs/2503.05179
  • LocalLLaMA @sh.itjust.works

    Sorting-Free GPU Kernels for LLM Sampling

    flashinfer.ai/2025/03/10/sampling.html
  • LocalLLaMA @sh.itjust.works

    Reka Flash, an open-source 21B model comparable to QwQ 32B

  • It matches R1 in the given benchmarks. R1 has 671B params (36B activated) while this only has 32B

  • insane, absolutely insane

  • LocalLLaMA @sh.itjust.works

    Chain of Draft: Thinking Faster by Writing Less

    arxiv.org/abs/2502.18600
  • LocalLLaMA @sh.itjust.works

    Atom of Thoughts (AOT): lifts gpt-4o-mini to 80.6% F1 on HotpotQA, surpassing o3-mini and DeepSeek-R1

    bsky.app/profile/sungkim.bsky.social/post/3ljgwfe3flk2h
  • Good luck trying to run a video model locally

    Unless you have top-tier hardware