> If I can pick my own API (including local) and sampling parameters
You can do this now:
- self-host ollama.
- self-host open-webui and point it at ollama.
- enable local models in about:config (rough prefs sketched below).
- select "local" instead of ChatGPT or w/e.
The hardest part is hosting open-webui because, AFAIK, it only ships as a Docker image (rough docker sketch below).
Edit: s/openai/open-webui
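For reference, a minimal sketch of the docker side, assuming the stock images; the image names, ports, and the OLLAMA_BASE_URL variable are from memory, so check the open-webui README before copy-pasting:

    # ollama also runs natively, but the container works too (API on 11434)
    docker run -d --name ollama -p 11434:11434 -v ollama:/root/.ollama ollama/ollama

    # open-webui on host port 3000, pointed at the ollama API on the host
    docker run -d --name open-webui -p 3000:8080 \
      --add-host=host.docker.internal:host-gateway \
      -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
      -v open-webui:/app/backend/data \
      ghcr.io/open-webui/open-webui:main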
Basically everything it's used for that isn't being shoved in your face 24/7.
Lots of these existed before the AI hype, to the point that they're taken for granted, but they are as much AI as an LLM or an image generator. All the consumer-level AI services range from annoying to dangerous.