Machine Learning - Learning/Language Models

Discussion of models, their use, setup, and options.

Please include models used with your outputs, workflows optional.

Model Catalog

We follow Lemmy’s code of conduct.

founded 2 years ago

MODERATORS

[email protected]

NousResearch/Nous-Hermes-Llama2-13b · Hugging Face (huggingface.co)

submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

Nous-Hermes-Llama2-13b is currently the highest-ranked 13B LLaMA fine-tune on the Open LLM Leaderboard.

Model Description

Nous-Hermes-Llama2-13b is a state-of-the-art language model fine-tuned on over 300,000 instructions. This model was fine-tuned by Nous Research, with Teknium and Emozilla leading the fine-tuning process and dataset curation, Redmond AI sponsoring the compute, and several other contributors.

This Hermes model uses the exact same dataset as Hermes on Llama-1. This keeps the new Hermes consistent with the old one, for anyone who wants a model as similar as possible to the original Hermes, just more capable.

This model stands out for its long responses, lower hallucination rate, and absence of OpenAI censorship mechanisms. The fine-tuning process was performed with a 4096 sequence length on an 8x A100 80GB DGX machine.
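
For a rough sense of scale, here is a back-of-the-envelope sketch comparing the memory needed for the weights alone against the hardware mentioned above. It assumes the nominal 13 billion parameters implied by the model name and 2 bytes per parameter (fp16/bf16); the post does not state the training precision, and gradients, optimizer state, and activations would add considerably more on top.

```python
# Back-of-the-envelope memory arithmetic. Figures are nominal assumptions,
# not measurements from the actual training run.
PARAMS = 13e9        # "13B" taken from the model name (nominal, not exact)
GPUS = 8             # 8x A100, as stated in the post
GPU_MEM_GB = 80      # 80 GB per GPU, as stated in the post


def weight_memory_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed to hold the weights alone, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


fp16_gb = weight_memory_gb(PARAMS, 2)   # assumed: 2 bytes per parameter
fp32_gb = weight_memory_gb(PARAMS, 4)   # assumed: 4 bytes per parameter
total_gpu_gb = GPUS * GPU_MEM_GB        # aggregate memory across the machine

print(f"fp16 weights: {fp16_gb:.0f} GB")
print(f"fp32 weights: {fp32_gb:.0f} GB")
print(f"aggregate GPU memory: {total_gpu_gb} GB")
```

Under these assumptions the half-precision weights fit on a single 80 GB card, which is part of why 13B models are popular for local use; the full 8-GPU machine is what makes room for the rest of the fine-tuning state.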

Announcements
