LocalLLaMA

Welcome to LocalLLaMA! Here we discuss running and developing machine learning models at home. Let's explore cutting-edge open-source neural network technology together.

Get support from the community! Ask questions, share prompts, discuss benchmarks, get hyped at the latest and greatest model releases! Enjoy talking about our awesome hobby.

As ambassadors of the self-hosting machine learning community, we strive to support each other and share our enthusiasm in a positive, constructive way.

Rules:

Rule 1 - No harassment or personal character attacks of community members, i.e., no name-calling, no generalizing about entire groups of people that make up our community, no baseless personal insults.

Rule 2 - No comparing artificial intelligence/machine learning models to cryptocurrency, i.e., no comparing the usefulness of models to that of NFTs, no claiming that the resource usage required to train a model is anything close to that of maintaining a blockchain or mining crypto, no implying it's just a fad/bubble that will leave people with nothing of value when it bursts.

Rule 3 - No comparing artificial intelligence/machine learning to simple text prediction algorithms, i.e., statements such as "LLMs are basically just simple text predictions like what your phone keyboard autocorrect uses, and they're still using the same algorithms since <over 10 years ago>".

Rule 4 - No implying that models are devoid of purpose or potential for enriching people's lives.

founded 2 years ago

Pygmalion-2 has been released (pygmalionai.github.io)

submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

I might be a bit late to the party, but for those of you that like ERP and fiction writing:

Introducing Pygmalion-2

The people from Pygmalion have released a new model, usable for roleplaying, conversation and story writing. It is based on Llama 2 and has been trained on SFW and NSFW roleplay, fictional stories and instruction-following conversations. It is available in two sizes, 7B and 13B parameters. They're also releasing a mix with MythoMax-L2 called Mythalion 13B.

Furthermore, they're (once again) announcing a website with character sharing and inference (later in October).

For reference: Pygmalion-6b was a well-known dialogue model for (lewd) roleplay in the times before LLaMA. It was followed by an underwhelming successor based on LLaMA (Pygmalion-7b). In their new blog post they promise improvements with the new model.

(Personally, I'm curious how it performs compared to MythoMax. There aren't many models around that excel at roleplay or have been designed specifically for that use case.)

[–] [email protected] 4 points 2 years ago* (last edited 2 years ago) (1 children)

~~Perhaps I'm blind, but I didn't see any downloads on that page.~~

I found the download links.

Oh, and of course TheBloke has already quantized the models:

[–] [email protected] 4 points 2 years ago* (last edited 2 years ago) (1 children)

Sorry. I was under the impression that everyone interested in new models has TheBloke's HuggingFace profile on speed-dial. I should have linked them ;-)

[–] [email protected] 2 points 2 years ago

I do, but brain farted and just kept looking on Pygmalion's site.

[–] [email protected] 2 points 2 years ago (1 children)

Very cool that they have a mix with MythoMax right out of the gate. It'll be interesting to see the differences between MythoMax/Pygmalion-2/Mythalion as everyone kicks the tires.

[–] [email protected] 3 points 2 years ago (1 children)

I did some quick testing yesterday and my initial impressions were that Mythalion and Pyg2 (13B q5_K_M versions btw) were a bit more eloquent and verbose in some situations, but they would often take this too far and start writing novels instead of a dialogue. It also felt like they were more prone to taking a sentence and repeating it verbatim as part of all their turns. It's possible that these issues could be toned down by adjusting generation parameters, but MythoMax has been very easy to get good results out of.

It's interesting that you can specify which "mode" pyg2 should operate in as part of the system prompt, but I didn't test how much difference it actually makes to generation. I told it to be in "instruction following mode" and it seemed good enough at general tasks as well.

If I understand pyg2's model card correctly, you're supposed to prefix all turns with <|user|> or <|model|>, which I didn't manage to get text-generation-webui to do in chat-instruct mode, so I just used the notebook tab instead.
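A minimal sketch of that prefixing in Python, for anyone assembling the prompt by hand. The helper name build_pyg2_prompt is made up for illustration, and two things here are assumptions rather than something confirmed above: that the system prompt is opened with a <|system|> tag, and that turns are concatenated with no separator. Check the model card before relying on either.

```python
def build_pyg2_prompt(system_prompt, turns,
                      system_tag="<|system|>",   # assumption: not confirmed in this thread
                      user_tag="<|user|>",
                      model_tag="<|model|>"):
    """Build a raw prompt string where every turn is prefixed with its role tag.

    turns is a list of (role, text) pairs, with role being "user" or "model".
    The prompt ends with a bare model tag so the model writes the next turn.
    """
    parts = [f"{system_tag}{system_prompt}"]
    for role, text in turns:
        if role == "user":
            tag = user_tag
        elif role == "model":
            tag = model_tag
        else:
            raise ValueError(f"unknown role: {role!r}")
        # Assumption: tag and text are joined directly, with no newline between turns.
        parts.append(f"{tag}{text}")
    parts.append(model_tag)
    return "".join(parts)


prompt = build_pyg2_prompt(
    "Enter instruction following mode.",
    [("user", "Summarize this thread in one sentence.")],
)
print(prompt)
```

Printing the result is the point: you can see exactly what would be sent to the model before pasting it into a raw-text tab or passing it to a backend.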

[–] [email protected] 4 points 2 years ago* (last edited 2 years ago) (1 children)

text-generation-webui "chat" and "chat-instruct" modes are... weird and badly documented when it comes to using a specific prompt template. If you don't want to use the notebook mode, use "instruct" mode, set your turn template with the required tags, and include your system prompt in the context (? I forget what it is labeled as) box.

EDIT: Actually I think text-generation-webui might use <|user|> as a special string to mean "substitute the user prefix set in the box directly above the turn template box". Why they have to have a turn template field with "macro" functionality and then separate fields for user and bot prefixes when you could just... put the prefix directly in the turn template I have no idea. It's not as though you would ever want or need to change one without the other anyway. But it's possible that as a result of this you can't actually use <|user|> itself in the turn template...

[–] [email protected] 2 points 2 years ago (1 children)

Seems easier with SillyTavern. They've included screenshots with recommended settings for that in the blog post.

[–] [email protected] 4 points 2 years ago (1 children)

TBH my experience with SillyTavern was that it merely added another layer of complexity/confusion to the prompt formatting/template experience, as it runs on top of text-generation-webui anyway. It was easy for me to end up with configurations where e.g. the SillyTavern turn template would be wrapped inside the text-generation-webui one, and it is very difficult to verify what the prompt actually looks like by the time it reaches the model as this is not displayed in any UI or logs anywhere.

For most purposes I have given up on any UI/frontend and I just work with llama-cpp-python directly. I don't even trust text-generation-webui's "notebook" mode to use my configured sampling settings or to not insert extra end-of-text tokens or whatever.
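A rough sketch of what working with llama-cpp-python directly can look like, written so the exact prompt is printed before it reaches the model. The wrapper name complete_raw, the model filename, the context size and the sampling values are all placeholders chosen for illustration, not settings from this thread. The wrapper accepts any callable with the same keyword arguments as a llama_cpp.Llama instance, which also makes it easy to try without loading a model.

```python
def complete_raw(llm, prompt, max_tokens=200, temperature=0.7, stop=("<|user|>",)):
    """Send a raw prompt string to a completion callable and return the generated text.

    llm is expected to behave like a llama_cpp.Llama instance: called with the
    prompt plus max_tokens/temperature/stop, returning a dict with a "choices" list.
    """
    # Show exactly what the model will receive, including any special tags.
    print(repr(prompt))
    result = llm(
        prompt,
        max_tokens=max_tokens,
        temperature=temperature,
        stop=list(stop),  # stop when the model starts writing the user's turn
    )
    return result["choices"][0]["text"]


def main(model_path="pygmalion-2-13b.Q5_K_M.gguf"):  # hypothetical filename
    # Imported here so the wrapper above can be used without the library installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=4096)  # n_ctx is a placeholder value
    reply = complete_raw(llm, "<|user|>Hello there!<|model|>")
    print(reply)
```

Calling main() with a path to a local GGUF file runs one completion. Nothing wraps or rewrites the prompt in between, so what gets printed is what the model sees.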

[–] [email protected] 3 points 2 years ago

I had exactly the same experiences. I use Koboldcpp and also oftentimes the notebook mode. SillyTavern is super complex and difficult to understand. In this case it's okay. I can copy-paste from screenshots (unless the UI changes).