Trying to install guanaco (pip install guanaco) for a text classification model, but getting an error
I'm trying to install the Guanaco language model (https://arxiv.org/abs/2305.14314) using pip install guanaco for a text classification model, but I get an error:
Failed to build guanaco
ERROR: Could not build wheels for guanaco, which is required to install pyproject.toml-based projects
How do I install the language model and use it for classification?
1 Answer
The PyPI package you tried to install with pip install guanaco is not the large language model supported by the Hugging Face tooling. It is a different project: https://pypi.org/project/guanaco/
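The code in this answer imports torch, peft, and transformers rather than the guanaco package. As a minimal sketch (the helper name `missing_packages` is hypothetical, and the list covers only the packages imported in this answer), you could check which of them are importable in your environment before loading the model:

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


# Packages imported by the code in this answer (not the `guanaco` package from PyPI).
required = ["torch", "peft", "transformers"]
missing = missing_packages(required)
if missing:
    print("Install first: pip install " + " ".join(missing))
else:
    print("All required packages are importable.")
```

Other packages may also be needed depending on your setup, so treat the list as a starting point rather than a complete requirements file.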
To use the Guanaco model, see https://colab.research.google.com/drive/17XEqL1JcmVWjHkT-WczdYkJlNINacwG7?usp=sharing.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, LlamaTokenizer, StoppingCriteria, StoppingCriteriaList, TextIteratorStreamer
model_name = "decapoda-research/llama-7b-hf"
adapters_name = 'timdettmers/guanaco-7b'
print(f"Starting to load the model {model_name} into memory")
m = AutoModelForCausalLM.from_pretrained(
    model_name,
    #load_in_4bit=True,
    torch_dtype=torch.bfloat16,
    device_map={"": 0}
)
m = PeftModel.from_pretrained(m, adapters_name)
m = m.merge_and_unload()
tok = LlamaTokenizer.from_pretrained(model_name)
tok.bos_token_id = 1
stop_token_ids = [0]
print(f"Successfully loaded the model {model_name} into memory")
Then use the model:
prompt = "Today was an amazing day because"
# Move the input tensors to the same device as the model (GPU 0 above).
inputs = tok(prompt, return_tensors="pt").to(m.device)
outputs = m.generate(**inputs, do_sample=True, num_beams=1, max_new_tokens=100)
tok.batch_decode(outputs, skip_special_tokens=True)
[out]:
['Today was an amazing day because I met M, my bestie from Bermuda in 2002.\nWe have not seen each other for 8 years and I was thrilled to meet her and her husband. We went out for lunch and then went for a walk in the park. We caught each other up on our lives and just laughed and laughed. I love her so much and I am so glad we are back in touch. It was like no time had passed at all. I am so']
To use it for zero-shot classification:
from transformers import pipeline
tok.add_special_tokens({'pad_token': '[PAD]'})
classifier = pipeline("zero-shot-classification", model=m, tokenizer=tok)
classifier("Today was an amazing day", candidate_labels=["negative", "positive"])
[out]:
{'sequence': 'Today was an amazing day',
'labels': ['positive', 'negative'],
'scores': [0.7662936449050903, 0.23370634019374847]}
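Since Guanaco is a generative model, another option is to ask it for the label in a prompt and parse the completion. The following is only a sketch under stated assumptions: `build_prompt`, `classify`, and `generate_fn` are hypothetical names, the "### Human: / ### Assistant:" prompt layout is an assumption about the chat format Guanaco was tuned on, and `fake_generate` stands in for the real model so the example runs without a GPU.

```python
def build_prompt(text, labels):
    """Format a classification request in an assumed '### Human / ### Assistant' style."""
    options = ", ".join(labels)
    return (
        f"### Human: Classify the text as one of: {options}.\n"
        f"Text: {text}\n"
        "### Assistant:"
    )


def classify(text, labels, generate_fn):
    """Return the label mentioned first in the completion, or None if none appears."""
    completion = generate_fn(build_prompt(text, labels)).lower()
    positions = {label: completion.find(label.lower()) for label in labels}
    found = {label: pos for label, pos in positions.items() if pos >= 0}
    return min(found, key=found.get) if found else None


# Stand-in for the real model: takes a prompt string, returns a completion string.
def fake_generate(prompt):
    return " Positive. The text expresses joy."


print(classify("Today was an amazing day", ["negative", "positive"], fake_generate))
# prints: positive
```

With the model loaded above, `generate_fn` could wrap `tok`, `m.generate`, and `tok.decode` so that it returns only the newly generated text. Matching labels by substring is crude, so check the results on a few labeled examples before relying on them.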