
Bobbie-Model

```python
# tokenizer and model are loaded as in the Transformers example below.
messages = [{"role": "user", "content": "Summarize this 20k token document..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
# do_sample=True is required for temperature to take effect in generate()
output = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the echoed prompt
print(tokenizer.decode(output[0][inputs.shape[1]:], skip_special_tokens=True))
```

Bobbie works out-of-the-box with vLLM 0.6.0+:
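A minimal serving sketch using vLLM's OpenAI-compatible server. The model id is the one published on Hugging Face; the port, dtype, and sampling values shown are illustrative defaults, not values confirmed by the Bobbie collective:

```shell
# Launch an OpenAI-compatible server (vLLM 0.6.0+)
vllm serve bobbie-collective/bobbie-7b-instruct --dtype bfloat16

# Query it with the standard chat completions endpoint
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "bobbie-collective/bobbie-7b-instruct",
        "messages": [{"role": "user", "content": "Summarize this document."}],
        "max_tokens": 512,
        "temperature": 0.7
      }'
```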

| Stage | Dataset | Tokens | Purpose |
|-------|---------|--------|---------|
| 1 | RedPajama (v2) | 1.2T | Base language modeling |
| 2 | SlimPajama + CodeAlpaca | 400B | Code & reasoning |
| 3 | Synthetic multi-turn chat | 50B | Instruction following |
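As a quick sanity check on the schedule above, the stage budgets can be totted up. A minimal sketch, with the token counts taken straight from the table ("T" = 1e12, "B" = 1e9):

```python
# Token budget per training stage, from the table above.
stages = {
    "RedPajama (v2)": 1.2e12,           # 1.2T
    "SlimPajama + CodeAlpaca": 400e9,   # 400B
    "Synthetic multi-turn chat": 50e9,  # 50B
}
total = sum(stages.values())
print(f"Total training tokens: {total / 1e12:.2f}T")  # → 1.65T
```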

Bobbie is not just another incremental fine-tune. It represents a thoughtful experiment in .

The research collective has hinted at a 13B version with Mixture of Depths (MoD) later this year. Until then, Bobbie-7B deserves a spot in your evaluation pipeline.

Bobbie loses marginally on standard benchmarks but dramatically outperforms on long-context retrieval (RULER). At 32k context, Bobbie is also 36% faster than Llama-3 due to its BiGLU and windowed attention strategy.

5. How to Use Bobbie-Model

The model is available on Hugging Face as `bobbie-collective/bobbie-7b-base` and `bobbie-7b-instruct`.

Transformers Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "bobbie-collective/bobbie-7b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # half-precision weights to halve memory use
    device_map="auto",           # spread layers across available devices
)
```
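Once loaded this way, `model.generate` returns the prompt ids followed by the new ids in a single sequence, which is why the decode step earlier slices with `output[0][inputs.shape[1]:]`. A toy sketch of that idiom using plain lists and made-up token ids (no real tokenizer involved):

```python
# Hypothetical token ids, for illustration only.
prompt_ids = [101, 2023, 2003, 102]          # the encoded prompt
generated = prompt_ids + [7592, 2088, 1012]  # what generate() hands back
new_tokens = generated[len(prompt_ids):]     # keep only the continuation
print(new_tokens)  # → [7592, 2088, 1012]
```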