
https://github.com/karpathy/minGPT

Andrej has this nice repo, including some examples. You can train your own GPT from scratch.
In theory, even a GPU with 8 GB of VRAM should handle the 124M model.

For example, I managed to get fine-tuning of the 124M GPT-2 model running (https://github.com/minimaxir/gpt-2-simple ) on my laptop with an RTX 2070 / 8 GB. It took a bit of hacking: it needs tensorflow-gpu@1.13.1, gpt-2-simple@0.6 and Ubuntu 18. The most tedious part was installing the right CUDA version for that TensorFlow (CUDA 10).
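
Roughly, a fine-tuning run with gpt-2-simple looks like the sketch below. This is just an illustration, not the exact commands from that setup: the file name and step count are placeholders, and depending on the package release the small checkpoint may be named "124M" or "117M".

import gpt_2_simple as gpt2

# fetch the small GPT-2 checkpoint into ./models/124M (one-time download)
gpt2.download_gpt2(model_name="124M")

# start a TF session and fine-tune on a plain-text corpus
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="input.txt",   # placeholder: path to your training text
              model_name="124M",
              steps=1000)            # placeholder step count

# sample from the fine-tuned model
gpt2.generate(sess)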

Oh, and I was running it through WSL2 on Windows 11. So for example I have one WSL2 instance with Ubuntu 18 and CUDA 10, and another WSL2 instance with Ubuntu 20 where I run CUDA 11 and Stable Diffusion. The CUDA bridge from Windows into WSL2 works beautifully.




kyberbubus      23.10.2022 - 17:45:58, level: 1
I'm training my own GPT-2 "from scratch" via fine-tuning on this, also 124M. Is there any difference? (apart from the fact that I run it remotely)
https://colab.research.google.com/github/sarthakmalik/GPT2.Training.Google.Colaboratory/blob/master/Train_a_GPT_2_Text_Generating_Model_w_GPU.ipynb


drakh      23.10.2022 - 21:30:08, level: 2
Well, either you're training from scratch or you're fine-tuning :)

BTW, in Google Colab you can usually fine-tune even the medium model, at least with minimaxir's gpt-2-simple.
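
If I recall the gpt-2-simple interface correctly, switching to the medium model is mostly a matter of the model name (older releases call it "345M" rather than "355M"); something like:

import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="355M")    # the "medium" checkpoint
sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="input.txt",       # placeholder training text
              model_name="355M",
              steps=1000)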

kyberbubus      24.10.2022 - 01:10:08, level: 3
To me it seemed like the same thing, because the training corpus probably doesn't contain Slovak, so it was learning it from nothing. Still, it started producing clean Slovak fairly quickly and then just kept refining it, even though the model was supposedly trained overwhelmingly on English?


drakh      24.10.2022 - 09:13:12, level: 4
The important thing is that the BPE encoder (even though it is "optimized" for English) splits any string the same way, so the model picks up Slovak very easily during further training.
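
A quick way to see this, using the same mingpt.bpe encoder that the script further down imports (the sample sentence is arbitrary): any Unicode string gets broken into known subword/byte pieces, it just tends to take more tokens than comparable English text.

from mingpt.bpe import get_encoder

enc = get_encoder()  # GPT-2's byte-level BPE vocabulary

ids = enc.encode("Toto je veta po slovensky.")   # arbitrary Slovak sample
print(ids)               # token IDs; Slovak splits into more pieces than English would
print(enc.decode(ids))   # round-trips back to the original string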

drakh      30.09.2022 - 09:27:15 (modif: 30.09.2022 - 09:31:10), level: 1
https://github.com/drakh/minGPT/blob/master/projects/gpt-2/gpt-2.py


import os
import sys
import torch
from mingpt.model import GPT
from mingpt.bpe import get_encoder, BPETokenizer
from mingpt.trainer import Trainer
from torch.utils.data import Dataset
from torch.utils.data.dataloader import DataLoader
from mingpt.utils import set_seed, setup_logging, CfgNode as CN


class CharDataset(Dataset):
    """
    Emits batches of BPE token IDs (adapted from minGPT's chargpt example).
    """

    @staticmethod
    def get_default_config():
        C = CN()
        C.block_size = 1024
        return C

    def __init__(self, config, data):
        self.encoder = get_encoder()
        self.config = config

        # tokenize the whole training text with GPT-2's BPE encoder
        encoded = self.encoder.encode(data)
        data_size = len(encoded)
        vocab_size = len(self.encoder.encoder)
        print('data: %d tokens, vocab_size: %d' % (data_size, vocab_size))

        self.vocab_size = vocab_size
        self.data = encoded

    def get_vocab_size(self):
        return self.vocab_size

    def get_block_size(self):
        return self.config.block_size

    def get_item(self, idx):
        # take block_size + 1 consecutive tokens; x is the input sequence,
        # y is the same sequence shifted by one (next-token targets)
        chunk = self.data[idx:idx + self.config.block_size + 1]
        x = torch.tensor(chunk[:-1], dtype=torch.long)
        y = torch.tensor(chunk[1:], dtype=torch.long)
        return x, y

    def __len__(self):
        return len(self.data) - self.config.block_size

    def __getitem__(self, idx):
        return self.get_item(idx)

# -----------------------------------------------------------------------------

print('loading data')
text = open('input.bak.txt', 'r').read()
print('data loaded')

print('preparing dataset')
train_dataset = CharDataset(CharDataset.get_default_config(), text)

model_config = GPT.get_default_config()
model_config.model_type = 'gpt2'  # the 124M configuration
model_config.vocab_size = train_dataset.vocab_size
model_config.block_size = train_dataset.get_block_size()
model = GPT(model_config)

train_config = Trainer.get_default_config()
# batch_size=1 lets you train this model on 8 GB of VRAM; as far as I know OpenAI used 512
train_config.batch_size = 1
trainer = Trainer(train_config, model, train_dataset)

tokenizer = BPETokenizer()


def batch_end_callback(trainer):
    if trainer.iter_num % 10 == 0:
        print(f"iter_dt {trainer.iter_dt * 1000:.2f}ms; iter {trainer.iter_num}: train loss {trainer.loss.item():.5f}")

    if trainer.iter_num % 500 == 0:
        # periodically sample from the model and save a checkpoint
        model.eval()
        with torch.no_grad():
            # sample from the model...
            context = "This is our starter text"
            x = tokenizer(context).to(trainer.device)
            y = model.generate(x, 500, temperature=1.0, do_sample=True, top_k=10)[0]
            decoded = tokenizer.decode(y)
            print(decoded)
        # save the latest model
        print("saving model")
        ckpt_path = os.path.join('./', "model.pt")
        torch.save(model.state_dict(), ckpt_path)
        # revert the model to training mode
        model.train()


trainer.set_callback('on_batch_end', batch_end_callback)

trainer.run()



drakh      29.09.2022 - 19:38:20, level: 1
So, I took apart his "chargpt" example and modified it so that it works with BPE encoding (i.e. the same tokenization as GPT).

With block_size=1024 (the context length it can hold), the same as GPT-2, and batch_size=1 (if I found the right figure, OpenAI used batch_size=512 for GPT-2), it can train the 124M model from scratch on my GPU.
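
Summarizing those numbers against the script above (the 512 figure is the reported OpenAI value, not something verified here):

# the two settings that make the from-scratch run fit on a consumer GPU
model_config.block_size = 1024   # context length, same as GPT-2
train_config.batch_size = 1      # fits in 8 GB of VRAM

# tokens seen per optimizer step:
#   this setup:            1 * 1024 =   1,024 tokens/step
#   OpenAI (reportedly): 512 * 1024 = 524,288 tokens/step
# so training works, it is just far slower per update and much noisier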