Recent comments in /f/MachineLearning
Daveboi7 t1_jdneqdx wrote
Reply to comment by dreamingleo12 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Ah. I’m trying to train the Dolly model developed by Databricks.
dreamingleo12 t1_jdnel6b wrote
Reply to comment by Daveboi7 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
You can just follow Stanford Alpaca’s github instructions, as long as you have LLaMA weights. It’s straightforward.
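For reference, here is a minimal sketch of what an Alpaca-style supervised fine-tune looks like with Hugging Face transformers. This is not the actual Stanford Alpaca train.py, just an illustration; the model path, data file, prompt format, and hyperparameters are placeholder assumptions.

```python
# Minimal sketch of Alpaca-style instruction fine-tuning (illustrative only).
# Paths, hyperparameters, and the data file are placeholders, not the
# official Stanford Alpaca setup.
import json

import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)

MODEL_PATH = "path/to/llama-7b"   # converted LLaMA weights (placeholder)
DATA_PATH = "alpaca_data.json"    # instruction/response pairs (placeholder)

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # LLaMA has no pad token by default
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH, torch_dtype=torch.bfloat16)


class InstructionDataset(Dataset):
    """Turns instruction/response pairs into causal-LM training examples."""

    def __init__(self, path, tokenizer, max_len=512):
        with open(path) as f:
            records = json.load(f)
        self.examples = []
        for r in records:
            text = (f"### Instruction:\n{r['instruction']}\n\n"
                    f"### Response:\n{r['output']}")
            enc = tokenizer(text, truncation=True, max_length=max_len,
                            padding="max_length", return_tensors="pt")
            labels = enc.input_ids[0].clone()
            labels[enc.attention_mask[0] == 0] = -100  # ignore padding in the loss
            self.examples.append({"input_ids": enc.input_ids[0],
                                  "attention_mask": enc.attention_mask[0],
                                  "labels": labels})

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        return self.examples[idx]


args = TrainingArguments(
    output_dir="alpaca-style-finetune",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    learning_rate=2e-5,
    bf16=True,
    logging_steps=10,
)

Trainer(model=model, args=args,
        train_dataset=InstructionDataset(DATA_PATH, tokenizer)).train()
```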
Daveboi7 t1_jdnedrd wrote
Reply to comment by dreamingleo12 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
But which cloud service did you use to train them?
I tried using Databricks to train a model, but the setup was too complicated.
I’m wondering whether there’s a more straightforward platform to train on.
mrfreeman93 t1_jdnebrv wrote
Reply to [R] Reflexion: an autonomous agent with dynamic memory and self-reflection - Noah Shinn et al 2023 Northeastern University Boston - Outperforms GPT-4 on HumanEval accuracy (0.67 --> 0.88)! by Singularian2501
I think it was already well known that it would fix its own errors when provided with the error message; this is not a breakthrough.
dreamingleo12 t1_jdndzmt wrote
Reply to comment by Daveboi7 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
No, I don’t use Databricks. I’ve only tried LLaMA and Alpaca.
Daveboi7 t1_jdndvq0 wrote
Reply to comment by dreamingleo12 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
With Databricks?
dreamingleo12 t1_jdndszl wrote
Reply to comment by Daveboi7 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
I trained the model in the cloud.
Daveboi7 t1_jdnczd9 wrote
Reply to comment by dreamingleo12 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
My bad. Did you train the model locally on your PC or in the cloud?
mxby7e t1_jdncs51 wrote
Reply to comment by ebolathrowawayy in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
From my understanding, it’s limited to non-commercial use, so you can use it for what you need, but not commercially.
ebolathrowawayy t1_jdnc05i wrote
Reply to comment by mxby7e in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
But what if you're training a model for a narrow use-case and don't intend for anyone to use it except for a niche set of users? Is that enough to be in the clear? Or is any use of OpenAI's model output to train a model for any purpose a no-no?
bumbo-pa t1_jdnbwyr wrote
Reply to comment by Zealousideal_Low1287 in [D] Do you use a website or program to organise and annotate your papers? by who_here_condemns_me
I write crimson notes on my cave wall.
EchoMyGecko t1_jdnbrve wrote
Reply to [D] Do you use a website or program to organise and annotate your papers? by who_here_condemns_me
My annotations make too little sense for anyone but me to understand them :)
I save them as PDFs and write notes in the margins.
RiyazRockz t1_jdnbroi wrote
Reply to comment by machineko in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
Hey, I want to fine-tune a model to solve a pharma-related problem, and I want to know if I can fine-tune my model with this. Could you please share your contact details so that I can learn more about this?
Fit-Recognition9795 t1_jdnbjar wrote
Reply to comment by Eggy-Toast in [D] What happens if you give as input to bard or GPT4 an ASCII version of a screenshot of a video game and ask it from what game it has been taken or to describe the next likely action or the input? by Periplokos
Even worse, GPT-4 doesn’t know it is GPT-4.
I have ChatGPT Plus, and the above answer was generated using the GPT-4 model.
Thebadwolf47 t1_jdnbfya wrote
Reply to comment by Short_Change in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
Wasn’t he comparing the parameters to the volume of the first computer rather than its transistor count?
Simusid t1_jdnag2i wrote
Reply to comment by RiotSia in [D] Simple Questions Thread by AutoModerator
I’m unable to connect to hamata.so. Can you tell me what kind of analysis you want to do?
philipgutjahr t1_jdn95o8 wrote
Reply to comment by michaelthwan_ai in [N] March 2023 - Recent Instruction/Chat-Based Models and their parents by michaelthwan_ai
For completeness, you should also add all the proprietary models: Megatron-Turing (530B, NVIDIA), Gopher (280B, Google), Chinchilla (70B, DeepMind), and Chatgenie (WriteCream).
currentscurrents t1_jdn7spo wrote
Reply to comment by pornthrowaway42069l in [N] GPT-4 has 1 trillion parameters by mrx-ai
Bigger models are more sample-efficient for a given amount of data.
Scale is a triangle of three factors: model size, data size, and compute. If you want to make more efficient use of data, you need to increase the other two.
In practice, LLMs are not data-limited right now; they’re limited by compute and model size, which is why you see models like LLaMA that throw huge amounts of data at smaller models.
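As a rough illustration of that trade-off, here is a back-of-the-envelope sketch using the approximate Chinchilla heuristic of ~20 training tokens per parameter and the common estimate of ~6 · params · tokens FLOPs for training compute; both constants are assumptions for illustration, not figures from this thread.

```python
# Back-of-the-envelope compute/data/model-size illustration.
# Constants (~20 tokens/param, ~6*params*tokens FLOPs) are rough assumptions.

def chinchilla_optimal_tokens(params: float) -> float:
    """Approximate data-optimal token count for a given model size."""
    return 20.0 * params

def training_flops(params: float, tokens: float) -> float:
    """Rough estimate of total training compute in FLOPs."""
    return 6.0 * params * tokens

for params in (7e9, 70e9, 1e12):  # 7B, 70B, and a hypothetical 1T model
    tokens = chinchilla_optimal_tokens(params)
    flops = training_flops(params, tokens)
    print(f"{params/1e9:>7.0f}B params -> ~{tokens/1e12:.1f}T tokens, "
          f"~{flops:.2e} training FLOPs")
```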
Eggy-Toast t1_jdn7cki wrote
Reply to comment by Fit-Recognition9795 in [D] What happens if you give as input to bard or GPT4 an ASCII version of a screenshot of a video game and ask it from what game it has been taken or to describe the next likely action or the input? by Periplokos
ChatGPT doesn’t know that GPT-4 has multimodal input though, right? Based on “not [designed] to analyze images or visual data,” I assume this is the case.
Puzzleheaded_Acadia1 t1_jdn6ugl wrote
Reply to comment by soggy_mattress in [D] Do we really need 100B+ parameters in a large language model? by Vegetable-Skill-9700
I see a future where LLMs or LLaMA-style models that are multimodal, or any other new kind of artificial intelligence, run on ESP32-level hardware. I don’t know how that will work, but I’m pretty sure we are heading there.
pornthrowaway42069l t1_jdn6noe wrote
Reply to [N] GPT-4 has 1 trillion parameters by mrx-ai
Not going to deny that GPT-4 looks impressive, but they could set up 10 bajillion-quadrillion parameters; the question is, do they have the data to effectively utilize all of them? Maybe it’s time to start looking into decreasing the number of parameters and making more efficient use of the data.
Professional_Price89 t1_jdn6lhn wrote
Reply to [D] I just realised: GPT-4 with image input can interpret any computer screen, any userinterface and any combination of them. by Balance-
Let’s integrate winSpy with it.
ajingnk t1_jdn5uwr wrote
Reply to [D] Simple Questions Thread by AutoModerator
What is the minimum hardware requirement to fine-tune something like Stanford Alpaca? I am thinking of building a workstation to do some DL exploration and fine-tuning work. For fine-tuning, I have around 10k samples.
Ph0masta t1_jdn5o16 wrote
Where does Google’s LaMDA fit on this chart?
dreamingleo12 t1_jdnewt2 wrote
Reply to comment by Daveboi7 in [R] Hello Dolly: Democratizing the magic of ChatGPT with open models by austintackaberry
It’s just Alpaca with a different base model. Databricks boasted too much.