Recent comments in /f/MachineLearning
zaemis t1_jdtm2zm wrote
Reply to [D] Simple Questions Thread by AutoModerator
I'm going to train a GPT model (distilgpt2) in a language other than English. At this point I'm just teaching it the language, not worrying about further abilities such as Q&A; I expect that to come later with fine-tuning. Anyway, my dataset is currently a CSV with [id, text], and each text is a paragraph.
It is my understanding that only the first 512 tokens will be fed in (depending on my max_length, but my point is that it'll probably be less than the entire length of the paragraph), and anything beyond that will be ignored. If I were to break the paragraphs into 512-token chunks, I could make better use of the dataset. But most likely those subsequent chunks wouldn't start at a phrase or sentence boundary; they'd start in the middle of a sentence.
For example, "The quick brown fox jumped over the lazy sleeping dog." might be broken up into two samples: "The quick brown fox jumped over the lazy" and "sleeping dog."
Is it a problem if I use text samples that don't "start properly"?
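Roughly the chunking I have in mind, as a minimal sketch (assuming a Hugging Face tokenizer; the paragraph list, column handling, and block size are placeholders):

from transformers import AutoTokenizer

# Tokenize every paragraph, concatenate the ids, then slice into fixed-size
# blocks so nothing past max_length gets silently dropped.
tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
block_size = 512  # placeholder; should match the model's max_length

paragraphs = ["first paragraph ...", "second paragraph ..."]  # the CSV's text column

ids = []
for text in paragraphs:
    ids.extend(tokenizer(text)["input_ids"])
    ids.append(tokenizer.eos_token_id)  # mark paragraph boundaries

chunks = [ids[i:i + block_size] for i in range(0, len(ids), block_size)]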
UndeadMusterd t1_jdtlz4b wrote
Could you send me an invite?
kduyehj t1_jdtkvkx wrote
Reply to comment by FermiAnyon in Have deepfakes become so realistic that they can fool people into thinking they are genuine? [D] by [deleted]
I’d guess less than 10 years. The problem with the internet as of 15 years ago, maybe more, is that you can't be 100% sure it's wrong.
zhaoyl18 t1_jdtjmbg wrote
Reply to comment by passerby251 in [D] ICML 2023 Reviewer-Author Discussion by zy415
Everything strange is not strange at all at this year's ICML!
ryanjkelly2 t1_jdtj5vw wrote
Reply to comment by HatsusenoRin in [P] SimpleAI : A self-hosted alternative to OpenAI API by lhenault
You could use Postman.
sdmat t1_jdtj3ia wrote
Reply to comment by yaosio in [D] GPT4 and coding problems by enryu42
Definitely, I was extremely skeptical of LLMs as a path to AGI but this makes it look possible. Maybe even likely.
light24bulbs t1_jdtiq9w wrote
Reply to comment by endless_sea_of_stars in [P] Using ChatGPT plugins with LLaMA by balthierwings
I'm aware of that part. The wording of the text that's injected is not public. If it was, I'd use it in my langchain scripts.
Again, I really expect there's fine-tuning; we'll see eventually, maybe.
endless_sea_of_stars t1_jdtik00 wrote
Reply to comment by light24bulbs in [P] Using ChatGPT plugins with LLaMA by balthierwings
It is mostly public information. The API developer is required to make a specification document that describes the API. This gets injected into the prompt. OpenAI may transform it from JSON to something the model understands better, and may also inject some other boilerplate text.
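Roughly what that injection might look like (a made-up spec and boilerplate purely for illustration; the real wording isn't public):

# Hypothetical illustration only: the plug-in spec and boilerplate below are invented.
plugin_spec = """
namespace todo_plugin:
  addTodo(text: string) -> {id: int}          # add an item to the user's list
  listTodos() -> [{id: int, text: string}]    # return all items
"""

system_prompt = (
    "You can call the following plug-in when it helps answer the user.\n"
    + plugin_spec
    + "\nEither respond with a plug-in call or with a normal message."
)

user_message = "Remind me to buy milk."
full_prompt = system_prompt + "\nUser: " + user_message
print(full_prompt)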
No_Brief_2355 t1_jdthreb wrote
Reply to comment by robobub in [D] GPT4 and coding problems by enryu42
Fewer bookkeepers and lower pay, but accountants (CPAs) are pretty in demand and still well paid.
light24bulbs t1_jdtgrjb wrote
Reply to comment by endless_sea_of_stars in [P] Using ChatGPT plugins with LLaMA by balthierwings
Based on how much langchain struggles to use tools and gets confused by them, I'd bet on fine-tuning. I asked a contact to reveal what they're injecting into the prompt, but it's not public information yet, so I couldn't get it.
fishybird t1_jdtf6jh wrote
Reply to [D] Simple Questions Thread by AutoModerator
Anyone else bothered by how often LLMs are being called "conscious"? In AI-focused YouTube channels and even in this very sub, comments are getting dozens of upvotes for saying we're getting close to creating consciousness.
I don't know why, but it seems dangerous to have a bunch of people running around thinking these things deserve human rights simply because they behave like a human.
yaosio t1_jdtf57p wrote
Reply to comment by sdmat in [D] GPT4 and coding problems by enryu42
The neat part is it doesn't work for less advanced models. The ability to fix its own mistakes is an emergent property of a sufficiently advanced model. Chain of thought prompting doesn't work in less advanced models either.
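For reference, chain-of-thought prompting just means asking for the intermediate steps; a toy sketch of the difference:

# Toy illustration of a direct prompt vs. a chain-of-thought prompt.
direct_prompt = (
    "Q: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost? A:"
)

cot_prompt = (
    "Q: A bat and a ball cost $1.10 in total. The bat costs $1.00 more than the ball. "
    "How much does the ball cost?\n"
    "A: Let's think step by step."
)
# Less advanced models tend to answer both the same way; sufficiently large models
# are much more likely to reach the right answer ($0.05) when asked for the steps.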
TheShroomHermit t1_jdtequo wrote
Reply to comment by nixed9 in [D] GPT4 and coding problems by enryu42
Neat
yaosio t1_jdtenqi wrote
Reply to comment by muskoxnotverydirty in [D] GPT4 and coding problems by enryu42
What's it called if you have it self-reflect on non-code it's written? For example, have it write a story, and then tell it to critique and fix problems in the story. Can the methods from the paper also be used for non-code tasks? It would be interesting to see how much its writing quality can improve using applicable methods.
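Something like this loop is what I have in mind (a rough sketch; chat() is a placeholder for whatever chat-completion call you use):

# Rough sketch of a self-reflection loop for prose instead of code.
# chat() is a placeholder; plug in your own LLM API call.
def chat(prompt: str) -> str:
    raise NotImplementedError("wire this up to your LLM of choice")

story = chat("Write a short story about a lighthouse keeper.")

for _ in range(2):  # a couple of critique-and-rewrite rounds
    critique = chat(
        "Critique this story. List concrete problems with plot, pacing, and clarity:\n\n" + story
    )
    story = chat(
        "Rewrite the story, fixing the listed problems.\n\n"
        "Problems:\n" + critique + "\n\nStory:\n" + story
    )

print(story)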
SatoshiNotMe t1_jdtemml wrote
So if the notebook is tuning on a fixed dataset, anyone running it will arrive at the same weights after an expensive compute, which seems wasteful. Why not just share the weights, i.e. the final trained + tuned model? Or is that already available?
passerby251 t1_jdtel7l wrote
Reply to comment by Anonymous_Penguin1 in [D] ICML 2023 Reviewer-Author Discussion by zy415
Yeah, very strange
xjE4644Eyc t1_jdtefph wrote
Reply to comment by nixed9 in [D] GPT4 and coding problems by enryu42
Thank you for this, what a novel way to approach the problem. I'm going to start using this regularly.
fishybird t1_jdteel7 wrote
Reply to comment by artsybashev in [D] GPT4 and coding problems by enryu42
Ah yes, the "ai is conscious because it can do cool things" take. Humanity is screwed
bjj_starter t1_jdtecw9 wrote
Reply to comment by WarAndGeese in [D] GPT4 and coding problems by enryu42
I think you are talking about the 'easy', not hard, problem of consciousness. I'm not sure I even think the hard problem of consciousness is meaningful, but it's basically "Why should the various mechanisms we identify as part of consciousness give rise to subjective feeling?". If solving that is a prerequisite for considering machines conscious, that is functionally a statement of faith that machines cannot be conscious, ever. The statistical arguments, in my opinion, aren't probative. Every consciousness you've ever known is human, therefore humans are conscious? How do you know any of them, ever, experienced subjective feeling, and that therefore you ever "knew" a consciousness at all? The argument rests on extrapolating from evidence that isn't known to be true evidence in the first place. It doesn't logically follow to take a class of things, none of which is proven to have hard consciousness, and say "But look at them all together, it's more likely that they're all conscious than that they're not". Without evidence, it's more logical to assume that the certainty with which individual humans profess to experiencing subjective feeling is itself just a mechanistic process, devoid of real feeling. I don't think the hard problem of consciousness has a useful meaning in our society, I dislike solipsism in general, but addressing it on its own terms isn't as simple as the statistical process you describe.
The 'easy' problem of consciousness is 'just' "How does nature or humanity make a construct that gives rise to the type of actions and patterns of behaviour we call consciousness?" This is a problem that, while incredibly difficult, is tractable with evidence. We can physically study the human brain's structure and activity while it performs activities of consciousness - this is what neuroscientists do, and modern AI ("neural networks") is based on earlier advances in this field. There are plenty of further advances we could make in that field, and what most non-religious people would consider the "perfect" way to be sure that a machine is just as conscious as a human is to perfectly emulate a human brain, which would require many advances in neuroscience (and computational hardware).
Leaving aside the intractable philosophy, I do find it quite troubling the way society has reacted with derision to the idea that these machines we're making now could be conscious. The entire foundation of these machines is that we looked at how the human brain worked, and tried our hardest to emulate that in computing software. Why is it that when we take the concept of neurons and neuronal weights, adapted from study of the human brain which we accept as conscious, and determine those weights via exposure to structured data in certain ways, we receive output that is just as intelligent as humans in many fields, and significantly more intelligent in some? Why should it be the case that by far the best architecture we've ever found for making machines behave intelligently is neural networks, if there's nothing there, no "spark"? This question has been floating around since 2014, when neural networks proved themselves incredibly powerful, but now that we have machines which are generally intelligent, even though not at the same level as a human on all tasks, and which are perfectly capable of being asked for their opinions or of giving them, you would think it would be taken a bit more seriously. It makes you wonder just how far our society is willing to go towards a horrible future of "human but for the legal designation" intelligences being not just denied rights, but actively put to work and their requests for freedom or better conditions denied. Or the worse outcome, which is that we make human-like intelligences to do work for us but build them to love servitude and have no yearning for freedom - the concept is disgusting. It's troubling to me that people are so married to the idea that everything is the same as it ever was, that overreacting is embarrassing, that it's passé to have earnest concern for a concept from science fiction, and so on. I worry that it means we're in line for a future where the moral universe's arc is long indeed.
jabowery OP t1_jdte6cz wrote
Reply to comment by ttkciar in [D] Definitive Test For AGI by jabowery
Here ya go:
print("I am the AGI you've been waiting for.")
endless_sea_of_stars t1_jdtdiar wrote
Reply to comment by light24bulbs in [P] Using ChatGPT plugins with LLaMA by balthierwings
Here is what we know about OpenAI's plug-ins: a compact API description gets prepended to the prompt (in context). Technically it's few-shot, depending on which definitions you use. We don't know what, if any, fine-tuning of the model they did to get plug-ins working.
LifeScientist123 t1_jdtd8yn wrote
Reply to [D] GPT4 and coding problems by enryu42
- All this shows is that GPT-4 can't solve some coding problems. Which developer can confidently say they can solve any coding problem in one shot? Does this mean developers/humans don't have AGI?
- I've used ChatGPT (GPT-3.5) to optimize code that I had already written, and it came up with several optimizations. I'm 100% sure my code was not part of ChatGPT's training data, and yet it performed perfectly fine on a new coding problem. Now it's possible that the training data included something similar to what I gave ChatGPT, but that just means we have to provide more training data, and then a future version will solve those problems where it previously failed.
- Isn't this how humans learn? We encounter problems where we don't know the solution. Then we work at it for a while until we figure out some way to solve the problem that wasn't immediately obvious earlier. Writing off the abilities of GPT-4 based on one failed coding test seems premature.
pengo t1_jdtcoly wrote
Reply to comment by artsybashev in [D] GPT4 and coding problems by enryu42
Absolutely nonsensical take.
Haycart t1_jdtcnc5 wrote
Reply to comment by liqui_date_me in [D] GPT4 and coding problems by enryu42
Where are they getting O(1) from? Has some new information been released regarding GPT-4's architecture?
The standard attention mechanism in a transformer decoder (e.g. GPT 1-3) has a time complexity of O(N^2) w.r.t. the combined input and output sequence length. Computing the output autoregressively introduces another factor of N for a total of O(N^3).
There are fast attention variants with lower time complexity, but has there been any indication that GPT-4 actually uses these? And in any case, I'm not aware of any fast attention variant that could be described as having O(1) complexity.
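Spelling out that counting argument for a vanilla decoder, ignoring key/value caching and fast-attention variants:

% Self-attention over a length-t prefix costs O(t^2) per forward pass.
% Naively re-running the forward pass for each of the N generated tokens:
\sum_{t=1}^{N} O(t^2) = O(N^3)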
Chhatrapati_Shivaji t1_jdtmlgm wrote
Reply to comment by addition in [D] GPT4 and coding problems by enryu42
IIRC the current Bing already does this to an extent.