topic proposal: auto-completion #4

asteroidb612 · 2024-03-07T20:22:36Z

Original proposal:

I've been working on a Roc project recreating this toy neural network engine, https://github.com/karpathy/micrograd, and think it might make an interesting chapter.

Revised proposal (see thread below): show how autocompletion works.

asteroidb612 · 2024-03-07T20:24:51Z

One reason this could be a bad chapter idea is that it requires some calculus! But if this is for schools, maybe there's a lot of calculus going on there anyway.

isaacvando · 2024-03-07T23:30:26Z

I'd love to read that chapter!

gvwilson · 2024-03-08T02:14:15Z

My concern isn't with the math requirements, but whether programmers use neural networks when they're programming: most of the other tools are things like editors and linters that crop up regularly when building and deploying code.

asteroidb612 · 2024-03-08T04:03:43Z

A year ago, I would have agreed immediately - I never used any machine learning tools while learning to code. But I'm starting to see them used more, and I believe that ChatGPT was used as a low-reliability but occasionally very helpful tool in making Roc.

Maybe those cases were as obscure as making parsers! But maybe they're commonly useful? I am finding myself using ChatGPT as a faster-than-documentation search for how to use various libraries or frameworks.

Anton-4 · 2024-03-08T11:56:25Z

I use ChatGPT almost every day when working on Roc :) Neural nets also power many autocomplete tools. It's also possible to make the chapter about a tool that connects to e.g. the ChatGPT API to avoid getting into the math too much.

gvwilson · 2024-03-08T13:21:54Z

I think that implementing a neural network would be a lot safer than using an external API - the latter are changing so rapidly right now that the chapter could be out of date as soon as it appears. Does Roc have something like NumPy that you could build the NN computations on? If not, could that be the first chapter, and the NN the second? (The JS and Py versions of the book build row-wise and column-wise dataframes in order to illustrate ideas about interface vs. implementation and using benchmarking to pick which implementation is best—could that work here?)
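
To make the interface-versus-implementation idea concrete, here is a minimal Python sketch of two dataframe layouts behind one set of methods. The class and method names (`RowFrame`, `ColFrame`, `nrow`, `ncol`, `select`) are assumptions made for illustration; they are not taken from the JS or Py versions of the book, and a Roc version would look different.

```python
class RowFrame:
    """Row-wise storage: a list of dicts, one dict per row."""

    def __init__(self, rows):
        self.rows = rows

    def nrow(self):
        return len(self.rows)

    def ncol(self):
        return len(self.rows[0]) if self.rows else 0

    def select(self, *names):
        # Build a new dict for every row, keeping only the requested columns.
        return RowFrame([{n: row[n] for n in names} for row in self.rows])


class ColFrame:
    """Column-wise storage: a dict of equal-length lists, one list per column."""

    def __init__(self, cols):
        self.cols = cols

    def nrow(self):
        return len(next(iter(self.cols.values()))) if self.cols else 0

    def ncol(self):
        return len(self.cols)

    def select(self, *names):
        # Reuse the existing column lists; no per-row work is needed.
        return ColFrame({n: self.cols[n] for n in names})


# The same data in both layouts (hypothetical example values).
by_row = RowFrame([{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}])
by_col = ColFrame({"x": [1.0, 3.0], "y": [2.0, 4.0]})
```

Because both classes answer the same calls, a benchmark (for example with `timeit.timeit`) can time `select` or a row filter on each and let the numbers pick the layout.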

Anton-4 · 2024-03-08T15:01:24Z

> the latter are changing so rapidly right now that the chapter could be out of date as soon as it appears.

Good point. Another option would be to download a pre-trained neural network model from a stable URL and run it locally.

> Does Roc have something like NumPy that you could build the NN computations on?

Not yet. Someone from the Roc Zulip has been experimenting with matrices, but I have not looked at it closely yet.

I think explaining the inner workings of neural nets in depth is not feasible considering the one-hour time limit. Andrej Karpathy, an excellent teacher, spends about 2h30m on it [1], [2]. And that is for a "vanilla neural net", not the more complicated transformer models people actually use for coding assistance.

Making a tool that uses a downloaded neural net seems to have the best trade-offs.

isaacvando · 2024-03-08T22:30:07Z

I would be more interested in reading a chapter that implemented a neural net than one that used a preexisting one. I also don't think it is necessary to fully understand the topic after reading a chapter and I suspect that a worthwhile treatment could still be done in an hour.

Anton-4 · 2024-03-09T10:16:38Z

That's reasonable, we can draft the chapter like that and see how we feel about it then :)

asteroidb612 · 2024-03-09T20:58:12Z

Andrej Karpathy's approach in that micrograd video is exactly what I'd like to present. I would crib his perspective, where we ignore optimizations like linear algebra. I would implement backpropagation on simple networks, like in the video you link @Anton-4. I think we could get it down to an hour if we remove some of the dotlang and Python operator overloading content.

Ideally we could have something useful at the end:

- A network that can do word2vec
- Identify a programming language given a file
- etc.

I think once the backpropagation algorithm is understood, it's easy for us to say "Add lots more data / training time / clever network structure / $$$ and you have ChatGPT."
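
For readers who have not seen the video, here is a minimal Python sketch of scalar backpropagation in the spirit of micrograd. It is an illustration written for this thread, not Karpathy's code: the names `Value`, `add`, `mul`, and `backward` are assumptions, and plain functions stand in for operator overloading, as suggested above.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents
        self.local_backward = lambda: None  # leaves have nothing to propagate


def add(a, b):
    out = Value(a.data + b.data, (a, b))

    def local_backward():
        # d(a + b)/da = 1 and d(a + b)/db = 1
        a.grad += out.grad
        b.grad += out.grad

    out.local_backward = local_backward
    return out


def mul(a, b):
    out = Value(a.data * b.data, (a, b))

    def local_backward():
        # d(a * b)/da = b and d(a * b)/db = a
        a.grad += b.data * out.grad
        b.grad += a.data * out.grad

    out.local_backward = local_backward
    return out


def backward(root):
    """Apply the chain rule from `root` back to every value it depends on."""
    order, seen = [], set()

    def visit(v):
        if id(v) not in seen:
            seen.add(id(v))
            for p in v.parents:
                visit(p)
            order.append(v)  # parents come before children

    visit(root)
    root.grad = 1.0
    for v in reversed(order):  # children before parents
        v.local_backward()


a, b, c = Value(2.0), Value(-3.0), Value(10.0)
d = add(mul(a, b), c)  # d = a * b + c
backward(d)
# d.data is 4.0; a.grad is -3.0 (the value of b); b.grad is 2.0; c.grad is 1.0
```

Gradients are accumulated with `+=` so that a value used more than once (for example `mul(a, a)`) receives the sum of its contributions.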

asteroidb612 · 2024-03-09T21:02:11Z

I think that viewing machine learning through a functional programming lens is enlightening. Your neural network is just a function - we can even write its type signature! But it's a function that we train instead of writing.

I have a hunch that Roc will actually be nice for this kind of thing! My progress was stalled by a lambda set error, but that has just been unblocked.

asteroidb612 · 2024-03-09T21:02:45Z

If someone were building a dataframe chapter, it would be interesting to base this off that. Or maybe we make a third chapter combining the two basic chapters?

gvwilson · 2024-03-09T21:13:53Z

I still think that neural networks don't fit the "tools programmers use to program" theme, but I realize I might just be showing my age :-). I am more certain that there are two chapters here if we want to respect the "teachable in one hour" restriction per chapter:

1. NumPy-in-Roc (NumRoc?), i.e., a linear algebra package. This could be pure Roc or a Roc wrapper around Polars.
2. A neural network built on top of that linalg package.

If y'all agree, let's create a separate ticket for the linear algebra package and see who wants to take it on.

Anton-4 · 2024-03-11T11:07:24Z

NumPy-in-Roc definitely sounds good!

> I still think that neural networks don't fit the "tools programmers use to program" theme

I do agree. A more fitting possibility would be neural-net-based autocomplete, but that seems too large in scope.

gvwilson · 2024-03-11T11:12:32Z

What about a more traditional autocomplete whose completion tree is updated incrementally based on what's currently in scope? I think most programmers rely on that in their editor - is that big enough/interesting enough for a chapter?
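
One way to picture an incrementally updated completion tree is a trie whose nodes count how many in-scope definitions end there. The Python sketch below is an assumption about what such a structure could look like, not a description of any particular editor; `CompletionTrie`, `add`, `remove`, and `complete` are hypothetical names.

```python
class CompletionTrie:
    """A prefix tree of the names currently in scope."""

    def __init__(self):
        self.children = {}
        self.count = 0  # how many in-scope definitions end at this node

    def add(self, name):
        """Call when a name enters scope."""
        node = self
        for ch in name:
            node = node.children.setdefault(ch, CompletionTrie())
        node.count += 1

    def remove(self, name):
        """Call when a name leaves scope; unknown names are ignored."""
        path = [self]
        for ch in name:
            nxt = path[-1].children.get(ch)
            if nxt is None:
                return
            path.append(nxt)
        if path[-1].count == 0:
            return
        path[-1].count -= 1
        # Prune nodes that no longer lead to any name, deepest first.
        for ch, parent, child in zip(
            reversed(name), reversed(path[:-1]), reversed(path[1:])
        ):
            if child.count == 0 and not child.children:
                del parent.children[ch]
            else:
                break

    def complete(self, prefix):
        """Return every in-scope name that starts with `prefix`, sorted."""
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results, stack = [], [(node, prefix)]
        while stack:
            current, word = stack.pop()
            if current.count > 0:
                results.append(word)
            for ch, child in current.children.items():
                stack.append((child, word + ch))
        return sorted(results)


scope = CompletionTrie()
for name in ["total", "toString", "temp"]:
    scope.add(name)
```

Each edit touches only the nodes along one name, so the tree never has to be rebuilt when a definition is added or goes out of scope.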

Anton-4 · 2024-03-11T12:14:34Z

> is that big enough/interesting enough for a chapter?

I think so.

I see two possible approaches:

1. Use Roc as the language to be autocompleted and show how to fetch possible completions using the Roc language server. Language servers are definitely a commonly used and important tool.
2. Use English as the language to be autocompleted and have a much more self-contained example. For example, given the text I typed in this comment, if I were now to type "pos", it would suggest "possible".
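
The self-contained English variant can be sketched in a few lines of Python: collect the words of a text into a sorted vocabulary, then binary-search for the prefix. The function names `build_vocabulary` and `complete` are hypothetical, and treating a "word" as a run of ASCII letters is a simplifying assumption.

```python
import bisect
import re


def build_vocabulary(text):
    """Return the sorted, de-duplicated, lower-cased words of `text`."""
    return sorted(set(re.findall(r"[a-z]+", text.lower())))


def complete(vocabulary, prefix):
    """Return every word in the sorted `vocabulary` that starts with `prefix`."""
    prefix = prefix.lower()
    # All words sharing a prefix are adjacent in sorted order, starting here.
    start = bisect.bisect_left(vocabulary, prefix)
    matches = []
    for word in vocabulary[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches


comment = "Use Roc as the language and show how to fetch possible completions"
vocabulary = build_vocabulary(comment)
suggestions = complete(vocabulary, "pos")  # ["possible"]
```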

gvwilson · 2024-03-11T14:31:47Z

Can you do the latter first to show learners how incremental autocomplete works from the ground up? I think that building a small language server would be a great second chapter, but as a learner, I'd want to know what the magic is before relying on an external service to do it for me. (Cool idea, by the way...)

Anton-4 · 2024-03-11T16:22:40Z

Yeah that could work :)

I personally already have a lot to do with other Roc things, but any available motivated person could probably take on the first chapter of this. Are you interested in working on the second chapter about a tiny language server @faldor20?

faldor20 · 2024-03-11T21:47:17Z

Yeah, I'd be interested in taking that on. I was actually thinking it would be cool to try building a language server framework in Roc on top of tower-lsp, so maybe we could have two examples: one showing Roc wrapped around an existing Rust framework and one showing a pure Roc implementation that just talks over stdio?

The pure Roc one is a much bigger task, so I'd probably try to show only the most basic part: reading and writing JSON-RPC over stdio, handling some basic updates, and responding to one or two language server requests.

I was imagining either I could base everything off the Roc compiler and just kind of hand-wave how the actual calls work, or do something like @Anton-4 suggested and just turn every word in the text into a "symbol" and pretend it's the output of a compiler.

How in-depth would we like to go here? Well-made language servers tend to have a lot of pretty complex state management. They do a lot of caching, incremental updating, and recompilation. How far into the weeds do we want to get? Or should I just keep it as simple as "this is a naive implementation, here is where you could improve it in the real world"?
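
As a rough picture of the "most basic part", the Language Server Protocol frames each JSON-RPC message with a `Content-Length` header, a blank line, and then the JSON body. The Python sketch below reads and writes that framing on any binary stream; `read_message` and `write_message` are hypothetical helper names, and a real server would pass `sys.stdin.buffer` and `sys.stdout.buffer` instead of the in-memory buffer used here.

```python
import io
import json


def read_message(stream):
    """Read one framed message from a binary stream; return None at end of input."""
    length = None
    while True:
        line = stream.readline()
        if not line:
            return None  # the stream was closed
        if line in (b"\r\n", b"\n"):
            break  # a blank line ends the header block
        name, _, value = line.decode("ascii").partition(":")
        if name.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(stream.read(length).decode("utf-8"))


def write_message(stream, payload):
    """Write `payload` as one framed JSON message."""
    body = json.dumps(payload).encode("utf-8")
    stream.write(f"Content-Length: {len(body)}\r\n\r\n".encode("ascii"))
    stream.write(body)
    stream.flush()


# Round trip through an in-memory buffer standing in for stdio.
buffer = io.BytesIO()
write_message(buffer, {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}})
buffer.seek(0)
request = read_message(buffer)
```

The length counts bytes of the encoded body, not characters, which matters as soon as a document contains non-ASCII text.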

gvwilson · 2024-03-11T23:58:15Z

I think it would be a lot more approachable to do the simple version first (here's a vocabulary, autocomplete from it) and then build the language server as a separate chapter - I don't believe both will fit into our one-hour-per-lesson limit, and I think the latter will be more comprehensible after people have seen the former. @faldor20 are you interested in doing the first part?

faldor20 · 2024-03-12T04:30:02Z

I'm honestly unsure what you imagine the first part to look like?

I'm not sure it makes sense to implement any kind of autocomplete system with no foundation to actually use it in. I would argue the only really hard part of autocomplete for plain text is dealing with the document updates and sending info out of the language server.

Implementing autocomplete as I imagine it is basically just a fuzzy search algorithm and a super simple parser that finds all the words in a document. In fact, in roc-ls we don't even have fuzzy autocomplete yet 😅
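
"Fuzzy search" can mean several things; one simple reading is subsequence matching, where the typed letters must appear in order in a candidate and candidates with fewer skipped letters rank first. The Python sketch below assumes that reading. The scoring rule and the names `fuzzy_score` and `fuzzy_complete` are illustrative choices, not what roc-ls or any other language server does.

```python
def fuzzy_score(query, candidate):
    """Return the number of skipped letters if `query` is a subsequence of
    `candidate` (matched greedily, left to right), or None if it is not."""
    query, lowered = query.lower(), candidate.lower()
    position = 0
    skipped = 0
    for ch in query:
        found = lowered.find(ch, position)
        if found == -1:
            return None
        skipped += found - position
        position = found + 1
    return skipped


def fuzzy_complete(query, vocabulary):
    """Return the matching words, best (fewest skipped letters) first."""
    scored = []
    for word in vocabulary:
        score = fuzzy_score(query, word)
        if score is not None:
            scored.append((score, word))
    return [word for _, word in sorted(scored)]


words = ["possible", "completions", "approaches", "parser"]
ranked = fuzzy_complete("pos", words)
```

With this rule an exact prefix scores 0, so ordinary prefix completion falls out as the best-ranked special case.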

But I think maybe I'm misunderstanding what you were imagining.

faldor20 · 2024-03-12T04:33:08Z

Oh, and I tried a quick mock-up of JSON-RPC parsing and realised Roc is unfortunately currently unable to parse JSON that contains unions (types like id: number | string), which makes implementing LSP in Roc impossible right now :(
(See this Zulip thread on JSON null handling.)

faldor20 · 2024-03-12T14:10:38Z

Okay, I went off and worked on my knowledge of abilities and decoders, and it is actually possible. I take it all back; I'll get my implementation done soon.

gvwilson · 2024-03-12T15:05:59Z

Thanks @faldor20 - can you please create a subdirectory under the project root called completion and put your work there, along with an index.md file with notes to yourself? Cheers - Greg

gvwilson added the discuss, in-content, and propose-addition labels Mar 8, 2024

gvwilson added this to the topic-outline milestone Mar 8, 2024

gvwilson changed the title ~~Neural Network Chapter~~ topic proposal: neural network Mar 8, 2024

gvwilson mentioned this issue Mar 10, 2024

topic: SVG rendering #19

Open

hristog mentioned this issue Mar 11, 2024

topic proposal: machine learning from first principles #24

Closed

gvwilson changed the title ~~topic proposal: neural network~~ topic proposal: auto-completion Mar 11, 2024

gvwilson added the help-wanted label Mar 11, 2024

gvwilson assigned faldor20 Mar 12, 2024

gvwilson added the assigned label and removed the help-wanted, discuss, and propose-addition labels Mar 12, 2024
