This approach of solving a problem by building a low-perplexity path towards the solution reminds me of Grothendieck's approach towards solving complex mathematical problems - you gradually build a theory which eventually makes the problem obvious.
https://ncatlab.org/nlab/show/The+Rising+Sea
> you gradually build a theory which eventually makes the problem obvious.
Which incidentally is how programming in Haskell feels like
what is striking to me is how far reasoning by analogy and generalization can get you. some of the deepest theorems are about relating disparate things by analogy.
First pass on my local deepseekv3.1-Terminus at Q4 answered it correctly. If anything, I think LLMs should write terse code (Q/J/APL/Forth/Prolog/Lisp); tokens are precious. It's insane to waste precious tokens generating Java, JavaScript, and other overly verbose code...
https://pastebin.com/VVT74Rp9
The bigger issue is that LLMs haven't had much training on Q, as there's little publicly available code. I recently had to try and hack some together, and LLMs couldn't string simple pieces of code together.
It’s a bizarre language.
I don't think that's the biggest problem. I think it's the tokenizer: it probably does a poor job with array languages.
Perhaps for array languages LLMs would do a better job running on a q/APL parse tree (produced using tree-sitter?) with the output compressed into the traditional array-language line noise just before display, outside the agentic workflow.
> I think the aesthetic preference for terseness should give way to the preference for LLM accuracy, which may mean more verbose code
From what I understand, the terseness of array languages (Q builds on K) serves a practical purpose: all the code is visible at once, without the reader having to scroll or jump around. When reviewing an LLM's output, this is a quality I'd appreciate.
Perl and line noise also share these properties. Don’t particularly want to read straight binary zip files in a hex editor, though.
Human language has roughly, say, 36% encoding redundancy on purpose. (Or by Darwinian selection so ruthless we might as well call it "purpose".)
> Human language has roughly, say, 36% encoding redundancy on purpose.
The purpose is being understandable by a person of average intellect and no specialized training. Compare with redundancy in math notation, for example.
> The purpose is being understandable by a person of average intellect and no specialized training.
The purpose probably is keeping human speech understandable through the often noise-filled channel of ambient sound. Human speech with no redundancy would have a hard time fighting the noise floor.
Language is often consciously changed and learned, so it is sometimes quite designed.
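A toy way to see what "encoding redundancy" measures (my own sketch, not the source of the ~36% figure, which comes from higher-order models of English): compare the observed per-character entropy of a text against the log2 of its alphabet size.

```python
import math
from collections import Counter

def redundancy(text, alphabet_size=27):
    """Zeroth-order redundancy: 1 - H(observed) / H(max).

    Real estimates of English redundancy also account for context
    (digrams, words), which pushes the figure far higher than this
    single-character approximation.
    """
    counts = Counter(text)
    n = len(text)
    h = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 1 - h / math.log2(alphabet_size)
```

A fully repetitive string has redundancy 1.0; ordinary English text lands somewhere between 0 and 1 even at this crude zeroth order.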
I agree with you, though in the q world people tend to take it to the extreme, like packing a whole function into a single line rather than a single screen. Here's a standard ticker plant script from KX themselves: https://github.com/KxSystems/kdb-tick/blob/master/tick.q I personally find this density makes it harder to read, and when reading it I put it into my text editor and split the semicolon-separated statements onto different lines.

E.g. one challenge I've had was generating a magic square on a single line; for odd sizes only, I wrote: ms:{{[(m;r;c);i]((.[m;(r;c);:;i],:),$[m[s:(r-1)mod n;d:(c+1) mod n:#:[m]];((r+1)mod n;c);(s;d)])}/[((x;x)#0;0;x div 2);1+!:[x*x]]0}; / but I don't think that's helping anyone
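For comparison, a rough multi-line Python rendering of the same odd-order (Siamese) construction the one-liner implements; the method is much easier to follow when each step gets its own line:

```python
def magic_square(n):
    """Odd-order magic square via the Siamese method."""
    assert n % 2 == 1, "odd sizes only"
    m = [[0] * n for _ in range(n)]
    r, c = 0, n // 2                        # start in the middle of the top row
    for i in range(1, n * n + 1):
        m[r][c] = i
        rr, cc = (r - 1) % n, (c + 1) % n   # step up and to the right
        if m[rr][cc]:                       # occupied: step down instead
            r = (r + 1) % n
        else:
            r, c = rr, cc
    return m
```

Every row, column, and diagonal then sums to the magic constant n(n^2+1)/2.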
There's a difference between one line and short/terse/elegant.
generates magic squares of odd size, and the method is much clearer. This isn't even golfed, as the variable names have been left in.

I've been dabbling in programming language design as of late. When trying to decide whether including feature 'X' makes sense or not, with readability being the main focus, I realized some old wisdom:
1 line should do 1 thing - that's something C has established, and I realized that putting conceptually different things on the same line destroys readability very quickly.
For example, if you write some code to check whether the character is in a rectangular area and then turn on a light if so, you can put the bounds-check expressions on the same line and most people will still be able to read the code quickly. But if you also put the resulting action there, readability suffers massively; just try it with some code.
That's why ternary expressions like a = condition ? expr1 : expr2 are kind of controversial. They're not always bad, since they can encode logic about a single thing: "if said character is friendly, the light color should be green, otherwise red" is a good example. But doing error handling in them is not.
I haven't been able to find any research that backs this up (didn't try very hard, though), but I strongly believe this to be true.
A nice thing is that some other principles, like CQRS, can be derived from this, for example CQRS dictates that a function like checkCharacterInAreaThenSetLightState() is bad, and should be split up into checkCharacterInArea() and setLightState()
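A minimal Python sketch of that split, using the comment's hypothetical names (snake_cased): the query answers a question without side effects, the command performs the effect, and the caller composes the two.

```python
def check_character_in_area(char_pos, area):
    """Query: answers a question, changes nothing."""
    (x, y), ((x0, y0), (x1, y1)) = char_pos, area
    return x0 <= x <= x1 and y0 <= y <= y1

def set_light_state(light, on):
    """Command: performs an effect."""
    light["on"] = on

def update_light(light, char_pos, area):
    # The caller composes query and command; the fused
    # check_character_in_area_then_set_light_state helper disappears.
    set_light_state(light, check_character_in_area(char_pos, area))
```

Each piece now does one thing, and the query is independently testable.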
I'd perhaps generalize that to "it's useful to have visual grouping correlate with semantic grouping"; applies to separating words with spaces, sentences with punctuation, paragraphs with newlines, lines of code, UIs, and much more.
An important question for this also is what qualifies for "single thing"; you can look at a "for (int i = 0; i < n; i++) sum += arr[i]" as like 5 or more things (separate parts), 2 things (loop, body), or just one thing ("sum"). What array languages enable is squeezing quite a bit into the space of "single thing" (though I personally don't go as far as the popular k examples of ultra-terseness).
I'd say I'd decompose this into 2 'units'
- iterate through the array arr,
- sum its elements
of course, since C is a low level language, implementation details seep in there but those are the 2 major logical concepts that mentally can be quickly parsed as a single unit each.
Some languages allow reductions on collections, so the actual iteration becomes logically part of that single unit, so it might count as one, but I'd say that's the absolute upper bound of complexity you should stuff into a single line.
The goal here is to make every statement conform to a template so that your brain quickly recognizes it and you can move on, rather than you having to break it apart bit by bit to figure out the goal.
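As a concrete illustration of that upper bound, the C loop from upthread collapses into a single reduction in Python, which is what lets it read as one unit:

```python
arr = [3, 1, 4, 1, 5]

# Explicit form: initialization, iteration, and accumulation
# are all visible as separate parts.
total = 0
for x in arr:
    total += x

# Reduction form: the iteration folds into the single concept "sum".
assert total == sum(arr) == 14
```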
When Q folks try to write C: https://github.com/kparc/ksimple
When EAX and RAX take too long to type.
Representative example:

//!malloc
f(a,y(x+2,WS+=x;c*s=malloc(y);*s++=0;*s++=x;s)) //!< (a)llocate x bytes of memory for a vector of length x plus two extra bytes for preamble, set refcount to 0
//!< and vector length to x in the preamble, and return pointer to the 0'th element of a new vector \see a.h type system
f(_a,WS-=nx;free(sx-2);0) //!< release memory allocated for vector x.
G(m,(u)memcpy((c*)x,(c*)y,f)) //!< (m)ove: x and y are pointers to source and destination, f is number of bytes to be copied from x to y.
//!< \note memcpy(3) assumes that x/y don't overlap in ram, which in k/simple they can't, but \see memmove(3)
//!memory management
f(r_,ax?x:(++rx,x)) //!< increment refcount: if x is an atom, return x. if x is a vector, increment its refcount and return x.
f(_r,ax?x //!< decrement refcount: if x is an atom, return x.
:rx?(--rx,x) //!< if x is a vector and its refcount is greater than 0, decrement it and return x.
:_a(x)) //!< if refcount is 0, release memory occupied by x and return 0.

Reminds me a bit of both the IOCCC and 70s Unix C from before anyone knew how to write C in a comprehensible way. But the above is ostensibly production code and the file was last updated six months ago.

Is there some kind of brain surgery you have to undergo when you accept the q license that damages the part of the brain that perceives beauty?
Lol no. It's just Arthur Whitney style C (with the repo in question being originally written by Whitney).
Whitney kind of set the stage for this but it got adopted informally as a style by the people at IPSA (I. P. Sharp Associates) and spread throughout the industry.
Whitney style C isn't great for everything but it's not bad for interpreter writing and other text/stream touching tasks.
Hey, another language with smileys! Like Haskell, which has (x :) (partial application of a binary operator).
> When I(short proof)=I(long proof), per-token average surprisal must be lower for the long proof than for the short proof. But since surprisal for a single token is simply -log P, that would mean that, on average, the shorter proof is made out of less probable tokens.
This assertion is intuitive, but it isn't true. Per-token entropy of the long proof can be larger if the long proof is not minimal.
For example, consider the "proof" of "list the natural numbers from 1 to 3, newline-delimited." The 'short proof' is:
"1\n2\n3\n" (Newlines escaped because of HN formatting)
Now, consider the alternative instruction to give a "long proof", "list the natural numbers from 1 to 3, newline-delimited using # for comments. Think carefully, and be verbose." Trying this just now with Gemini 2.5-pro (Google AI Studio) gives me:
"# This is the first and smallest natural number in the requested sequence.\n 1
# Following the first, this is the second natural number, representing the concept of a pair.\n 2
# This is the third natural number, concluding the specified range from 1 to 3.\n 3"
We don't have access to Gemini's per-token logits, but repeating the prompt gives different comments so we can conclude that there is 'information' in the irrelevant commentary.
The author's point regains its truth, however, if we consider the space of all possible long proofs. The trivial 'long' proof has higher perplexity than the short proof, but that's because there are so many more possible long proofs of approximately equal value. The shortest possible proof is a sharp minimum, while longer proofs are shallower and 'easier'.
The author also misses a trick with:
> Prompted with “Respond only with code: How do you increment i by 1 in Python?”, I compared the two valid outputs: i += 1 has a perplexity of approximately 38.68, while i = i + 1 has a perplexity of approximately 10.88.
… in that they ignore the equally-valid 'i = 1 + i'.
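The arithmetic behind the quoted claim can be sketched with made-up token probabilities: if two token sequences carry equal total information I = -sum(log p), the longer one must have a lower average surprisal.

```python
import math

# Hypothetical per-token probabilities, chosen so both sequences
# carry exactly the same total information.
short = [0.2] * 3
long_ = [0.2 ** (3 / 9)] * 9

def total_info(probs):
    """Total surprisal (nats) of a token sequence."""
    return sum(-math.log(p) for p in probs)

i_short, i_long = total_info(short), total_info(long_)
assert abs(i_short - i_long) < 1e-9                # equal I(short), I(long)
assert i_long / len(long_) < i_short / len(short)  # lower per-token average
```

Spreading a fixed amount of information over more tokens necessarily makes each token individually more probable, which is the mechanism the article's argument rests on.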
> Let’s start with an example: (2#x)#1,x#0 is code from the official q phrasebook for constructing an x-by-x identity matrix.
Is this... just to be clever? Why not
aka the identity matrix is defined as having ones on the diagonal? Bonus points: AI will understand the code better.

While both versions are O(N^2), your version is slower because of the comparison operation, which affects execution speed. This is suboptimal.
upd: in ngn/k, the situation is the opposite ;-o

Unless the interpreter is capable of pattern-recognizing that whole idiom, it will be less efficient, e.g. having to work with 16-bit integers for x in the range 128..32767, whereas the direct version can construct the array directly (i.e. one byte or bit per element, depending on whether kdb has bit booleans). I can't provide timings for kdb for licensing reasons, but here are Dyalog APL and CBQN doing the same thing, showing the fast version at 3.7x and 10.5x faster respectively: https://dzaima.github.io/paste/#0U1bmUlaOVncM8FGP5VIAg0e9cxV...
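The idiom itself is easy to mirror in NumPy, purely as an illustration of what (2#x)#1,x#0 does: cycle the length-(x+1) pattern "1 followed by x zeros" through x*x slots and reshape, and the ones fall on the diagonal.

```python
import numpy as np

x = 4
pattern = np.r_[1, np.zeros(x, dtype=int)]  # q: 1,x#0
eye = np.resize(pattern, (x, x))            # q: (2#x)# with cyclic reuse
assert (eye == np.eye(x, dtype=int)).all()
```

np.resize repeats the source array cyclically to fill the requested shape, the same "take" semantics that q's # relies on here.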
The vibe I get from q/kdb in general is that its concision has passed the point of increasing clarity through brevity and is now about some kind of weird hazing or machismo thing. I've never seen even numpy's verbosity be an actual impediment to understanding an algorithm, so we're left speculating about social and psychological explanations for why someone would write (2#x)#1,x#0 and think it beautiful.
Some brief notations make sense. Consider, say, einsum: "ij->ji" elegantly expresses a whole operation in a way that exposes the underlying symmetry of the domain. I don't think q's line noise style (or APL for that matter) is similarly exposing any deeper structure.
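For the record, the einsum example in runnable NumPy form: "ij->ji" really is just transpose, and the same index notation scales to richer contractions.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
assert (np.einsum("ij->ji", a) == a.T).all()          # "ij->ji" is transpose

b = np.arange(12).reshape(3, 4)
assert (np.einsum("ij,jk->ik", a, b) == a @ b).all()  # matrix product
```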
It’s…remarkable. Is this meant to be entered on a payphone? I will never be cool enough.
LLMs were created to use the same interface as humans (language/code).
Asking humans to change for the sake of LLMs is an utterly indefensible position. If humans want terse code, your LLM better cope or go home.
Disagree. If some small adjustments to your workflow or expectations enable you to use LLMs to produce good, working, high-quality code much faster than you could otherwise, at some point you should absolutely welcome this, not stubbornly refuse change.
I think there's a mighty big assumption in there.
I see no reason to believe LLMs can write working let alone good or high-quality code, nor that the adjustments to my workflow or expectations will be small. But sure, if such a thing happened, I would probably welcome it.
Meanwhile, there are people who write good and high-quality working code faster than me, and they all write as much as possible on one line with the most bare-bones of text editors, so I will continue to learn from them, rather than the people who say LLMs are helping them. Maybe you should reconsider.
Somehow I don't think writing verbose English to communicate with an LLM is ever going to beat a language purpose-built for its particular niche. Being terse is the point and what makes it so useful. If people wanted to use python with their LLM instead, they have that option.
Do you swing a nailgun?
Use the tool according to how it works, not according to how you think it should work.
Chances are hell is going to freeze over before people start writing verbose q code. Q being less verbose than alternatives is the whole point. Nobody is feeling any pressure to bend over backwards to accommodate the guy who struggles to get by when his LLM can't explain a piece of code to him.
To use your nailgun analogy as an example: Waddling in with your LLM and demanding the q community change is like walking into a clockmaker's workshop with your nailgun and demanding they accommodate your chosen tool.
"But I can't fit my nailgun into these tiny spaces you're making, you should build larger clocks with plenty of space for my nailgun to get a good angle!"
No, we're not going to build larger clocks, but you're free to come back with a tiny automatic screwdriver instead. Alternatively you and your nailgun might feel more at home with the construction company across the street.
I'm pretty sure the time will soon come when nobody is trying to accommodate the personal tastes and preferences of developers anymore; languages and tools will be chosen based on how well LLMs work with them, and the way the LLMs are used with those will be determined again by the traits of the tool, not the preferences of the user. Management won't be in the mood to humor devs who are stuck in their old mindset of writing code themselves.
I could be wrong. Time will tell.
I think that there are a few critical issues that are not being considered:
* LLMs don't understand the syntax of q (or any other programming language).
* LLMs don't understand the semantics of q (or any other programming language).
* Limited training data, as compared to languages like Python or JavaScript.
All of the above contribute to the failure modes when applying LLMs to the generation or "understanding" of source code in any programming language.
> Limited training data, as compared to languages like Python or JavaScript.
I use my own APL to build neural networks. This is probably the correct answer, and in line with my experience as well.
I changed the semantics and definition of a bunch of functions and none of the coding LLMs out there can even approach writing semidecent APL.
I was kind of taken aback by the author's definition of 'terse'. I was expecting a discussion about architecture not about syntax aesthetics.
Personally I don't like short variable names, short function names or overly fancy syntactical shortcuts... But I'm obsessed with minimizing the amount of logic.
I want my codebases to be as minimalist as possible. When I'm coding, I'm discovering the correct lines, not inventing them.
This is why I like using Claude Code on my personal projects. When Claude sees my code, it unlocks a part of its exclusive, ultra-elite, zero-bs code training set. Few can tap into this elite set. Your codebase is the key which can unlock ASI-like performance from your LLMs.
My friend was telling me about all the prompt engineering tricks he knows... And in a typical midwit meme moment, I told him: dude, relax, my codebase basically writes itself now. The coding experience is almost bug-free. It just works the first time.
I told my friend I'd consider letting him code on my codebase if he uses an LLM... And he took me up on the offer... I merged his first major PR directly without comment. It seems even his mediocre Co-pilot was capable of getting his PR to the standard.
I'd bet a lot of people are trying to optimize their codebases for LLMs. I'd be interested to see some examples of your ASI-unlocking codebase in action!