APL and J
Years later, while studying Lisp, I ran across a quote by Joel Moses: “APL is like a beautiful diamond - flawless, beautifully symmetrical. But you can’t add anything to it. If you try to glue on another diamond, you don’t get a bigger diamond. Lisp is like a ball of mud. Add more and it’s still a ball of mud - it still looks like Lisp.” I was intrigued, but I was swept up in the object-oriented craze and postponed my treasure hunt.
Perhaps it’s just as well. The internet was much smaller back then; tracking down APL documentation was trickier. Today, it is but one search away. I sought after the legendary language, but my focus quickly shifted to J, the successor to APL. Designed by Ken Iverson and Roger Hui, J improves on APL; notably, J uses ASCII instead of strange symbols.
I was impressed. APL and J indeed glitter like diamonds. Brilliant ideas sparkle thoughout the language. I feel wiser from having experimented with them.
As one would expect, Dijkstra was less enamoured: “APL is a mistake, carried through to perfection. It is the language of the future for the programming techniques of the past: it creates a new generation of coding bums.” However, compared to his observations on other programming languages, this is faint praise!
I wrote an interpreter for a tiny subset of J. Already insignificant, my code grew even more useless several months later, when the J source was released under the GPL.
Small is beautiful
APL is the Classical Chinese of programming languages. Words consisting of several characters in other languages are instead written with a symbol or two. These symbols can be overloaded. Meaning is often inferred.
Thus APL and J snippets possess the aesthetic appeal of a classical Chinese poem. Deep ideas can be conveyed by short lines of symbols precisely selected by a talented artist. See Michael Gertelman’s APL one-liner for Conway’s Game of Life.
On the other hand, the extreme terseness and quirky rules, though stylish, can hinder comprehension. Ultimately, for a language, rapid exchange of ideas is more important than looking good.
Lengthy prose written in Classical Chinese, though minimalist and beautiful, is hard to read. While there are many great classical Chinese poems, there are only Four Great Classical Novels. Similarly, I suspect J programs become increasingly obfuscated as they grow. Perhaps manageable large J projects are as rare as the Four!
In J, the percent sign represents taking the reciprocal when acting as a unary operator (monadic verb):
When given an array of numbers, J automatically applies the verb to each element of the array. Space-separated numbers represent an array. Thus "%3 1 4 1 5" elicits "0.333333 1 0.25 1 0.2".
When there are arrays on either side of a binary operator, they must match in length, in which case J applies the operation on corresponding components of each array to produce the output array: "1 2 3 + 4 5 6" is "5 7 9" (vector addition).
Hence array manipulations can be remarkably compact. For the same task, many languages typically step through an array with an index or pointer in a loop. With J, we can do the same with one character.
J ingeniously generalizes this scheme to arrays of higher dimensions. Each verb possesses a rank, which describes whether it works with, say, cells at a time rather than tables or rows at a time. A few logical rules determine how to make sense of the verb when one or both inputs has a higher rank than the verb rank.
These rules were shrewdly chosen. They naturally extend the reach of each verb. A few characters can describe operations on multidimensional arrays of varying sizes. J hides the nested loops and other overheads.
Like function arguments in C, the order in which the array elements are operated upon is not defined, so a J compiler could parallelize array operations.
On the other hand, invisiblity can be incomprehensible. The implied “for all” reminds me of Einstein notation, which I mostly avoid because I worry about confusing my audience, and more importantly, confusing myself. See Don Knuth’s “Notation” lecture in his Computer Musings series (he also talks about Ken Iverson in one segment). I suppose J junkies build up tolerance.
J might overly encourage programmers to use arrays when another data structure would be a better fit.
In many languages, "-" means subtraction when it is a binary operator (often written infix), and negation when unary. This is true of all J verbs: they have one meaning as a binary infix (dyadic) operation, and another as a unary (monadic) operation.
For example, "|" performs a modulo operation when both left and right arguments are present, but it returns the absolute value when only one operand is given.
Apart from decreased code size, overloading makes puzzling out a J one-liner even more intellectually stimulating. However, we sacrifice too much clarity. For each verb, we must first determine whether if it is monadic or dyadic before we can discern its meaning.
I shun function overloading in C++. One is courting obfuscation when a function’s meaning depends on the type and number of the inputs and outputs.
Running on empty
For example, adding 5 to a nonempty string is a domain error because the types mismatch. Yet adding 5 to an empty string results in an empty numeric array.
I wonder why J was designed this way. For the sake of mathematical tidiness? It all seems unnecessarily messy. In fact, the interpreter itself deviates from the rules in special cases, suggesting this aspect of J is flawed.
Hooks and forks
We only describe the monadic variants of J’s hooks and forks. The dyadic variants are similar.
Two verbs in a row constitute a hook, which means we apply the second verb to the input, then apply the first verb to the input and this output: hook(f, g)(y) = f(y, g(y)). For example, "(^%) 3" computes 31/3.
Without the parentheses we have ordinary function composition: J first applies the unary "%" operator, then the unary "^" operator (which means the exponential function), so "^%3" is e1/3.
Three verbs in a row constitute a fork, which means we apply the first verb to the given input, as well as the last verb (possibly in parallel), then apply the middle verb to the two results: fork(f,g,h) (y) = g(f(y),h(y)).
(+/ % #) 3 1 4 1 5 2.8
computes the average of the listed five numbers. The folded addition ("+/") and the tally operator ("#") are applied to the input, and afterwards, division ("%") is performed on the two outputs.
Aside from adding to the fun of reading and writing J, implied hooks and forks makes some programs succint because they often occur naturally. Additionally, we could parallelize the tines of the fork, or even within a tine (an associative verb used with the insert adverb can be parallelized).
Other languages might benefit from forks and hooks. "Real World Haskell" talks about optimization, and presents a program that computes the mean of a list of numbers as an example. Although the root cause of bad performance is lazy evaluation, maybe a standard hook operator would make optimization easier.
For clarity, I would prefer visible hooks and forks.
Simply add "/" to a verb to right-fold it over a list: "+/1 2 3" is "6".
J returns the mathematically correct interpretation for standard verbs with empty inputs when possible, such as 0 for folded additions, and 1 for folded multiplications.
Though convenient, empty folds seem unavailable for user-defined verbs, which troubles me. Either allow defining the result of an empty fold, or forbid it for every verb. All or nothing.
The obverse of a verb is a curious language feature: when practical, "v^:_1" reverses the action of a built-in verb "v". For example, "*:" is the square function, so its obverse "*:^:_1" is "%:", the square root function. J automatically determines the obverse of cases such as the composition of two verbs. One can define obverses for any verb, and even override existing obverses.
The conjunction "&." provides terse notation for computing f-1(g(f(x)) where f and g are functions. For example, once upon a time, slide rules were used to multiply numbers. Strips marked with logarithmic scales meant products could be computed simply by sliding one strip against another.
Since the monadic "^." is the natural log in J, we can write "x +&.\^. y" to simulate computing xy on a slide rule.
J assignments defy intuition. To facilitate recursion and other modern conveniences, unidentified identifiers are assumed to be verbs of infinite rank and lazily evaluated, while other parts of speech are strictly evaluated.
For example, below, reassigning the noun "n" has no effect on "u", but reassigning "v" does:
n =: 42 u =: n&v v =: + u 1 43 n =: 0 u 1 43 v =: - u 1 41
Unfortunately there are exceptions. If a verb is used on the right side of a rank conjunction, its ranks are strictly evaluated:
u =: +"v u b. 0 _ _ _ v =: + u b. 0 _ _ _ u =: +"v u b. 0 0 0 0
Assignments such as "avg =: +/%#" are not text substitutions. Rather, "avg 6 8 9" returns "(+/%\#)6 8 9", and not "+/%#6 8 9"; a new closure representing the fork was created during the assignment, and it is this closure that is invoked in subsequent expressions.
What if you want to assign the composition of verbs, and not create a hook or fork? Use the "@" or "@:" conjunctions, or cap a fork with "[:".
Once I was working on a C library which involved finite fields. I wanted to compose two maps, so I could take an element from one field to another via a third field. For the first time I wondered whether tacit programming was possible: I’d already described how the individual maps worked, so why did I have to write so much more code to compose them?
Backus explored function-level programming in a provocative paper long ago, and J incorporated these ideas. We’ve already seen the classic example in J:
avg =: +/ % # avg 1 2 3 4 5 6 3.5
Such elegance is unattainable in C.
J constructions allow many functions to take the visually pleasing form of an unbroken train of verbs. For instance, the sum of the first odd 42 integers is "([: +/ 1: + 2: * i.) 42".
However, overloading juxtaposition impairs comprehension. A principal designer of J has expressed regret about its hook notation.
Worse still, tacit hooks, forks, and ranks obscures the beautiful category theory connections: hooks and forks are function composition and application within the Reader monad, while adjusting the rank of a function corresponds to applying functors.
J is mostly right-associative: NVNVN = NV(NVN), VVVVV = VV(VVV), but left associative with conjunctions: VCVCV = (VCV)CV.
Why were these rules chosen? Perhaps taking the opposite direction for conjunctions and adverbs (which are postfix, the opposite of monadic verbs) helps remind us they are different to verbs.
Although implicit casting reduces code size, I still think the costs outweigh the benefits. At least J conversions are mostly sensible upgrades.
Gerunds are like C function pointers, but also allow reflection because instead of pointing to the memory address of some code, they use function names.
Unfortunately, there is a nagging inconsistency that still vexes me. Supposedly, gerunds may be produced directly by boxing. For example, "mean`median = 'mean';'median'" returns "1 1". But for primitive verbs, the results differ from an equivalent gerund produced by the tie conjunction: "+`- = '+';'-'" returns "0 0".
Why is it important to be able to distinguish between the two cases for primitive verbs? And if it is so important, how do we tell if a given gerund was created via direct boxing or a tie?
The original precursor to the current J interpreter is written in C. It is praised for its “organization and programming style”, despite employing techniques popular in the International Obfuscated C Code Contest! This code snippet gives us a window into the mind of a J designer, and suggests J’s brevity may be too much of a good thing.