Numbers in J

The next major obstacle standing in the way of our J interpreter, is understanding J’s eccentric type system.

J has six kinds of numbers with rules for type promotion. We denote the types with the letters B I X Q D Z, and they represent booleans, 64-bit integers, arbitrary-precision integers (J calls these extended integers), arbitrary-precision rationals, doubles, and complex doubles. We have listed the types in order of promotion.

My impression is that B only exists so that J can perform certain optimizations, so we’ll combine the B and I types. We define the Jumble type to hold a J number. ("Jumble" starts with J, sounds similar to "number", and is indeed a hodge-podge mix of different things.)

data Jumble = I Integer
            | X Integer
            | Q (Ratio Integer)
            | D Double
            | Z (Double, Double) deriving Eq

Numbers are displayed in a format specific to J. Extended integers and ints look the same. If possible, J suppresses the denominator of rationals, the fractional part of doubles, and the imaginary part of complex numbers, so for each type there are numbers that look like ints.

In Haskell, more sensibly, read is the inverse of show.

jShowInt n
  | n < 0     = '_':jShowInt (-n)
  | otherwise = show n

jShowDouble n
  | n < 0        = '_':jShowDouble (-n)
  | '.' `elem` s = reverse $ dropWhile (`elem` "0.") $ reverse s
  | otherwise    = s
  where s = printf "%.6g" n

instance Show Jumble where
  show (I x) = jShowInt x
  show (X x) = jShowInt x
  show (Q x) = jShowInt (numerator x) ++ (if denominator x == 1
    then "" else 'r':show (denominator x))
  show (D x) = jShowDouble x
  show (Z (x, y)) = jShowDouble x ++ (if y == 0 then "" else 'j':jShowDouble y)

The following are the rules for type promotion:

pro (I x) (I y) = (I x, I y)
pro (X x) (X y) = (X x, X y)
pro (Q x) (Q y) = (Q x, Q y)
pro (D x) (D y) = (D x, D y)
pro (Z x) (Z y) = (Z x, Z y)

pro (I x) (X y) = (X x, X y)
pro (I x) (Q y) = (Q $ x % 1, Q y)
pro (I x) (D y) = (D $ fromIntegral x, D y)
pro (I x) (Z y) = (Z (fromIntegral x, 0), Z y)

pro (X x) (Q y) = (Q $ x % 1, Q y)
pro (X x) (D y) = (D $ fromIntegral x, D y)
pro (X x) (Z y) = (Z (fromIntegral x, 0), Z y)

pro (Q x) (D y) = (D $ fromRational x, D y)
pro (Q x) (Z y) = (Z (fromRational x, 0), Z y)

pro (D x) (Z y) = (Z (x, 0), Z y)

pro x y = pro y x

When numbers of different types meet, the lesser type is promoted to the greater type.

All elements of a J array must have the same type; if one is promoted, then all must be promoted.

Let’s examine multiplication:

maxint = 2^63 - 1
minint = -2^63

checkOverflow z
  | z >= minint && z <= maxint = I z
  | otherwise                  = D $ fromIntegral z

jMul (I x) (I y) = checkOverflow $ x * y
jMul (X x) (X y) = X (x * y)
jMul (Q x) (Q y) = Q (x * y)
jMul (D x) (D y) = D (x * y)
jMul (Z (a, b)) (Z (c, d)) = Z (a*c - b*d, a*d + b*c)
jMul x y = uncurry jMul $ pro x y

With the I type, when the result overflows a 64-bit int, we promote to a double, possibly losing precision. With the X type, we only promote to double if the other argument is a double.

For a number of type I, the motivation appears to be that the internal representation should never need more than 128 bits, come what may. In fact, the exponentiation operator in J pre-emptively converts int types to doubles so even intermediate results are guaranteed to be small, though it does mean numbers one might expect to have type I will actually have type D.

For numbers of type X, the motivation appears to be to preserve precision as long as possible.

On the one hand, this handles overflow gracefully. All too often we discover a fixed-with int is too small long after writing the code. On the other hand, explicit type conversions prevent bugs. Functions like fromIntegral and fromRational are cumbersome but increase reliability.

Trying to be clever automatically can backfire: when we need to know what is happening, we must refer to documentation to understand the rules, then hope we can trace them correctly by hand.

Above, we’ve omitted positive and negative infinity, which J denotes by one and two underscores respectively. J seems to have multiple flavours of each infinity because J arrays must have the same type. We’ll see how far we can get without implementing infinities.

Taken Literally

Strange rules govern J number literals:

   2x 3 4
2 3 4
   2 3 4e0
2 3 4
   2x 3 4e0
|ill-formed number
   9223372036854775807
9223372036854775807
   9223372036854776832
9223372036854775807
   9223372036854776833
9.22337e18
   --9223372036854775808
9223372036854775807
   -_9223372036854775808
9.22337e18

Wading through the documentation, we learn radix notation for J numbers, and a few miscellaneous statements about numbers, but there is insufficient detail to explain the above.

We ignore these oddities. In fact, we’ll start simple and only support 64-bit integer constants.

We don’t serve your type here

There are non-numerical J types: strings, boxes, and gerunds. Boxes can be thought of as pointers. For our purposes, gerunds are boxed strings. In fact, they often behave the same:

   ((+`(<'-')) @. 1)  1 2 3
_1 _2 _3
   ((+`-))     @. 1)  1 2 3
_1 _2 _3

However, somewhat inexplicably:

   (+`-) = (+`(<'-'))
1 0

We’ll just equate gerunds with boxed strings, sacrificing more J compatibility.

Concrete nouns

The above suggests representing J nouns the type Shaped Jumble, where Shaped is our shaped array type we developed before, and Jumble has been extended to accommodate boxes and strings:

data Jumble = I Integer
            | X Integer
            | Q (Ratio Integer)
            | D Double
            | Z (Double, Double)
            | S String
            | Box (Shaped Jumble) deriving Eq

instance Show Jumble where
  show (S x) = x
  show (Box x) = "[" ++ show x ++ "]"
  -- ...

We’re almost ready. We just need to understand reflection in J.


Ben Lynn blynn@cs.stanford.edu 💡