2006-10-31

I've been playing with "Ian's stuff" for the last
two days.  I'm learning by doing, and the example
project I picked is to write Ruby in it.

  To do this, one needs to write:

- The mapping from the Ruby object model to the
  Coke one.  Since Coke's model is malleable and
  its essential part is more or less
  instance-based, the class and module hierarchy
  (and instance-specific behavior) of Ruby should
  map onto it easily.  A method definition in
  Ruby follows some complex rules and allows
  variable arguments.  There should be some
  indirection so that such a method can be
  represented as a set of Coke functions.

- The grammar definition.  Alex has been working
  on a Meta-II like grammar processor (written in
  itself on top of Coke).

  Unfortunately, the latter is not readily usable
on the platforms I use.  So, I'll do the former
first for the time being.

  At a glance, Coke syntax may look a bit like
Lisp.  But there is one fundamental difference:
"expressions are not cons pairs, but Expression
objects."  Expression is a subclass of IdSt Array,
which is a fixed-size collection of other objects.
For example, if you write '(1 2 3) in Coke, it is
an (Array-like) Expression object whose length is
3.  In other words, the Lispy-looking "()-world"
is already contaminated by objects in a
fundamental way.

BTW, an expert says: "The most accurate thing you
could say is that Coke expressions are parse tree
nodes and the semantics is that of C."
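
To make the cons-pair versus Expression contrast
concrete, here is a small model in Python.  Python
is only my illustration language here, not part of
idst, and the names Cons, cons_list and cons_length
are made up for this sketch.  A Lisp-style list of
three elements is a chain of three pairs, while the
Coke-style Expression is modelled as one fixed-size
sequence of length 3.

```python
# Illustrative model only: none of these names exist in idst or Coke.
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Cons:
    """A Lisp-style pair: one element plus a link to the rest of the list."""
    car: Any
    cdr: Optional["Cons"]


def cons_list(*items):
    """Build a chain of cons pairs from the given items."""
    head = None
    for item in reversed(items):
        head = Cons(item, head)
    return head


def cons_length(cell):
    """Walk the chain to count elements; a cons list stores no length."""
    n = 0
    while cell is not None:
        n += 1
        cell = cell.cdr
    return n


# Lisp view of '(1 2 3): three linked pairs.
lisp_view = cons_list(1, 2, 3)

# Coke view of '(1 2 3), modelled as one fixed-size, array-like object.
coke_view = (1, 2, 3)

print(cons_length(lisp_view))  # 3, found by following the cdr links
print(len(coke_view))          # 3, a property of the array itself
print(coke_view[0])            # 1, plain indexed access
```

The point of the sketch is only the shape of the
data: the Expression knows its own size and is
indexed like an Array, which is why it answers
Array messages later in this entry.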

  There are also a few things to keep in mind:

- The primary data that Coke manipulates is
  "primitive" data.
- A quotation (more or less) gives you a simple way
  to represent a simple object.
- With a pair of square brackets, you can send a
  message to an object.

  These are simple in essence, but can really
confuse you in different ways.  One is that a
SmallInteger object has a tagged representation,
but the interactive shell prints it as the
underlying primitive number.

  Below, I'll go through some executable examples.
I use jolt from the idst-5.7 distribution.  I'd
recommend downloading the source and building it
yourself; as you go, you'll want to modify the
source of jolt as well.  You should always give
"boot.k" as the first command line argument to the
jolt executable (called "main" or "main.exe"), and
"-" to say that you are using it interactively:

---------
bash$ ./main boot.k -
---------

It should show some message and a prompt (".").

  Let us type "3" (the single character three,
without the double quotation marks) into the
interactive shell.  It means 3 in the primitive
data world, so the interactive shell returns 3:

---------
.3
 => 3
---------

  On the other hand, you can type "'3" to mean 3 in
the objects' world.  Note that the eval and print
parts of the interactive shell (its
read-eval-print loop) don't know that it is an
object, so the shell prints the raw tagged word,
"7" (3<<1|1):

---------
.'3
 => 7
---------

Similarly, a string enclosed in double quotation
marks represents a primitive string (like a C
string), but prefixing it with a single quote
makes it an object.

---------
."abc" ; => C string
.'"abc" ; => String object
---------

In this shell, integer arithmetic can be
confusing:

---------
.(* 3 3)
 => 9     ; primitive data 3 times primitive data 3.
.(* '3 '3)
 => 49    ; (3<<1|1)*(3<<1|1).
.['3 * '3]
 => 19    ; (3*3)<<1|1
.[3 * 3]
 => 3     ; ((3>>1)*(3>>1))<<1|1
---------

The Lispy function form (prefix "*" at the head of
a ()-list) interprets its arguments as primitive
numbers and multiplies them.  The message form of
"*" interprets the receiver and the argument as
objects and multiplies them.  In both cases, the
printed result is shown in the primitive data
representation.  That is why "[3 * 3]" appears to
work: the primitive 3 happens to be the tagged
form of 1.  What happens if you send the "*"
message to other primitive data, such as 2?  It
fails rather horribly:

---------
.[2 * 2]
     13 [main] main 308 _cygtls::handle_exceptions: Exception: STATUS_ACCESS_VIOLATION
    560 [main] main 308 open_stackdumpfile: Dumping stack trace to main.exe.stackdump
---------

It crashes because the object version of multiply
cannot accept primitive data.  Booleans and
Characters in IdSt are mapped to the Coke world
simply: any non-zero value is true and zero is
false, and Characters are integers.  (There seems
to be no notation for an IdSt Character object in
Coke.)
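
  Going back to the arithmetic examples, the numbers
the shell printed can be reproduced with a small
Python model of the tagging scheme.  This is my own
sketch, not idst code: the function names are made
up, and the explanation of the "[2 * 2]" crash (a
word with a clear low bit being taken for an object
pointer) is my assumption, not something I checked
in the source.

```python
# Illustrative model of the tagging arithmetic; not idst source code.

def tag(n):
    """Encode primitive integer n as a tagged SmallInteger: (n << 1) | 1."""
    return (n << 1) | 1


def untag(word):
    """Decode a tagged SmallInteger back to a primitive integer."""
    return word >> 1


def is_tagged(word):
    """A word with the low bit set is read as a SmallInteger in this model."""
    return word & 1 == 1


def prim_mul(a, b):
    """Model of (* a b): multiply the raw words, whatever they encode."""
    return a * b


def obj_mul(a, b):
    """Model of [a * b]: untag both words, multiply, and tag the result."""
    if not (is_tagged(a) and is_tagged(b)):
        # Assumption: a word with a clear low bit is taken for an object
        # pointer, which would explain the crash seen with [2 * 2].
        raise ValueError("not a tagged SmallInteger")
    return tag(untag(a) * untag(b))


print(tag(3))                    # 7,  what the shell prints for '3
print(prim_mul(3, 3))            # 9,  (* 3 3)
print(prim_mul(tag(3), tag(3)))  # 49, (* '3 '3)
print(obj_mul(tag(3), tag(3)))   # 19, ['3 * '3]
print(obj_mul(3, 3))             # 3,  [3 * 3]: raw 3 reads as tagged 1
```

In the model, obj_mul(2, 2) raises an error where
the real shell crashed; the model only mirrors the
arithmetic, not the failure mode.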

  Let us continue the experiment.  A good way is
to create your own test program file (let us call
it "tmp.k") and write your stuff into it.  To load
it before the interactive shell starts, you do:

---------
bash$ ./main boot.k tmp.k -
---------

  As an example, let's define a "vector-map"
function in Coke.  The vector-map function accepts
a function and an Array.  It applies the function
to each element of the Array, stores the results
into a new Array, and returns it.  First, you add
a few lines to your test file to "import" the
pre-compiled IdSt objects into the Coke world and
name them accordingly.  You can also define a
utility method "print" for Object:

---------
(define Object (import "Object"))
(define Array (import "Array"))
(define String (import "String"))
(define [Object print] [StdOut nextPutAll: [self printString]])
---------

  Then, define vector-map (it took me quite a
while):

---------
(define vector-map
  (lambda (func list)
     (let ((s [list size])
           (ret [Array new: [list size]])
           (idx '0)
           (tmp 0))
        (while [idx < s]
           (set tmp (func [list at: idx]))
           [ret at: idx put: tmp]
           (set idx [idx + '1]))
        ret)))
---------

Or, one could write:

---------
(define vector-map
  (lambda (func list)
     (let ((s [[list size] _integerValue])
           (ret [Array new: [list size]])
           (idx 0)
           (tmp 0))
        (while (< idx s)
           (set tmp (func [list at: [SmallInteger value_: idx]]))
           [ret at: [SmallInteger value_: idx] put: tmp]
           (set idx (+ idx 1)))
        ret)))
---------

to get the same effect.  In the former, "s" and
"idx" hold objects (for example, "[list size]"
returns a number object), and in the latter, they
hold primitive data.  Notice how their uses in the
"while" loop change from the []-world to the
()-world.  I haven't measured the performance, but
the former is simpler to read.  The thing you are
most likely to forget is the quote in the
increment of "idx".  Once you have written a few
things in the ()-world, switching to the []-world
requires some mental gymnastics, especially since
"idx" already holds an object and so doesn't need
a quote itself.

  The variable "tmp" is unnecessary, so you can
write:

---------
(define vector-map
  (lambda (func list)
     (let ((s [list size])
           (ret [Array new: [list size]])
           (idx '0))
        (while [idx < s]
           [ret at: idx put: (func [list at: idx])]
           (set idx [idx + '1]))
        ret)))
---------

I kept "tmp" in the earlier versions to show that
a naked 0 is a kind of nil (in fact, it is NULL in
the C sense) and is good for initializing a
variable.  Whichever version you define, you can
use it like this:

---------
(define n '(1 2 3))
(define n (vector-map (lambda (x) ['3 * x]) n))
[n print]
---------
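
For readers who don't have jolt at hand, the same
loop can be modelled in Python.  This is only a
sketch of the algorithm: it ignores the split
between tagged objects and primitive data, and the
name vector_map is simply my transliteration.

```python
# Python model of the vector-map loop; it mirrors the structure of the
# Coke version but not its object/primitive distinction.

def vector_map(func, items):
    """Apply func to each element and return the results in a new list."""
    size = len(items)
    ret = [None] * size      # like [Array new: [list size]]
    idx = 0
    while idx < size:
        ret[idx] = func(items[idx])
        idx += 1             # the step where Coke needs [idx + '1]
    return ret


n = [1, 2, 3]
n = vector_map(lambda x: 3 * x, n)
print(n)  # [3, 6, 9]
```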

  Now, since the objects are almost there for you
to manipulate, you might think that the following
would work:

---------
(define n '(1 2 3))
[[n collect: (lambda (x) ['3 * x])] print]
---------

  However, a Coke expert told me that it doesn't
work because "collect:" expects a block from the
IdSt world, whereas Coke's lambda is a function
and not a closure.  For the time being, we have to
put up with the more verbose version.
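
  One way to picture the mismatch is the Python
model below.  It rests on an assumption I have not
verified against the idst source: that "collect:"
sends a "value:"-style message to its argument for
each element, as it does in Smalltalk.  Block,
collect and triple are names invented for the
sketch.  A block is an object that answers that
message; a bare function is not.

```python
# Illustrative model only; Block and collect are hypothetical stand-ins.

class Block:
    """Stand-in for an IdSt block: an object that answers value()."""

    def __init__(self, func):
        self._func = func

    def value(self, arg):
        return self._func(arg)


def collect(items, block):
    """Model of collect: it sends value() to block for every element."""
    return [block.value(item) for item in items]


def triple(x):
    return 3 * x


# Wrapped in a Block, the function is usable by collect.
print(collect([1, 2, 3], Block(triple)))  # [3, 6, 9]

# A bare function does not answer value(), so the send fails.
try:
    collect([1, 2, 3], triple)
except AttributeError as err:
    print("bare function rejected:", type(err).__name__)
```

Python reports the problem politely; I would not
expect the same courtesy from jolt.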

  The reason the above code doesn't work is not
that it mixes ()'s and []'s.  The two can be mixed
freely as long as the result makes sense.  For
example, you can write:

---------
[((lambda (x) ['3 * x]) ['(1 2 3) first]) print]
---------

The lambda takes an object and multiplies it by 3
in the object world.  You take the first element
of the list (an Expression array) and pass it to
the function as an argument.  To the returned
object, you send the "print" message.  Unlike in
Smalltalk or IdSt, you have to put "[]" around
each message send.  For example, you can't write:

---------
['(1 2 3) first print]
---------

to mean the same thing.

  Okay, that is what I've learned so far.  If I
don't get bored, I'll keep going and write more
entries.