Babel
言葉の混乱を象徴するタイトルの映画は、2時間半近くという長尺でありかつ単純なサスペンスなどはないにもかかわらず、大胆な表現によってあきさせない佳作であった。
複数の言語を使う場面が切り替わり、しかも耳の聞こえない少女達の「言語」もある。なぜ第4の舞台が日本であり日本語なのかという疑問を持ったりもしたが、一般的な聴衆にとってまったく見当もつかない言葉を話す先進国である、という必然性はあるかもしれない。
東京のマンションの場面はどうやらDanさんのおうちだったようだが(http://blog.livedoor.jp/dankogai/archives/50204795.html)、途中に出てくる噴水のある公園は、俺が高校時代チャリ通(禁止だったけど)していた道の傍らにあるんだよな。現地に行き、観衆のステレオタイプに迎合しないで現地の景色をそのまま切り取り、一般にありそうな人々の行動を切り取っているのも(当たり前のはずなのだが)新鮮に感じるところなのだろう。(新宿の近くじゃないけど)
ただ、審判にあれだけ食ってかかるような高校があったらもっと問題になると思うし、モロッコの警察がいきなり発砲するのは怖かった。
バレーボールの技術指導は陽子・ゼッターランドとなっていた。
映画館は、HollywoodのArcLight Domeというところ。最初にMCをする係員がいたりする映画館だった。ひとり$14.00は日本並み。AFIが進行中ということもあってなかなか混んでいた。
Pepsi and Coke (No. 5)
I fetched the "head" tree from the SVN repository of idst and tried to make it run on my Windows machine (again). I said "I tried" but the way I did was just to complain to Ian in person. The issue turned out that some macros (such as _B, _W, _L, and _X) defined and used by jolt name-crash with the cygwin gcc compiler's definitions. So, I just undef them. (That was what Ian told me to do). There was still a few more smaller issues, but I managed to compile jolt from the head.
The JavaScript implementation comes with it still doesn't work, but I think I should switch to this version anyway. So, things I wiil write on this blog are based on that version.
I use Meadow (Emacs) on Windows for doing stuff, but until I started playing with Coke, I was just happy with the Mingw32 binaries. I did have cygwin installed but rarely used for anything. In the light of Coke, I set up a bit better environment for the combination of Meadow, Mingw32, and Cygwin. A few small modification to the jolt code let me use jolt and idc under cmuscheme.el and a bash shell running in the shell-mode. (Ian and Alex don't run these interactive stuff under Emacs. That bothers me, and also the fact that they are productive without the support from these great tools bothers me, too.)
Anyway, now it's time to talk about defining new syntax and the Compiler. The syntax thing is somewhat similar to some macro systems in Lispy languages, and there is a section on it in http://www.piumarta.com/pepsi/coke.html. So, I'm not going into the detail, but just jump to an example.
For what I would like to do with Ruby, I need to handle a variable number of arguments for a method. However, the Coke at the language level doesn't support such var-args well. (The built-in "and", "or" and some other operators handle var-args, so it is clearly possible under that level. I'll talk about it later.)
To see if I can do such a thing, I tried to implement the "max" function. The function gets one or more arguments and returns the biggest value. Well, I made some (a lot of) unsatisfying attempts and ended up with asking Ian. One of my wrong attempts was like this:
(define [Object evalWith: compiler]
[SmallInteger value_: [[compiler compile: self] value]])
(define max1
(lambda (node compiler)
(let ((s [node size])
(idx '0)
(m [[node at: idx] evalWith: compiler]))
(set idx [idx + '1])
(while [idx < s]
(let ((c [[node at: idx] evalWith: compiler]))
(if [m < c]
(set m c)))
(set idx [idx + '1]))
m)))
(syntax max ; (max ...)
(lambda (node compiler)
(let ((n (max1 [node copyFrom: '1] compiler)))
n)))Here, the syntax of "max" doesn't do too much but just passes the arguments to a helper-function called "max1". The first argument passed to max1 is an Expression object (again, it is like an Array). The size of the arguments to "max1" is one fewer than to "max", as the first node is the Symbol "#max" itself. In "max1", the nodes (bound to "node") are evaluated in a brute force (and wrong) way with the compiler and compared to each other to get the answer. This version lets you do stuff like:
.(max '1 '2 '3 '4 11) => 11
or if you have a global variable:
.(define x '5) .(max '1 '2 '3 '4 x) => 11
So you can mix the quoted and not-quoted numbers, and use varaibles. However, it doesn't work a case like:
.((lambda (y) (max '1 '2 '3 '4 y)) '5) undefined: y
My assumption when I was writing this was that the "compiler" argument in the syntax definition knows about the static nature of the environment. In other words, syntax expansion is something that happens at compile time, but variable binding happens at runtime. In the last example, the Compiler that is expanding the "max" form is in the global environment (i.e., its "environment" instance variable is GlobalEnvironment). So, even if you try to evaluate "y" in the environment, it is not defined yet.
Ian's version that does the right thing is like this:
(define make-max
(lambda (exprs)
(let ((idx '0)
(lim [exprs size])
(body [WriteStream on: [Expression new: lim]]))
(while (< idx lim)
[body nextPut: `(if (< val (set tmp ,[exprs at: idx])) (set val tmp))]
(set idx (+ idx 2)))
[body contents])))
(syntax max
(lambda (node compiler)
`(let ((val ,[node second])
(tmp 0))
,@(make-max [node copyFrom: '2])
val)))It also has got the syntax definition of "max" and a helper-function "make-max". The difference is that the helper-function, "make-max" returns an Expression (instead of the value) and "max" embeds the returned Expression into another enclosing Expression and returns it. To see what really happens, you can evaluate to display all the syntax expansion:
[Compiler parseOption: '"-vs"]
After evaluating, you can evaluate:
.(max '1 '2 '3)
and you get:
SYNTAX -> (#max (#quote 1) (#quote 2) (#quote 3)) REWRITE => (#let ((#val (#quote 1)) (#tmp 0)) (#if (#< #val (#set #tmp (#quote 2))) (#set #val #tmp)) (#if (#< #val (#set #tmp (#quote 3))) (#set #val #tmp)) #val) BUILTIN -> (#quote 1) REWRITE => nil BUILTIN -> (#quote 2) REWRITE => nil BUILTIN -> (#set #tmp (#quote 2)) REWRITE => nil BUILTIN -> (#< #val (#set #tmp (#quote 2))) REWRITE => nil BUILTIN -> (#set #val #tmp) REWRITE => nil BUILTIN -> (#if (#< #val (#set #tmp (#quote 2))) (#set #val #tmp)) REWRITE => nil BUILTIN -> (#quote 3) REWRITE => nil BUILTIN -> (#set #tmp (#quote 3)) REWRITE => nil BUILTIN -> (#< #val (#set #tmp (#quote 3))) REWRITE => nil BUILTIN -> (#set #val #tmp) REWRITE => nil BUILTIN -> (#if (#< #val (#set #tmp (#quote 3))) (#set #val #tmp)) REWRITE => nil BUILTIN -> (#let ((#val (#quote 1)) (#tmp 0)) (#if (#< #val (#set #tmp (#quote 2))) (#set #val #tmp)) (#if (#< #val (#set #tmp (#quote 3))) (#set #val #tmp)) #val) REWRITE => nil => 7
The chain of "if" and assignment is generated at the compile time and it is evaluated right away so that you get 7 as the result. Experimenting a syntax definition at the interactive shell's top-level was confusing to me. The syntax expansion is a complie-time behavior, but it is also evaluated right away so that you might get (as I did) many wrong ideas. You might also feel that the execution would be inefficient, because a lot of rewriting has to be done when you type. But, this is only because you are in the interactive shell. If you define a function that uses "max", the expansion is done once, and resulting (fixed length of) "if" chain is embedded into the calling function just once. (I didn't have to use the Compiler explicitly.
Also, you have to be fully aware what a quoted number (like '3) is. When such a number (like (#quote 3)) is evaluated (after reading), an object of 3 is created. But if you try to deal with it at the syntax expansion time in some way, (as I did in my other attempts), you'll never get it right.
A fun mistake that probably you will do some number of times is that you get into an infinite loop of syntax expansion. If you define something like:
(syntax foo
(lambda (node compiler)
`(let ((first [,node first]))
...The syntax expander embeds the original form including the leading "foo" in place of ",node" and re-evaluate it.
Another note is that Ian uses primitive value 2 as an increment to get the next SmallInteger very often. For example, there is a syntax of "for" defined, and a version of "vector-map" Ian suggested was basically like this:
(define vector-map
(lambda (func list)
(let ((result [list new: [list size]]))
(for (index '0 2 [[list size]])
[result at: index put: (func [list at: index])])
result)))Here, the initial value for the "index" variable is "'0", but the step is "2". It is indeed correct that "(== (+ '0 2) '1)" is true, but I can't help feeling it as a "bad know-how".
The Compiler compiles Expressions, and also numbers, strings, and symbols into an abstract stack machine instructions. The stack machine architecture is called "VPU". The predefined forms such as "while", "if", "and", and function calls have pre-defined (but as changeable as other things) translation rules. Let's take a look at the definitions of SmallInteger>>compile:, Compiler>>integer:, and Compiler>>form:,
SmallInteger compile: compiler [ ^compiler integer: self ]
Compiler integer: anInteger
[
vpu ldInt: anInteger.
^nil
]
Compiler form: aForm
[
| head syntax |
aForm isEmpty ifTrue: [self syntaxError: aForm].
head := aForm first.
(syntax := environment lookupSyntax: head) ~~ nil ifTrue: [^self recompile: (self applySyntax: syntax with: aForm)].
(syntax := SyntaxTable at: head ifAbsent: []) ~~ nil ifTrue: [^self recompile: (self performSyntax: syntax with: aForm)].
"apply"
aForm reverseDo: [:arg | arg compile: self].
vpu iCalli: aForm size - 1.
^nil
]
If you call "compile:" on a SmallInteger, it eventually pushes itself onto the VPU's arithmetic stack by "ldInt:" instruction. Compiler>>form: deals with a full S-expression. The first part of it has something to do with the syntax. The buttom part of it, simply says that the arguments (assuming all of them are integers) are pushed onto the arithmetic stack, and a call instruction (iCallI:)is executed. It is simple. A sequence of VPU instructions is translated into native code and executed. It is pretty efficient.
Another experiment with the Compiler is something like this. You define a function by using compile:
.[[Compiler compile: '(lambda (x) (+ x 3))] print] VPULabel => 9584804
A VPULabel is an object that represents a label (duh). You can send "value" to obtain the label, and simply call it by:
([[Compiler compile: '(lambda (x) (+ x 3))] value] 5) => 8
This means that the function call in Coke is relly close to what a native function call does.
You could imagine to take advantage of this compiler to write a more efficient version of "max". If you look at the implementation of "and" ("xLogAnd:"), it is really simple. (It definitely reminds me of an undergrad assignment for which I needed to write a PL/O compiler and bytecode interpreter.) What it does is very similar to "max" after all. (And Alex has a great idea about doing the complex Ruby argument handling at that level, by adding a VPU instruction. It will make it run really fast.)
I wrote 4 long articles and 1 short one. While the way you have to write code, test, and debug is painful for an average hacker like me, I also feel great potential of the system. Writing these articles helped me a lot, and I hope some readers think they are of some use.
Pepsi and Coke (No. 4)
I thought I was going to write about the "syntax" thing, but I think it is worthwhile to clarify myself on objects and the execution system first.
So, objects. After spending a few days on Jolt Coke, I went back and re-read the page at http://www.piumarta.com/pepsi/pepsi.html and http://www.piumarta.com/pepsi/coke.html (read the documents; what a concept). And, I found there are a lot of stuff I ignored first^^; Anway, here is what are important to get the feel for it.
- Objects in the Jolt/Coke system is based on the "Id" Object model.
- There is a "hard-wired" language (default language) called "IdSt" on this model.
- The Id object model is "a kind of" prototype-based model. But, there is a concept of a "family" of objects. The objects in a family have shared behavior. The shared behavior is provided by a "vtable" object.
- In IdSt, there is a default tree of families of objects that looks like an inheritance hierarchy of a typical OO language.
- For IdSt, there is a static compiler that can create an executable binary.
- The Jolt system (the interactive shell) is written in IdSt and compiled to be an executable binary.
- In the Jolt binary, IdSt objects are "there for your service". From the "above" (i.e, the code running on the interactive shell), you can have access to these objects. The stuff you write in []'s are these objects and messages to them.
- You can also modify the behavior of these objects at compile time and runtime. You can go down pretty deep. (And, Jolt is just a prototype. We are supposed to be able to go down even deeper in the future systems.)
To understand the Id object model (1), and the IdSt language (2), you should just read the page, read the code and try by yourself.
The concept of family of objects (3) is a bit different from Smalltalk's class hierarchy. If you think about a class-instance model, there is an object that is bound to an globally accessible identifier, and it holds the method dictionary that dictates the behavior of guys in the family. In IdSt, there is globally accessible one in the family via an identifier, but it is just like other ones in the family. In the other words, there are bunch of objects that share the same vtable, and only one happens to have global name. (That one is the prototype of the family.) For example, there is a code snippet from "main.st" that shows how the Singleton pattern is implemented:
Options : Object ( verbose )
Options verbose [ ^verbose ]
Options new
[
self := super new.
verbose := false.
]
[ Options := Options new ]The first line defines a new family called "Options" that is a subclass, err, a family, that delegates to "Object". An object in the family will have an instance variable called "verbose". (And, the first line actually create the prototype object in the family.) The second line defines a simple getter for the instance variable. The next definition of "new" defines what to do when a new instance is being created. "self := super new" may look strange, but it simply substitutes the default receiver for the rest of method. "verbose := false" sets the instance variable of the newly created instance to false. (Note from Göran: And since this method is meant to return a new instance it will do so, since there is an implicit return of self at the end, which now refers to the new instance instead of the prototype Options instance.)
Then, the last line is executed. The line creates a new instance and replaces the original prototype with it. (It is not exactly a singleton, as you can substitute the prototype, but anyway...)
And, do you remember my "vector-map" example? Like #collect: in Smalltalk, I wanted the result returned from "vector-map" to be the same kind as the "list" argument's, but I was using the definition below that always returns an Array, no matter what kind of collection is given.
(define vector-map
(lambda (func list)
(let ((s [list size])
(ret [Array new: [list size]])
(idx '0))
(while [idx < s]
[ret at: idx put: (func [list at: idx])]
(set idx [idx + '1]))
ret)))The right code should use "[list new: [list size]]", instead of "[Array new: [list size]]". The instances share the same vtable, and know how to do prototypical operations like "new:".
There is a default hierarchy of families (4). If you look at examples/jolt/Object.st, there is a portion of code:
Object : _object ()
UndefinedObject : Object ()
Magnitude : Object ()
Time : Magnitude ( _seconds _nanoseconds )
Number : Magnitude ()
SmallInteger : Number ()
Float : Number ()
Symbol : Object ( _size _elements )
Association : Object ( key value )
Collection : Object ()
SequenceableCollection : Collection ()
ArrayedCollection : SequenceableCollection ( size )
Array : ArrayedCollection ( _oops )
ByteArray : ArrayedCollection ( _bytes )
String : ByteArray ()
OrderedCollection : SequenceableCollection ( firstIndex lastIndex array )
IdentitySet : Collection ( tally lists )
IdentityDictionary : IdentitySet ()
FastIdentityDictionary : IdentityDictionary ()
SlotDictionary : Object ( _size _tally _keys _values default )
StaticBlockClosure : Object ( _function _arity )
BlockClosure : StaticBlockClosure ( outer state _nlr )
SinkStream : Object ()
ReadStream : Object ( collection position readLimit )
WriteStream : ReadStream ( writeLimit )
FileStream : ReadStream ( file )
ConsoleFileStream : FileStream ( _prompt )
File : Object ( _fd )
Function : Object ()
Random : Object ( seed a m q r )Note that this hierarchy will be changed, and also you can change it as you like. Nothing is special about this structure (err, probably a few things are special). Also note that there are a few more family definitions in other files in the source code of current Jolt system.
The static compiler that compiles the Jolt system and creates an executable is "idc" (4) (5). Check the documents.
The IdSt objects that implement the Jolt system are alive and accessible from the interactive shell (6). You can import a prototype (or any object) by using "import" function from the interactive shell:
(define Array (import "Array"))
The definition of "import" is in boot.k. It is very simple. (As I gather, "_libid_import" imports a symbol from the main running executable.) You can define a new family with form:
(define-type MyType Object (inst var names))
and add a method to it with form:
(define [MyType someMethodNameLikeFooBar: anArg]
[anArg + '1])You can modify the behavior of interpreter quite deeply. For example, there is no expression that terminates the interactive shell and let us suppose you would like to implement one. If you look at examples/jolt/main.st, there is the default read-eval-print loop of the interactive shell.
The core part of it (it is in the equivalent of the main() function of the program) reads:
[(expr := scanner next) isNil]
whileFalse:
[Options verbose ifTrue: [StdErr print: expr; cr].
expr := expr eval.
echo ifTrue: [StdErr nextPutAll: ' => '; print: expr; cr]].It (literally) reads the next expression from "scanner" (a CokeScanner), evaluates it, and prints it (if "echo" is true).
As you see, the whileTrue: loop above terminates when it gets nil from the scanner. So, if you type the following two lines into the shell:
(define CokeScanner (import "CokeScanner")) (define [CokeScanner next] 0)
the definition of CokeScanner's next is changed so that it always returns NULL (or nil), therefore the loop terminates. Or, you can define:
(define bye (lambda () (define [CokeScanner next] 0)))
and type "(bye)" to quite the shell.
BTW, I haven't measure things, but as I gather, if you re-define the parts of Jolt written in IdSt from the interactive shell, things may run "faster". The code generated by the idc compiler doesn't look awfully efficient, but the dynamic code generator (it is not very well optimized by the way) bypasses some stuff.
Okay, this was the my basic "objects and how it works" explanation.
A unrelated note: a stumble block I tripped a few times was that the comparison operators in the []-world may return naked 0. For example, "['1 > '2]" returns 0 (or nil). You can simply think that the returned object is UndefinedObject and its representation in the ()-world is 0. As Alex pointed out, you can write:
(define [UndefinedObject m] [StdOut nextPutAll: '"hello world\n"])
then you can call that method on nil:
[['5 > '6] m]
I'm getting some understanding on how the Compiler and native code generation works. Tomorrow, I may write about that, or stick to the original plan and write about the syntax stuff.