%% LyX 2.3.6.1 created this file. For more info, see http://www.lyx.org/.
%% Do not edit unless you really know what you are doing.
\documentclass[12pt,twoside,american]{scrartcl}
\usepackage[T1]{fontenc}
\usepackage{geometry}
\geometry{verbose,tmargin=2cm,bmargin=2cm,lmargin=2cm,rmargin=1cm,footskip=1cm}
\pagestyle{headings}
\setlength{\parindent}{0cm}
\usepackage{color}
\usepackage{babel}
\usepackage{array}
\usepackage{float}
\usepackage{url}
\usepackage{graphicx}
\usepackage{tablefootnote}
\usepackage[unicode=true,
bookmarks=true,bookmarksnumbered=true,bookmarksopen=false,
breaklinks=true,pdfborder={0 0 1},backref=false,colorlinks=true]
{hyperref}
\hypersetup{pdftitle={Internal Secrets of Infocom Games},
pdfauthor={Michael Ko},
pdfsubject={Everything you wanted to know about the internal working of all text-based Infocom games},
pdfkeywords={Infocom,ZIL},
pdfnewwindow=true,pdfstartview=XYZ,plainpages=false,colorlinks=true,linkcolor=black,citecolor=black,pdfpagelabels}
\makeatletter
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands.
\newcommand{\noun}[1]{\textsc{#1}}
%% Because html converters don't know tabularnewline
\providecommand{\tabularnewline}{\\}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% User specified LaTeX commands.
\usepackage[T1]{fontenc}
\makeatother
\usepackage{listings}
\renewcommand{\lstlistingname}{Listing}
\begin{document}
\author{Michael Ko}
\title{Internal Secrets of Infocom Games\thanks{\protect\url{https://ifsecrets.blogspot.com} and \protect\url{https://docs.google.com/document/d/1yNa4M2gN5cG6WoKOSrHtS4_Sp4uwHt1Q7Pvtw8O8hFQ} }}
\subtitle{Everything you wanted to know about the internal working of all text-based
Infocom games}
\date{{[}2022-09-18{]}}
\maketitle
\pagebreak\tableofcontents{}
\section*{\vfill{}
\protect\pagebreak}
\section*{Introduction}
Decades ago, Infocom was synonymous with good storytelling and creative
puzzles. It wasn\textquoteright t the first company to release a text
interactive fiction game, but its first release, Zork I, made it the
flagbearer for all other IF games to follow. Infocom took over the
personal computer market for interactive fiction games by programming
its games in ZIL (Zork Implementation Language), a compact language
that easily handled text and object structures. ZIL programs were
compiled into Z-code, an assembly code that would run on a \textquotedblleft Z-machine\textquotedblright .
Infocom created multiple Z-machine emulators (AKA Zork Interpreter
Program or ZIP) to run on various computers, especially home computers
like the TRS-80, Apple II, and Commodore 64. This approach allowed
a single program to run on different computers without having to create
a separate program for each machine.

Buried in these ZIL programs was the basic structure of how an Infocom
game is organized and executed. Infocom\textquoteright s parser, which
could understand complex commands with such a small amount of code,
seemed miraculous to IF fans. The inner workings of Infocom
games remained a mystery until reverse engineering was done that decoded
the Z-code format. Much of the knowledge about Z-code was documented
in the Z-Machine Standards Document by Graham Nelson and Mark Howell.
It describes the Z-code instructions and the data structures used by
some of them (such as the Vocabulary table used by \texttt{read} or
the object tables used by \texttt{get\_prop}). As it is a general document,
other data structures specific to Infocom games, like verb syntaxes,
are not mentioned. Several excellent utilities (infodump by Mark
Howell and ZILF by Jesse McGrew and Josh Lawrence) have helped provide
more insight into the data structures of Infocom games. Utilities to
translate the Z-code to a more easily readable format did help show
the inner routines of the games. However, without variable names, attribute
references, and object property names, making sense of the game code
remained difficult. An Infocom document for internal use only, \textquotedblleft \emph{Learning
ZIL - or - Everything You Always Wanted to Know About Writing Interactive
Fiction But Couldn't Find Anyone Still Working Here to Ask}\textquotedblright ,
gave basic details about how the games run and how the syntax
parts of the game were designed in ZIL. The release of an MDL-based
Zork source code did offer more insight into the inner workings of
the original Zork game, but understanding MDL was an added barrier.
While there are basic parsing and command processing algorithms in
the MDL-version of Zork, they were significantly changed when translated
to the microcomputer versions. The release of the \textbf{Mini-Zork}
source code and other tidbits from the Infocom cabinet helped unmask
and clarify the game code and the coding process that are essential
to all Infocom games.

This blog will describe those hidden details of Infocom games, using
\textbf{Zork 1} as the base. It will also describe some of the modifications
made to these core routines by specific Infocom games. Subsequent
posts will describe the inner workings of the puzzles in all Infocom
games. Only the text games are covered; none of the graphics-based
games are analyzed.

All routine and variable names in the Infocom source code (\textbf{Mini-Zork})
and documentation are in all capital letters. Game names are in bold.
Unnamed variables and routines are given names that are a best guess.
Apologies for any errors in these posts. They'll be fixed if possible.
\section{ZIL, ZILCH, ZAP, and ZIP}
\subsection{Introduction}
ZIL is a derivative of MDL (MIT Design Language), a LISP-like language,
with a minimal set of instructions needed to create IF games. The
ZIL source code of an Infocom game was run through the
ZIL compiler (ZILCH) to create Z assembly language code. This assembly
code was then sent through the Z Assembler Program (ZAP) to create
the actual Z-code. No copies of ZILCH or ZAP software have ever been
released or leaked to the public. There is also very little documentation
about these programs. The Infocom Cabinet files only mention some
compilation flags used in ZILCH and the basic function of ZAP. The
final Z-code could be run on a \textquotedblleft Z-machine\textquotedblright{}
or any computer that emulated one using a Zork Interpreter Program
(ZIP). Different ZIP versions have different memory and processing
requirements. All have frequently used and writable data in their
\textquotedblleft core\textquotedblright{} memory which enables quicker
execution of the program. Less used and read-only data is read from
a storage medium (file or disk) on an as-needed basis.

Throughout the history of Infocom, six versions of the Z-code were
created, but only the first five were used for text-only games.
\subsection{ZIP version 1}
Many of the Z-code instructions used in Infocom games are in this
first version of ZIP. It is unclear what the memory requirements are
for it though. Only one ZIP 1 game, \textbf{Zork 1}, was ever released
by Infocom.
\subsection{ZIP version 2}
ZIP 2 was used to create \textbf{Zork 1}-R15 and the first version
of \textbf{Zork 2}. It introduced frequent words (AKA abbreviations) and a
new header element (serial number), which is just a 6-digit ASCII representation
of the release date in year-month-day format.
\subsection{ZIP version 3}
Version 3 was used to create a majority of the Infocom games. Infocom
documents indicate that a version 3 compatible ZIP should have a minimum
of 40K (maybe 32K) of core memory and a floppy drive with at least
80K of storage. Due to limitations of the ZIP 3 format, the maximum
size of a game file is 128K. Also, the response time should be a few
seconds for the average command.

\textbf{Deadline} was the first new game using this version. This version
increased the number of abbreviations and added more output functions.
Even after ZIP 4 and 5 were introduced, Infocom would produce 24 games
using ZIP 3.

For output, ZIP 3 compatible programs had 2 or 3 \textquotedblleft windows\textquotedblright{}
where text could be displayed instead of the 1 in ZIP 1 and 2. One of
these, the status bar, is not usually used unless a special flag bit
is set in the header. Most games create the status bar in the main,
or lower, window (0). Two additional output streams were created. One
\textquotedblleft streams\textquotedblright{} the text into a specified
memory block. The other was called a command script, which recorded
the input and output of the game.

Finally, game verification was added by storing a file length and checksum
in the header. The VERIFY command would check the game data file
against these values.
\subsection{Enhanced ZIP (EZIP), version 4}
Version 4, renamed Enhanced ZIP (EZIP), was introduced with \textbf{A
Mind Forever Voyaging}. A smaller version of EZIP, called LZIP (or
lower-case EZIP), was apparently created for computers with memory limitations.
The minimal hardware requirements for EZIP were at least 128KB of
memory and a single disk drive that could access at least 140KB. The
computer system needed upper and lower case characters and
a screen that could display 80 characters across with at least 14 lines. These
requirements would allow games to take only a few seconds to complete
a typical \textquotedblleft go\textquotedblright{} command. The maximum
size of a game file is 256K.

EZIP increased the memory space to allow more objects and rooms along
with longer words (up to 9 characters) in the vocabulary. Several
new Z-code opcodes also simplified some of the table-related coding
for these games, offered more ways to call routines, and gave more
control over how text is displayed on the screen by dividing it up into
various windows.

The specs of EZIP limited these games to an IBM PC, Macintosh, enhanced
Apple //e, C128, Amiga, and Atari ST. Four games were released using
this version. No LZIP versions of games are known to exist.
\subsection{Extended ZIP (XZIP), version 5}
Introduced with \textbf{Beyond Zork}, version 5 (XZIP) allowed for games with
more objects and rooms and added more opcodes for routine calling,
text, and table manipulations. \textbf{Border Zone} and \textbf{Sherlock} would be the
only other new games to use this version. Infocom would also release its
Solid Gold (AKA Greatest Hits) series, which consisted of re-releases of past
popular games in the XZIP compatible format. However, these games
did not utilize the added functionality of XZIP.
\subsection{YZIP, version 6}
YZIP (successor to XZIP) is version 6, which mainly added mouse, graphical
window, and menu functions to the Z-code. The graphics-based Infocom
games used this version and will not be discussed.
\section{General structure of Infocom Games and Data Structures }
\subsection{Introduction and Story Headers}
Since version 1 of ZIP, the story header has been a fixed 64-byte block
at the start of every Infocom game file that gives the
location of specific data and code. The header for ZIP 1 games used
the first 9 words/18 bytes. Subsequent ZIP versions would add more
values into the header. XZIP (Version 5) would use all 32 words (64
bytes).

The original story header holds information about the game and addresses
of specific types of data. Details are in Appendix A. The first word,
ZVERSION, ensures that the ZIP is compatible with the given story
file and emulates the proper Z-machine version. The version number
is located in the first byte. The second byte has the mode flags, which
are set by the ZAP assembler (for bits 0 and 1) and by the ZIP at startup
(for the remaining bits). The START address is that of the first operation
to be executed by the ZIP. It points to the first instruction and
not the start of the routine. The next three addresses are used by
various instructions. All the object-related operators use the
address in OBJECT to manipulate the object data. The READ instruction
uses the Vocabulary data at VOCAB to match and tokenize the characters
in the input buffer. GLOBALS points to the table holding the global
variable data.
ZIP 1 Story Header:
\begin{table}
\begin{tabular}{|l|l|l|}
\hline
Word & Name & Function\tabularnewline
\hline
\hline
00 & ZVERSION & %
\begin{tabular}{l}
Byte 0: Z-machine version \tabularnewline
Byte 1: Z-machine mode\tabularnewline
\end{tabular}\tabularnewline
\hline
01 & ZORKID & Release number\tabularnewline
\hline
02 & ENDLOD & End address of pre-loaded memory, start of variable (or exchangeable
memory)\tabularnewline
\hline
03 & START & Address of first instruction (not routine), Program Counter\tabularnewline
\hline
04 & VOCAB & Address to Vocabulary table\tabularnewline
\hline
05 & OBJECT & Address to Object data\tabularnewline
\hline
06 & GLOBALS & Address to Global Variable table\tabularnewline
\hline
07 & PURBOT & Start address of read-only (static) memory\tabularnewline
\hline
\end{tabular}
\caption{ZIP 1 Story Header}
\end{table}
Other information would be added to the story header with newer ZIP
versions. The complete layout is in Appendix A.
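
The header words in the table above can be read with a few lines of
code. The following Python sketch is only an illustration and not part
of any Infocom tool. It assumes that header words are stored high byte
first (big-endian), as described in the Z-Machine Standards Document.
The function name \texttt{parse\_header}, as well as the release number
and START value in the sample, are invented for the example.
\begin{lstlisting}[language=Python,basicstyle={\ttfamily}]
import struct

# Names of header words 1-7, taken from the table above.
FIELDS = ("ZORKID", "ENDLOD", "START", "VOCAB",
          "OBJECT", "GLOBALS", "PURBOT")


def parse_header(story):
    """Return the ZIP 1 header fields of a story file as a dict."""
    # Assumption: every header word is big-endian (high byte first).
    words = struct.unpack(">8H", story[:16])
    header = {"VERSION": words[0] >> 8,   # byte 0 of ZVERSION
              "MODE": words[0] & 0xFF}    # byte 1 of ZVERSION
    header.update(zip(FIELDS, words[1:]))
    return header


# Hypothetical header. The table addresses are the Zork 1 values
# quoted in this section; release number and START are invented.
sample = struct.pack(">8H", 0x0100, 5, 0x5DEC, 0x46CD,
                     0x35DF, 0x0040, 0x203B, 0x2C29)
info = parse_header(sample + bytes(48))   # pad to 64 bytes
print(hex(info["OBJECT"]), hex(info["PURBOT"]))
\end{lstlisting}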
\subsection{The Memory Layout of Infocom Games }
The layout of data in the Infocom game files is fairly consistent
between games. An example using \textbf{Zork 1} is shown below. The
\textquotedblleft memory\textquotedblright{} of the emulated Z-machine
is first loaded with the header information starting at \$0000. Memory
from \$0000 up to PURBOT encompasses the writeable memory such as
variables, objects, tables, and buffers. Data after PURBOT is read-only
data such as syntax data, routines, preposition data, and strings.
The exact locations of the syntax data, action routines, and preposition
table are stored in global variables, not the header. The ZIP will
continue to load data into memory. All data up to the address in ENDLOD
must be loaded. The ZIP may decide to load data past ENDLOD at its
discretion. All data after ENDLOD can be swapped out with other disk-based
data as needed. Usually routines and strings are the data that is
swapped. All important and quickly accessible data remains resident
in memory below the ENDLOD address. A sample layout of the data structures
in \textbf{Zork 1} (obtained with infodump) gives more details:
\begin{verbatim}
Base   End    Size
0000   3F     40    Story file header
0040   7D     3E    Objects - Property Defaults (Address at OBJECT, word 5)
007E   974    8F7   Objects - Entries
0975   203A   16C6  Objects - Property data
203B   221A   1E0   Global Variables (Address at GLOBALS, word 6)
221B   2C28   A0E   Misc Data (such as tables and buffers)
2C29   2CFE   D6    Syntax - Pointer table (Address at PURBOT)
2CFF   33D4   6D6   Syntax - Entries (referred to by the Pointer Table)
33D5   34B6   E2    Action routine table (Address in global variables)
34B7   3598   E2    Pre-action routine table (Address in global variables)
3599   35DE   46    Preposition table (Address in global variables)
35DF   46CA   10EC  Vocabulary (Address at VOCAB, word 4)
46CC   EDC5   A6FA  Routines (Load up to ENDLOD, $5DEC here)
EDC6   14328  5563  Strings
14329  143FF  D7    Empty (fill up to the end of memory page)
\end{verbatim}
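
The roles of PURBOT and ENDLOD can be summarized in a small Python
sketch. The function name \texttt{memory\_region} and the region labels
are made up for this illustration, and the sketch assumes that ENDLOD
is the first address that may be swapped out.
\begin{lstlisting}[language=Python,basicstyle={\ttfamily}]
def memory_region(addr, purbot, endlod):
    """Classify a story-file address by how the ZIP treats it."""
    if addr < purbot:
        return "writable"               # variables, objects, tables
    if addr < endlod:
        return "read-only, resident"    # always kept in memory
    return "read-only, swappable"       # loaded from disk on demand


PURBOT, ENDLOD = 0x2C29, 0x5DEC         # Zork 1 values quoted above
for addr in (0x203B, 0x35DF, 0xEDC6):
    print(hex(addr), memory_region(addr, PURBOT, ENDLOD))
\end{lstlisting}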
\subsection{The Massive Object Table}
Objects are the heart of the Infocom games and are the first data
after the story header. Storing them in a logical and efficient manner
for the game to access is important. Each object contains information
such as the name, properties and attributes. It will also have a parent
and possibly siblings. The object\textquoteright s parent is usually
a room or container that contains the object. Siblings are objects
that are in the same room or container. Rooms are also considered
objects and have a generic ROOMS object as the parent for all rooms.
Attributes are true/false bits that set various characteristics of
an object, such as whether it is takable (TAKEBIT) or a source of fire (FLAMEBIT).
They are numbered 0 to 31, and every object entry stores all 32 bits.
Properties are like attributes but hold groups of bytes instead of
a bit. The maximum size for a property is 8 bytes. If no property
is given for an object, the default property value (a word) is used.
Properties are numbered 1 to 31. Objects are numbered from 1 to 255.
\newpage{}
\begin{itemize}
\item Default Property Table:
\begin{itemize}
\item 31 words that hold the default property values, used when a property is not explicitly
set for a specific object
\end{itemize}
\item Object Entries - repeated for all objects
\begin{itemize}
\item 4 bytes representing the 32 attribute bits (\#0=top bit of byte 0,
\#31=bottom bit of byte 3)
\item 1 byte each for object number of the parent, sibling, and child of
the object
\item 1 word address to that object\textquoteright s property table
\end{itemize}
\item Property Table - repeated for all objects with their own property
table
\begin{itemize}
\item 1 byte for size of the object name in words
\item Z-string of the object name
\item 1 byte for property info (LLLNNNNN)
\begin{itemize}
\item bits 5-7 = length of the property minus 1 (a length of 1-8 bytes is represented as
0-7)
\item bits 0-4 = property number
\end{itemize}
\item 1-8 bytes of property data
\item The 1 byte property info and 1-8 property data are repeated for each
property, listed in descending order
\end{itemize}
\end{itemize}
Example object property table:
\begin{figure}[H]
\includegraphics[width=1\textwidth]{figures/PropertyTableExample}
\caption{Property Table Example}
\end{figure}
The meaning of an attribute or property number is consistent across all
objects in a game but not between games. For example,
the attribute bits and properties used in \textbf{Zork 1} are listed
in Appendices B and C. The number of objects is not stored explicitly
but can be calculated from the number of object entries: take the
difference between the addresses of the
first object entry and the first property table and divide it by the 9 bytes
of each entry.
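
The layout above might be decoded as in the following Python sketch.
It is an illustration under stated assumptions rather than the code
of an actual interpreter: multi-byte values are taken to be big-endian
and a property list is taken to end at a zero size byte (both as described
in the Z-Machine Standards Document), and all function names and sample
bytes are hypothetical.
\begin{lstlisting}[language=Python,basicstyle={\ttfamily}]
import struct

ENTRY_SIZE = 9   # 4 attribute bytes, 3 object numbers, 1 address


def parse_entry(entry):
    """Split one 9-byte object entry into its fields."""
    # Assumption: multi-byte values are big-endian.
    fields = struct.unpack(">IBBBH", entry)
    names = ("attrs", "parent", "sibling", "child", "props")
    return dict(zip(names, fields))


def has_attribute(attrs, number):
    """Attribute 0 is the top bit of byte 0, 31 the bottom of byte 3."""
    return bool(attrs & (0x80000000 >> number))


def parse_properties(mem, addr):
    """Return (name length in words, {property number: data})."""
    name_words = mem[addr]
    addr += 1 + 2 * name_words          # skip the Z-string name
    props = {}
    while mem[addr] != 0:               # assumed: zero byte ends list
        size = (mem[addr] >> 5) + 1     # bits 5-7 hold length - 1
        number = mem[addr] & 0x1F       # bits 0-4 hold the number
        props[number] = bytes(mem[addr + 1:addr + 1 + size])
        addr += 1 + size
    return name_words, props


def object_count(first_entry, first_property_table):
    """Number of 9-byte entries between the two addresses."""
    return (first_property_table - first_entry) // ENTRY_SIZE


# Hypothetical entry: attributes 0 and 31 set, parent 16, sibling 17,
# no child, property table at offset 0 of the sample table below.
entry = bytes([0x80, 0x00, 0x00, 0x01, 16, 17, 0, 0x00, 0x00])
obj = parse_entry(entry)
# Hypothetical property table: a 1-word name, property 18 (2 bytes),
# property 5 (1 byte), then the terminating zero byte.
table = bytes([1, 0x94, 0xA5, 0x32, 0x46, 0xDC, 0x05, 0x07, 0x00])
name_words, props = parse_properties(table, obj["props"])
print(obj["parent"], has_attribute(obj["attrs"], 31), props)
print(object_count(0x007E, 0x0975))
\end{lstlisting}
With the \textbf{Zork 1} addresses from the memory layout (\$007E and
\$0975), \texttt{object\_count} gives 255 entries.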
\subsection{Global Variables}
The section following the objects is a 480-byte block corresponding
to 240 global variables, each holding a word. By default in Infocom
version 1-3 games, variable 0 contains the object number of the player\textquoteright s
current location. Variable 1 is the score while variable 2 is the
number of turns or game time. A game does not need to use all 240
global variables and can use part of this memory block for other variable
data such as tables and buffers. The addresses to these other data
structures as well as addresses to specialized grammar structures
like syntax entries, preposition table, and action routine table are
typically stored in global variables.
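
Reading a global variable then amounts to a simple address calculation.
The Python sketch below is illustrative only: the function name \texttt{read\_global}
and the sample values are hypothetical, and words are assumed to be
stored big-endian.
\begin{lstlisting}[language=Python,basicstyle={\ttfamily}]
def read_global(mem, globals_addr, number):
    """Return global variable `number` (0-239) as a 16-bit word."""
    if not 0 <= number < 240:
        raise ValueError("global variable number out of range")
    addr = globals_addr + 2 * number    # each variable is one word
    # Assumption: the word is stored big-endian (high byte first).
    return (mem[addr] << 8) | mem[addr + 1]


# Hypothetical 480-byte block: location 82, score 10, 3 turns.
block = bytes([0x00, 0x52, 0x00, 0x0A, 0x00, 0x03]) + bytes(474)
location = read_global(block, 0, 0)
score = read_global(block, 0, 1)
turns = read_global(block, 0, 2)
print(location, score, turns)
\end{lstlisting}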
\subsection{Tables and Buffers}
Between the last global variable and the start of the static data
(at PURBOT), there is usually extra memory that is used to store custom
data structures. These are typically tables and buffers. Tables are
a fixed set of words where the first word (element 0) contains the
number of elements in the table. Examples include tables to hold the
direct or indirect object numbers. A buffer is just a fixed set of
bytes used to hold any type of data. The most common buffers used
in Infocom games are the input buffer (INBUF) or token buffer (LEXV).
These data structures are not standard to ZIL and are specific to
Infocom games. For ZIP 1-3 games, these tables are located in the
preloaded memory of the computer. For EZIP and XZIP games, tables
can also be located amongst the game routines. The addresses to these
buffers and tables are usually stored in specific global variables
but can sometimes be hardcoded into the routines that use them. If
a game does not need all 240 global variables, the space for those
unused variables can be used for buffers and tables. Infocom games
have their own custom ZIL routines for reading and writing to these
structures.
\subsection{Vocabulary}
The vocabulary contains information about every valid word in the
game. It begins with a header which contains the information about
separator characters and the entries themselves.
\begin{itemize}
\item 1 byte - N number of separator characters (used to mark the end of
words)
\item N bytes - list of N separator characters
\item 1 byte - length of each vocabulary entry (Z-string + word type data)
\begin{itemize}
\item ZIP 1-3 games: 4 bytes for Z-string
\item EZIP and XZIP games: 6 bytes for Z-string
\item All games use 3 bytes for the word type data except \textbf{Sherlock} (uses
2 bytes)
\end{itemize}
\item 2 bytes - number of vocabulary entries
\end{itemize}
A long list of vocabulary entries will follow this header. Each entry
starts with a 4 or 6 byte Z-string of the word. The remaining 2 or
3 bytes contain:
\begin{itemize}
\item 1 byte for word type:
\end{itemize}
\medskip{}
\begin{tabular}{|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|}
\hline
\multicolumn{6}{|c|}{Primary word type} & \multicolumn{2}{c|}{Secondary word type}\tabularnewline
\hline
Bit 7 & Bit 6 & Bit 5 & Bit 4 & Bit 3 & Bit 2 & Bit 1 & Bit 0\tabularnewline
\hline
\$80 & \$40 & \$20 & \$10 & \$08 & \$04 & \$00 = Noun & \$02 = Adjective\tabularnewline
\hline
Noun & Verb & Adjective & Direction & Preposition & Special & \$01 = Verb & \$03 = Direction\tabularnewline
\hline
\end{tabular}
\medskip{}
\begin{itemize}
\item 1 byte for second ID value - The secondary ID value if a secondary/non-default
word type is requested (original format, all games except \textbf{Sherlock})
\item 1 byte for default ID value - The default ID value
\end{itemize}
ID values are special numbers that correspond to the token, such
as a verb number for a verb. For example, the default ID value for
the noun \textquotedblleft MATCH\textquotedblright{} would be its
object number. During word type matching, the default value is returned
if there is a match.

If a secondary word type is also given and matches the secondary word
type of the entry, then the secondary ID value is returned. If it does
not match, then the default one is returned. For example, \textquotedblleft INFLAT\textquotedblright{}
has its last 3 bytes as {[}62 a8 d3{]}. Bits 7-2 of the word type
byte (\$60) indicate this token can be a verb (\$40) or adjective (\$20).
The last 2 bits (\$02 in this example) indicate the secondary word
type, adjective. So the default word type is a verb. When the program
checks if \textquotedblleft INFLAT\textquotedblright{} is a verb or
adjective:
\begin{lstlisting}[basicstyle={\ttfamily}]
call wt?("INFLAT",$40) or <WT? "INFLAT", PS?VERB>
call wt?("INFLAT",$20) or <WT? "INFLAT", PS?ADJECTIVE>
\end{lstlisting}
the word type matching routine will return the default, \$d3. If a
second word type (\$02) is also passed to the matching routine:
\begin{lstlisting}[basicstyle={\ttfamily}]
call wt?("INFLAT",$40,$02) or <WT? "INFLAT", PS?VERB, P1?ADJECTIVE>
call wt?("INFLAT",$20,$02) or <WT? "INFLAT", PS?ADJECTIVE, P1?ADJECTIVE>
\end{lstlisting}
then the routine will return \$a8. If the second word type does not
match, such as \$03 (for direction), then the default, \$d3, will
be returned.
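
The matching rules described above can be condensed into a short Python
sketch. It follows the description in this section and is not a translation
of Infocom's ZIL routine. The function name \texttt{wt}, the constant
names, and the \texttt{None} result for a word of the wrong type are
assumptions made for the example.
\begin{lstlisting}[language=Python,basicstyle={\ttfamily}]
PS_VERB, PS_ADJECTIVE, PS_DIRECTION = 0x40, 0x20, 0x10
P1_ADJECTIVE, P1_DIRECTION = 0x02, 0x03


def wt(type_data, word_type, secondary=None):
    """Look up an ID value in the 3 word type bytes of an entry.

    Returns None when the word is not of the requested type
    (an assumption; the text does not cover this case).
    """
    flags, second_id, default_id = type_data
    if not flags & word_type:
        return None
    if secondary is not None and secondary == (flags & 0x03):
        return second_id        # secondary type matches bits 0-1
    return default_id


INFLAT = bytes([0x62, 0xA8, 0xD3])      # last 3 bytes of the entry
print(hex(wt(INFLAT, PS_VERB)))
print(hex(wt(INFLAT, PS_VERB, P1_ADJECTIVE)))
print(hex(wt(INFLAT, PS_ADJECTIVE, P1_DIRECTION)))
\end{lstlisting}
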
The E/XZIP vocabulary introduced 6 bytes for the Z-string, which allowed
for longer tokens (maximum of 9 characters). The size of each entry
in Vocabulary grew to 9 bytes. A newer compressed vocabulary entry
format was also created which used a single ID value. The need for
a second ID value mainly occurred with tokens that can be a direction
and a preposition. This format only had 8-byte entries. Only \textbf{Sherlock}
used this compressed vocabulary format and had modified syntax routines
which used the PREP table to look for the second ID value (preposition
value) in situations where a token could be a direction or preposition.
\subsection{Strings}
Much of an Infocom game file consists of the various descriptions and
text responses, which are stored as strings in the special Z-character
format. It is a compressed format where the characters are represented
by 5 bits instead of 8. This allows 3 letters to be stored in the
space of 2 bytes instead of 3. More detailed information including
the character sets is in the Z-Machine Standards Document.
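
To illustrate the packing, the following Python sketch splits 2-byte
words into three 5-bit Z-characters. It relies on two details from the
Z-Machine Standards Document rather than from this text: the top bit
of a word marks the end of the string, and in the basic alphabet Z-character
0 is a space while 6 to 31 stand for a to z. Shift and abbreviation
codes are not handled, and the function names are hypothetical.
\begin{lstlisting}[language=Python,basicstyle={\ttfamily}]
def unpack_zchars(data):
    """Split packed 16-bit words into 5-bit Z-characters."""
    zchars = []
    for i in range(0, len(data) - 1, 2):
        word = (data[i] << 8) | data[i + 1]
        zchars += [(word >> 10) & 0x1F,
                   (word >> 5) & 0x1F,
                   word & 0x1F]
        if word & 0x8000:       # top bit marks the last word
            break
    return zchars


def lowercase_text(zchars):
    """Decode only the basic alphabet: 0 is a space, 6-31 are a-z."""
    out = []
    for z in zchars:
        if z == 0:
            out.append(" ")
        elif z >= 6:
            out.append(chr(ord("a") + z - 6))
        else:
            out.append("?")     # shift/abbreviation codes: skipped
    return "".join(out)


packed = bytes([0x3A, 0x6B, 0xC4, 0xD9])
print(lowercase_text(unpack_zchars(packed)))
\end{lstlisting}
The four sample bytes decode to \texttt{inflat}, a 6-letter word stored
in 4 bytes.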
\section{Syntax Entries - The Biggest Mystery of them All}
\subsection{Introduction}
Probably the most innovative part of Infocom games was their ability
to understand commands written in conversational English. The different
types of grammar information were discussed in the \textquotedblleft Learning
ZIL\textquotedblright{} document. The structure of syntax entries
in ZIL was shown, but the layout in the Z-code files was not mentioned.
Infodump and ZILF do provide some extra information about the syntax
structure. There are 3 additional grammar-related data blocks in the
game not mentioned in the header: prepositions, syntax pointer table,
and syntax entries.
\subsection{Prepositions}
There is a separate table of prepositions to speed up the syntax matching
process.
\begin{itemize}
\item 1 word for number of prepositions
\item 2-word entries: address of preposition in vocabulary and preposition
number
\end{itemize}
The prepositions are numbered from \$FF and decrease. The address
to this table is stored as a global variable. EZIP and XZIP use a
compact form of the Preposition table that used a byte instead of
a word for the preposition ID number.
\subsection{Syntax Entry Pointer Table}
The syntax entries are probably the most confusing part of Infocom
games thanks to the lack of documentation. They provide the syntax
structure for a particular action. To find the matching syntax entry,
the verb number is needed. Since verbs can have synonyms, different
verbs can have the same verb number like \textquotedblleft GET\textquotedblright{}
and \textquotedblleft TAKE\textquotedblright . So they would use the
same syntax entries. The syntax entry pointer table lists the address where
the group of syntax entries for each verb number is located.
This table is just a block of addresses in which the first address
belongs to verb number \$FF. Subsequent addresses correspond to smaller verb numbers.
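
The lookup can be pictured with the Python sketch below. It is only
an illustration: the function name \texttt{syntax\_group\_address} and
the sample addresses are hypothetical, and each address is assumed to
be a big-endian word.
\begin{lstlisting}[language=Python,basicstyle={\ttfamily}]
def syntax_group_address(mem, table_addr, verb):
    """Return the address of the syntax entries for a verb number."""
    slot = table_addr + 2 * (0xFF - verb)   # $FF is the first slot
    # Assumption: addresses are stored big-endian (high byte first).
    return (mem[slot] << 8) | mem[slot + 1]


# Hypothetical pointer table with the slots for verbs $FF to $FD.
table = bytes([0x2C, 0xFF, 0x2D, 0x38, 0x2D, 0x51])
print(hex(syntax_group_address(table, 0, 0xFE)))
\end{lstlisting}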
\subsection{Syntax Entries}
PARSER will then look through each of the syntax entries for the matched
verb number and return the entry that best matches the
given prepositions (if any) and noun clause types. For example, the
syntax entry table for verb number \$F3 (or GET) has 7 different syntax
entries. The syntaxes of \textquotedblleft GET object\textquotedblright ,
\textquotedblleft GET object from object\textquotedblright , and \textquotedblleft GET
on object\textquotedblright{} would correspond to 3 different entries.
To store this grammatical information, the group of entries starts
with a byte giving the number of entries for that verb. It is followed
by one 8-byte entry for each acceptable grammatical combination:\medskip{}
\begin{tabular}{|l|l|}
\hline
Byte & Contents\tabularnewline
\hline
\hline
0 & Number of object clauses\tabularnewline
\hline
1 & Prep number for direct object\tabularnewline
\hline
2 & Prep number for indirect object\tabularnewline
\hline
3 & GWIMBIT number for direct object\tabularnewline
\hline
4 & GWIMBIT number for indirect object\tabularnewline
\hline
5 & LOC byte for direct object\tabularnewline
\hline
6 & LOC byte for indirect object\tabularnewline
\hline
7 & Action Number\tabularnewline
\hline
\end{tabular}
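
A group of syntax entries laid out as in the table above could be read
as in this Python sketch. The field names, the function name \texttt{read\_syntax\_group},
and the sample bytes are hypothetical and chosen only for illustration.
\begin{lstlisting}[language=Python,basicstyle={\ttfamily}]
from collections import namedtuple

# One field per byte of an entry, in the order of the table above.
SyntaxEntry = namedtuple(
    "SyntaxEntry",
    "objects prep1 prep2 gwim1 gwim2 loc1 loc2 action")


def read_syntax_group(mem, addr):
    """Read the count byte and the 8-byte entries that follow it."""
    count = mem[addr]
    entries = []
    for i in range(count):
        start = addr + 1 + 8 * i
        entries.append(SyntaxEntry(*mem[start:start + 8]))
    return entries


# Hypothetical group of two entries for one verb.
group = bytes([2,
               1, 0, 0, 0, 0, 0x30, 0, 0x5E,
               2, 0, 0xF6, 0, 12, 0x30, 0xC0, 0x5F])
for entry in read_syntax_group(group, 0):
    print(entry.objects, hex(entry.prep2), hex(entry.action))
\end{lstlisting}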
\subsection{Get What I Mean (GWIM) Feature}
The GWIMBIT number is used in the FIND feature mentioned in section
9.5 of \textquotedblleft Learning ZIL\textquotedblright{} to find
unspecified but necessary objects in a command. PARSER will attempt
to find an object in the current location with a set attribute flag
corresponding to the GWIMBIT number. If only one object is found to
match, it will assume the user meant that object and use it in the
given command. If no object or more than one object matches, PARSER
will ask for clarification (called orphaning). The player can then
give a clarifying answer without retyping the entire previous command
or type a completely new command.
For example, the syntax entry for \textquotedblleft IGNITE OBJ WITH
OBJ\textquotedblright{} has a GWIMBIT number for the indirect object
set to the FLAME bit. If that entry is the best syntax match for the command
\textquotedblleft IGNITE TORCH\textquotedblright , the indirect object
is still missing. PARSER will try to find an object with the FLAME
bit set to use as the indirect object. If exactly one object in the current
location has the FLAME bit set (like a lantern), PARSER will assume
the indirect object is the lantern. The command is then treated as
\textquotedblleft IGNITE TORCH WITH LANTERN\textquotedblright{}
and the indirect object is set to LANTERN.
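The decision rule described above (use the object only when exactly
one candidate carries the attribute, otherwise orphan the command)
might be sketched in Python as follows. The function name \texttt{gwim},
the dictionary of room contents, and the attribute names are all
hypothetical stand-ins; the real routine works on object numbers and
attribute bits.
\begin{verbatim}
def gwim(objects_here, gwimbit):
    """Return the one object carrying gwimbit, or None if PARSER must ask.

    objects_here maps an object name to the set of attribute flags it has.
    """
    matches = [name for name, flags in objects_here.items()
               if gwimbit in flags]
    if len(matches) == 1:
        return matches[0]
    return None  # zero or several candidates: orphan the command


# Hypothetical room contents for the IGNITE TORCH example.
ROOM = {"TORCH": {"TAKEBIT"}, "LANTERN": {"TAKEBIT", "FLAME"}}

indirect = gwim(ROOM, "FLAME")
print("IGNITE TORCH WITH", indirect)
\end{verbatim}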
\subsection{Location Restriction of Objects}
The LOC byte is probably the most mysterious value in the syntax entry.
The highest 7 bits indicate how PARSER searches and checks on requested
objects. For example, PARSER will not complete an action with an object
on the ground if its syntax entry requires an object be held or carried.
While \textquotedblleft Learning ZIL\textquotedblright{} has listed
9 possible properties, the source code for \textbf{Mini-Zork} indicates
only 7 are used through version 3:\medskip{}
\begin{tabular}{|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|>{\centering}p{0.1\columnwidth}|}
\hline
Bit 7 & Bit 6 & Bit 5 & Bit 4 & Bit 3 & Bit 2 & Bit 1 & Bit 0\tabularnewline
\hline
\multicolumn{4}{|c|}{Location-related Flags} & \multicolumn{4}{c|}{Possession-related Flags}\tabularnewline
\hline
\$80 & \$40 & \$20 & \$10 & \$08 & \$04 & \$02 & \$01\tabularnewline
\hline
{\footnotesize{}HELD} & {\footnotesize{}CARRIED} & {\footnotesize{}ON-GROUND} & {\footnotesize{}IN-ROOM} & {\footnotesize{}TAKE} & {\footnotesize{}MANY} & {\footnotesize{}HAVE} & {\footnotesize{}Not used}\tabularnewline
\hline
{\scriptsize{}At top level and not inside another container} & {\scriptsize{}Not at top level, contained inside another object} & {\scriptsize{}At top level of a room and not inside another container} & {\scriptsize{}Not at top level, contained inside another object on
the ground} & {\scriptsize{}Will automatically TAKE object in the current location
if necessary before using it} & {\scriptsize{}Multiple objects are allowed for a particular action} & {\scriptsize{}Must already be in the user\textquoteright s possession} & \tabularnewline
\hline
\end{tabular}\medskip{}
When the GWIM feature tries to search for an unspecified object,
that routine needs to know how far to search. This is indicated by
the flags for HELD, CARRIED, IN-ROOM, and ON-GROUND which guide how
the function, SEARCH-LIST, finds objects in the given location (a
room or the user). More details will be given in section 13.2.
\textquotedblleft Learning ZIL\textquotedblright{} also mentions
EVERYWHERE and ADJACENT options, but there is no evidence that they
were ever used in version 1-5 games, as confirmed by the internal Infocom
documents on ZIL. They may have been used in the graphical
YZIP-based games.
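As a quick aid for reading LOC bytes, the bit table above can be turned
into a small decoder. This is a sketch built only from that table; the
names \texttt{LOC\_FLAGS} and \texttt{decode\_loc} are made up for
illustration.
\begin{verbatim}
# (mask, name) pairs taken from the bit table; bit 0 is not used.
LOC_FLAGS = [
    (0x80, "HELD"),
    (0x40, "CARRIED"),
    (0x20, "ON-GROUND"),
    (0x10, "IN-ROOM"),
    (0x08, "TAKE"),
    (0x04, "MANY"),
    (0x02, "HAVE"),
]


def decode_loc(loc_byte):
    """Return the names of all flags set in a LOC byte."""
    return [name for mask, name in LOC_FLAGS if loc_byte & mask]


print(decode_loc(0x64))
print(decode_loc(0x34))
\end{verbatim}
With the table as given, \$64 decodes to CARRIED, ON-GROUND, and MANY,
and \$34 decodes to ON-GROUND, IN-ROOM, and MANY.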
\subsection{Pre-actions and Actions}
The action number is used to look up the routine address from the
ACTION and PRE-ACTION tables. These tables sequentially list the packed
addresses without any reference table. All syntaxes that refer to a similar
action (INSERT DOWN object, DROP object, and SPILL object IN object)
will use the same action number. INSERT object ON object and INSERT
object UNDER object use two different action numbers as the game processes
these actions differently. The action number is also used to look up
the address for the pre-action routine if one exists (if not, \$0000
is used). A pre-action routine can check the objects, variables, or
game status before a particular action routine is called. The same
pre-action routine can be used with different action routines.
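The lookup might be sketched as below, assuming each table is a plain
array of 2-byte big-endian words indexed directly by the action number.
The helper names, the table addresses, and the tiny memory image are
hypothetical. The \texttt{scale} argument reflects the Z-machine rule
that a packed address is multiplied by 2 in versions 1 to 3 and by
4 in versions 4 and 5.
\begin{verbatim}
def read_word(memory, addr):
    """Z-machine words are stored big-endian."""
    return (memory[addr] << 8) | memory[addr + 1]


def routine_address(memory, table_addr, action, scale=2):
    """Return the routine's byte address, or None for a $0000 entry."""
    packed = read_word(memory, table_addr + 2 * action)
    if packed == 0:
        return None
    return packed * scale


# Hypothetical memory image: ACTION table at 0, PRE-ACTION table at 8.
memory = bytearray(16)
ACTIONS, PREACTIONS = 0, 8
memory[ACTIONS + 2:ACTIONS + 4] = bytes([0x12, 0x34])  # action 1
# The PRE-ACTION entry for action 1 stays $0000: no pre-action routine.

print(hex(routine_address(memory, ACTIONS, 1)))
print(routine_address(memory, PREACTIONS, 1))
\end{verbatim}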
\subsection{An Example}
Multiple verbs have the same verb number such as CARRY, GET, and TAKE
in this example. The verb number will then correspond to a group of syntax
entries. Here only 3 are used:
\begin{verbatim}
[02 00 f0 11 00 64 00 39] "carry OBJ from OBJ"
[01 f9 00 00 00 00 00 31] "carry out OBJ"
[01 00 00 11 00 34 00 39] "carry OBJ"
\end{verbatim}
In the first example, the \$02 indicates that two noun clauses are required
for that syntax. The next two bytes indicate the required prepositions
for the noun clauses. The \$00 indicates no preposition before the
direct object clause. The \$F0 refers to the preposition for the indirect
object clause, FROM in this case. The GWIMBIT number \$11 is the attribute
to check on objects if the direct object is missing. There is no GWIMBIT
number for the indirect object. The direct object LOC byte \$64 indicates
the direct object should be CARRIED or ON-GROUND. Also, multiple objects
can be in the direct object clause. In the second example, the OUT
preposition is needed before the direct object. In the third example,
the direct object LOC byte \$34 indicates the direct object needs
to be ON-GROUND or IN-ROOM, and again allows multiple objects. The last value (\$39 or \$31) is the action
number which indicates what specific routine to execute for that command.
Since the first and third examples are very similar, the same routine
will be used for both types of commands.
\subsection{Update: Compact Syntaxes to Save Space}
A variable-sized syntax entry format was used only with Sherlock to
help save space. There are 3 sizes for the syntax entries based upon
the number of noun clauses. The format is described below. The preposition
number is stored in the lower 6 bits (after subtracting \$C0/192 from
the preposition number) of bytes 0 and 3.\medskip{}
\begin{tabular}{|>{\raggedright}p{0.1\columnwidth}|>{\raggedright}p{0.2\columnwidth}|>{\raggedright}p{0.09\columnwidth}|>{\raggedright}p{0.09\columnwidth}|>{\raggedright}p{0.09\columnwidth}|>{\raggedright}p{0.09\columnwidth}|>{\raggedright}p{0.09\columnwidth}|>{\raggedright}p{0.09\columnwidth}|}
\hline
& Byte 0 & Byte 1 & Byte 2 & Byte 3 & Byte 4 & Byte 5 & Byte 6\tabularnewline
\hline
No objects & \# of objects (high 2 bits) / Prep ID (low 6 bits) & Action Number & - - - & - - - & - - - & - - - & - - -\tabularnewline
\hline
Only direct object & \# of objects (high 2 bits) / Prep ID (low 6 bits) & GWIMBIT byte & LOC byte & Action Number & - - - & - - - & - - -\tabularnewline
\hline
Direct and indirect objects & \# of objects (high 2 bits) / Prep ID (low 6 bits) & GWIMBIT byte & LOC byte & Prep ID (low 6 bits) & GWIMBIT byte & LOC byte & Action Number\tabularnewline
\hline
\end{tabular}
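\medskip{}
A decoder for this compact format might look like the sketch below.
It follows the table above and makes two assumptions that the table
does not state: that entries are packed back to back with sizes of
2, 4, and 7 bytes, and that a value of zero in the low 6 bits means
no preposition. The function names and the sample bytes (a re-encoding
of the earlier 8-byte examples, not data from Sherlock) are hypothetical.
\begin{verbatim}
PREP_BASE = 0xC0  # subtracted from the preposition number when stored

ENTRY_SIZES = {0: 2, 1: 4, 2: 7}  # bytes per entry, by noun clause count


def restore_prep(byte):
    """Rebuild a preposition number from the low 6 bits (0 = none)."""
    low = byte & 0x3F
    return low + PREP_BASE if low else 0


def decode_compact(data, pos=0):
    """Decode one entry; return (fields, position of the next entry)."""
    count = data[pos] >> 6
    if count not in ENTRY_SIZES:
        raise ValueError("unexpected object count")
    raw = data[pos:pos + ENTRY_SIZES[count]]
    if len(raw) != ENTRY_SIZES[count]:
        raise ValueError("truncated entry")
    fields = {"objects": count,
              "prep1": restore_prep(raw[0]),
              "action": raw[-1]}
    if count >= 1:
        fields["gwim1"], fields["loc1"] = raw[1], raw[2]
    if count == 2:
        fields["prep2"] = restore_prep(raw[3])
        fields["gwim2"], fields["loc2"] = raw[4], raw[5]
    return fields, pos + len(raw)


# Hypothetical re-encoding of "carry OBJ from OBJ" and "carry out OBJ".
SAMPLE = bytes([0x80, 0x11, 0x64, 0x30, 0x00, 0x00, 0x39,
                0x79, 0x00, 0x00, 0x31])

first, nxt = decode_compact(SAMPLE)
second, end = decode_compact(SAMPLE, nxt)
print(first)
print(second)
\end{verbatim}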
\section{Execution of Infocom Games - An Overview}
\subsection{Introduction}
For all of their perceived complexity, Infocom games have a fairly
straightforward and consistent method of getting player commands,
parsing them, and executing them quickly. Each game starts by
setting its initial global variables and interrupts and then enters
the main loop. This loop starts with PARSER getting a command and
extracting all of the needed grammatical information. The game then uses PERFORM to call
an action on the various objects in that command, and CLOCKER checks
and executes any interrupts if necessary. The program repeats
these steps until the game stops. Each game used the same basic
code from initialization of the game through handling interrupts.
Reusing existing code helped minimize new bugs and kept
the play of the games consistent, although every game had some
new code which could introduce bugs. The description of these main
backbone routines will be from the original \textbf{Zork 1} game from
1979. Information about updates to these routines will then follow
each section.
\subsection{Initialization with GO}
Initialization of each game begins with the GO routine which performs
(at least) four required tasks:
\begin{itemize}
\item Set important game variables like object number for WINNER or starting
location
\item Set any interrupt routines
\item Display the version information of the game
\item Execute the LOOK command for the starting location
\end{itemize}
The game then moves to the MAIN-LOOP.
\subsection{Heart Beat of the Game with MAIN-LOOP}
The game will repeatedly get new commands and execute them in an indefinite
loop, the MAIN-LOOP. The commands are obtained and parsed using PARSER
which returns the action, direct objects, and indirect objects referenced
by the given command. This loop will then call PERFORM to execute
the given action on all the direct and indirect objects. The turn
of the loop will end by calling CLOCKER which executes any necessary
interrupts. MAIN-LOOP will then repeat all of these steps indefinitely.
\subsection{Understanding the User with PARSER}
PARSER takes the player\textquoteright s input or any remaining input
from the last command and extracts the parts of the command: verb, direct
object, and indirect object. The part of speech of each token is identified
and later used by CLAUSE to find the start and end of object clauses.
\subsection{Completing Previous Command with ORPHAN-MERGE}
If a previous command was orphaned, the given command is examined
by ORPHAN-MERGE to see if it supplies the missing information from
the orphaned command. If so, then ORPHAN-MERGE combines the new information
with the previous command. PARSER continues to process this fixed
command just like any other command.
\subsection{Match the Command with SYNTAX-CHECK}
The use of syntax templates allowed Infocom games to understand multiple
commands using similar tokens but in different orders. After PARSER
identifies the verb, direct object, and indirect object, SYNTAX-CHECK
will try to match a syntax template to the given verb, prepositions,
and objects. The action with the matched syntax template is returned.
If the best matched syntax template still has missing objects, the
game will attempt to supply them with objects in the current location.
\subsection{Find the Objects with SNARF-OBJECT}
After the matching syntax template and objects are verified, the game
will process each object clause and match the objects with objects
from the game. It will also handle modifiers like EXCEPT and quantities
like ALL.
\subsection{Final check on Objects with TAKE-CHECK and MANY-CHECK}
The game will also confirm if the objects need to be taken first (TAKE-CHECK)
and if multiple objects are allowed for the action (MANY-CHECK). Once
all the checks are completed, the game creates a table of all the
objects that are referred to in each object clause.
\subsection{Call the Actions with PERFORM}
Finally, the game will execute the requested action (PRSA) on the
given direct (PRSO) and indirect (PRSI) objects with PERFORM. If one
noun clause has one object but the other has multiple, the game will
cycle through the clause with the multiple objects and use the same
object for the other clause. However, if both clauses have multiple
objects, only the direct objects will be cycled. Only the first indirect
object will be used.
\subsection{Tick the CLOCKER}
After the player command(s) are performed, the game\textquoteright s
interrupt system \textquotedblleft ticks\textquotedblright . Any interrupt
whose tick count reaches 1 will have its associated routine called
and then become disabled.
\subsection{Graphic representation of Game Flow}
Routine layout for \textbf{Zork 1}. Dotted lines indicate the order
of function calls by MAIN-LOOP.
\begin{figure}[H]
\includegraphics[width=1\columnwidth]{figures/RoutineLayoutForZork1}
\caption{Routine Layout for \textbf{Zork 1}}
\end{figure}
\section{GO - Gentlemen, Start Your Engines!}
\begin{description}
\item [{Arguments:}] None
\item [{Return:}] None
\end{description}
\subsection{Introduction}
Every Infocom game begins with an initialization routine, GO, that
sets certain variables and executes specific routines. It will then
jump into the MAIN-LOOP which will indefinitely ask for commands from
the user and process them.
\subsection{Running GO}
All Infocom games share a similar initialization routine.
\textquotedblleft Learning ZIL\textquotedblright{} mentions
that the GO routine should:
\begin{itemize}
\item Set special global variables
\item Set interrupts, usually with the QUEUE or INT routines
\item Display an opening text/title screen
\item Call V-VERSION to show copyright information, release number, and
serial number
\item Call V-LOOK to describe the current location
\item Call the MAIN-LOOP
\end{itemize}
The important global variables that are set include WINNER (object
number for current active actor which is usually the PLAYER object),
HERE (current location of the WINNER), and LIT (indicates if the current
location is lit). All version 4 (except AMFV) and 5 games will also
check the width of the screen. Some games will not execute if the
screen width is too small. Infocom documentation recommends all games
start with an opening title screen and display game information before
showing the current location\textquoteright s description.
\section{MAIN-LOOP - Heart of the Game}
\begin{description}
\item [{Arguments:}] None
\item [{Return:}] None
\end{description}
\subsection{Introduction}
MAIN-LOOP is the heart of all Infocom games and keeps the game structure
orderly. It repeatedly requests parsed commands and loops indefinitely.
MAIN-LOOP changed little in newer games. Many of
the changes made programming game-specific details and restrictions
easier. These changes essentially provided more checks
on the player input and better responses. Only the significant
changes to MAIN-LOOP are described later.
\subsection{The Details}
MAIN-LOOP will call PARSER to ask for and process a user\textquoteright s
command. If PARSER cannot properly parse the command, MAIN-LOOP will
continue to call PARSER to process new commands. If it has successfully
processed a command, PARSER will set PRSA (parser action) with the
requested action number (the 8th byte in a syntax entry) and fill
the PRSO (parser direct object) and PRSI (parser indirect object)
tables (P-PRSO and P-PRSI) with all the direct and indirect objects
requested. This is different to what is described in \textquotedblleft Learning
ZIL\textquotedblright . MAIN-LOOP then loops and acts upon all the
objects:
\begin{enumerate}
\item Check the number of objects in the direct and indirect clauses
\item If the direct objects clause has no objects, then see if the action
is GO.
\begin{enumerate}
\item If so, then call PERFORM with GO and the direction in PRSO.
\item If no objects are needed for the requested action, then call PERFORM
on PRSA with no objects
\item If at least 1 object clause is needed for the requested action, then
print an error message. Display a specific error message if the command
is an invalid response to an orphaned command.
\end{enumerate}
\item One clause is designated the multiple clause while the other supplies
a constant object, the first one in that clause.
\item Call the requested action with PERFORM once for each object
in the multiple object clause, while the other clause just has its
first object used.
\begin{enumerate}
\item If M-END is returned, then halt the processing of multiple objects.
Erase any remaining commands.
\item If M-END is not returned, then continue looping through the multiple
objects
\end{enumerate}
\item Increase number of turns by 1 (even if multiple objects are processed).
\item Call CLOCKER to check interrupts even if the given command was not
valid. This was later changed in Deadline and subsequent games to
call CLOCKER only if PARSER was successful.
\end{enumerate}
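Steps 4 to 6 above can be sketched in Python as a single pass of the
loop. This is a simplified model, not a translation of the ZIL source:
the function \texttt{run\_turn}, the \texttt{state} dictionary, and the
callbacks are hypothetical, the handling of commands without objects
(step 2) is left out, and counting a turn only after a successful parse
is an assumption.
\begin{verbatim}
M_END = "M-END"  # stand-in for the value a routine returns to stop


def run_turn(parsed, perform, clocker, state):
    """Run one pass of the loop.

    parsed is None after a failed parse, otherwise (action, pairs)
    where pairs lists the (direct, indirect) objects to act on.
    """
    if parsed is not None:
        action, pairs = parsed
        for prso, prsi in pairs:
            if perform(action, prso, prsi) == M_END:
                state["pending_input"] = ""  # erase remaining commands
                break
        state["moves"] += 1  # one turn, however many objects were handled
    clocker()  # original behaviour: runs even after a failed parse


state = {"moves": 0, "pending_input": "THEN GO NORTH"}
log = []


def perform(action, prso, prsi):
    log.append((action, prso, prsi))
    return M_END if prso == "PAPER" else None


pairs = [("CANDLE", "TORCH"), ("PAPER", "TORCH"), ("ROPE", "TORCH")]
run_turn(("IGNITE", pairs), perform, lambda: log.append("CLOCKER"), state)
print(log)
print(state)
\end{verbatim}
In this run the hypothetical action routine returns M-END for PAPER,
so ROPE is never processed, the remaining input is discarded, and
CLOCKER still runs once at the end of the turn.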
\subsection{Details of Multiples of Multiples}
The MAIN-LOOP handles commands with multiple objects for a given action.
It will loop through these objects and execute the same action for
each object. However, there is some confusion as to how it determines
preference if two sets of objects are given. The examples below show
how MAIN-LOOP iterates through multiple objects.\medskip{}
\begin{tabular}{|>{\raggedright}p{0.3\textwidth}|>{\raggedright}p{0.3\textwidth}|>{\raggedright}p{0.3\textwidth}|}
\hline
{\footnotesize{}Multiple Direct Objects} & {\footnotesize{}Multiple Indirect Objects} & {\footnotesize{}Multiple Direct and Indirect Objects}\tabularnewline
\hline
\hline
{\footnotesize{}IGNITE CANDLE AND PAPER WITH TORCH}{\footnotesize\par}
{\footnotesize{}\qquad{}IGNITE CANDLE WITH TORCH}{\footnotesize\par}
{\footnotesize{}\qquad{}IGNITE PAPER WITH TORCH} & {\footnotesize{}CUT TREE WITH AXE AND SWORD}{\footnotesize\par}
{\footnotesize{}\qquad{}CUT TREE WITH AXE}{\footnotesize\par}
{\footnotesize{}\qquad{}CUT TREE WITH SWORD} & {\footnotesize{}IGNITE CANDLE AND PAPER WITH TORCH AND FIRE}{\footnotesize\par}
{\footnotesize{}\qquad{}IGNITE CANDLE WITH TORCH}{\footnotesize\par}
{\footnotesize{}\qquad{}IGNITE PAPER WITH TORCH}\tabularnewline
\hline
\end{tabular}\medskip{}
So any additional indirect objects are ignored when there are both
multiple direct and indirect objects. MAIN-LOOP will always iterate
through the direct object clause if it has the same or more objects
than the indirect clause. The indirect object remains constant (the
first one in the clause) for all iterations. The only exception is
for only 1 direct object and multiple indirect objects. MAIN-LOOP
will then iterate through the indirect objects while keeping the direct
object constant.
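The pairing rule can be written as a short Python generator. This is
a sketch of the behaviour described above, with the hypothetical name
\texttt{object\_pairs}; it reads the rule as cycling the indirect
objects only when there is exactly one direct object.
\begin{verbatim}
def object_pairs(directs, indirects):
    """Yield (direct, indirect) pairs in the order PERFORM is called."""
    if len(directs) == 1 and len(indirects) > 1:
        # Sole exception: cycle the indirect objects, direct stays fixed.
        for prsi in indirects:
            yield directs[0], prsi
    else:
        # Otherwise cycle the direct objects with the first indirect.
        first_indirect = indirects[0] if indirects else None
        for prso in directs:
            yield prso, first_indirect


print(list(object_pairs(["CANDLE", "PAPER"], ["TORCH"])))
print(list(object_pairs(["TREE"], ["AXE", "SWORD"])))
print(list(object_pairs(["CANDLE", "PAPER"], ["TORCH", "FIRE"])))
\end{verbatim}
The three calls correspond to the three columns of the table; in the
last one FIRE never appears in a pair.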
\subsection{Update: Managing Global Variables}
Only minor improvements were made with handling the PRSA, PRSO, and
PRSI variables. Updating the L- versions of these variables which
are used by the AGAIN command was moved into the MAIN-LOOP section
starting with \textbf{Zork 2}. Later games would skip updating
these variables if certain commands were used. \textbf{Zork 3} added
the option of checking the LIT variable with commands that require
no objects. If it was clear, then an \textquotedblleft It\textquoteright s
too dark to see\textquotedblright{} error would be given. \textbf{Zork
2} (R28) also moved the updating of the IT-OBJECT and its location
variable into the MAIN-LOOP instead of PERFORM. \textbf{LGOP} and
\textbf{Plundered Hearts} also added a specific check on the visibility
of the IT-OBJECT in the MAIN-LOOP.
\subsection{Update: How many NOT-HERE-OBJECTs?}
To generate better responses when some objects are missing
in a command, MAIN-LOOP (since \textbf{Infidel}) started to count
how many requested objects in a multiple object command were not present.
This number was kept in the global P-NOT-HERE variable and used to
provide a more specific error message for missing objects. For example,
if P-NOT-HERE was greater than 1, the error message would
use \textquotedblleft objects\textquotedblright{} instead of \textquotedblleft object\textquotedblright .
One final coding relic is the P-MULT flag. It is cleared and set in
the MAIN-LOOP, but has been used only in \textbf{Infidel}\textquoteright s
NOT-HERE-OBJECT ACTION routine. All subsequent ZIP 3 games since \textbf{Infidel}
still set this flag, but it is never used. It is also present in various
developmental versions but not used. Its true function remains unclear.
\subsection{Update: Checking For Invalid Exceptions}
Starting with \textbf{Deadline}, MAIN-LOOP would check for specific
invalid situations in which an action should not be done on a specific
object. \textbf{Deadline} ensured that none of the objects referred to
in a command was the WINNER. \textbf{Planetfall} had its own special
check on objects used with \textquotedblleft PICK UP\textquotedblright{}
by making sure the PRSO was on/in PRSI. If not, it would skip over
that PRSO.
\textbf{Wishbringer} (R69) would be the first game to check for these