Using
Options
- Code Options
- Dumping Debugging Information
Pragmas
Extensions
Differences
CrossCompilation
Internals
- The Run Time System
- Ajhc Core Type System
  - On boxed kinds
  - On sort box, the unboxed tuple, and friends

Using

Building Projects

Ajhc does its own dependency chasing to track down source files, you need only provide it with the file containing your 'main' function on the command line. For instance, if you had a program 'HelloWorld.hs', the following would compile it to an executable named 'hello'.

; ajhc HelloWorld.hs -o hello

Ajhc searches for modules in its search path, which defaults to the current directory. Modules are searched for based on their names. For instance, the module Data.Foo will be searched for in 'Data/Foo.hs' and 'Data.Foo.hs'. The search path may be modifed with the '-i' command line option, or by setting the 'AJHC_PATH' environment variable.

Using Libraries

Ajhc libraries are distributed as files with an 'hl' suffix, such as 'base-1.0.hl'. In order to use a haskell library you simply need to place the file in a directory that Ajhc will search for it. For instance, $HOME/lib/ajhc. You may set the environment variable AJHC_LIBRARY_PATH to specify alternate locations to search for libraries or specify directory to search with the -L command line option. -L- will clear the search path.

You can then use libraries with the '-p' command line option, for instance if you had a library 'mylibrary-1.0.hl' in your search path, the following would use it.

; ajhc -p mylibrary MyProgram.hs -o myprogram

You can list all available libraries by passing the --list-libraries option to Ajhc. If you include '-v' for verbose output, you will get detailed information about the libraries in a YAML format suitable for processing by external tools.

Environment Variables

Ajhc's behavior is modified by several enviornment variables.

AJHC_OPTS : this is read and appended to the command line of Ajhc invocations.

AJHC_PATH : This specifies the path to search for modules.

AJHC_LIBRARY_PATH : This specifies the path to search for libraries.

AJHC_CACHE : This specified the directory Ajhc will use to cache values. having a valid cache is essential for Ajhc performance. It defaults to ~/.ajhc/cache.

Building Haskell Libraries

Libraries are built by passing Ajhc a file describing the library via the --build-hl option. The library file format is a stadard YAML file.

; ajhc --build-hl mylibrary.yaml

Library File Format

The library file is a YAML document, Ajhc will recognize several fields and ignore unknown ones.

Name : The name of your library

Version : The version of your library, The version number is used to differentiate different versions of the library passed to the '-p' command line option but is not otherwise special to Ajhc.

Exposed-Modules : A list of modules to be included in the library and exposed to users of the library as its public interface. This may include modules that are part of another library, they will be re-exported by the current library.

Hidden-Modules : A list of modules that the library may use internally but that should not be exposed to the user. Ajhc may optimize based on this information. If this list is not exhaustive Ajhc will still build your library, but it will print out a warning.

Extensions : A list of extensions which should be enabled during compilation of this module. When possible, Ajhc will match ghc extensions to their closest Ajhc counterparts.

Options : Extra command line options to Ajhc for this library build.

Build-Depends : libraries to include, in the same format as passed to the '-p' command line option

Hs-Source-Dirs : Directory to search for Haskell source files in, this differs from the '-i' command line option in that the directory in this field is relative to the directory the library description .yaml file is located while the '-i' option is always relative to the current working directory.

Include-Dirs : directories to be included in the preprocessor search path as if via '-I'. The directories are interpreted relative to the directory that contains the yaml file.

C-Sources : C files that should be linked into programs that utilize this library.

Include-Sources : files that should be made available for inclusion when compiling the generated C code but don't need to be linked into the executable.

example library files can be seen in lib/jhc/jhc.yaml and lib/base/base.yaml

Dependency Information

Ajhc can output dependency information describing how source files and libraries depend on each other while compiling code. The dependency information is generated when the --deps name.yaml option is passed to Ajhc. It is presented in the standard YAML format and its fields are as described below.

LibraryDeps: the libraries that are dependend on. It is a hash of library ids to the specific filename of the library used.
LibraryDesc: if building a library, this field contains the name of the library description file used.
ModuleDeps: a hash of module names to the list of modules that are directly dependend on by said module.
ModuleSouce: a hash of module name to the haskell source file used.

An example tool to processs the deps.yaml file and spit out appropriate Makefile rules is included as 'utils/deps_to_make.prl'.

Options

Usage: ajhc [OPTION...] Main.hs
  -V                --version                 print version info and exit
                    --version-context         print version context info and exit
                    --help                    print help information and exit
                    --info                    show compiler configuration information and exit
                    --purge-cache             clean out Ajhc compilation cache
  -v                --verbose                 chatty output on stderr
  -z                                          Increase verbosity of statistics
  -d [no-]flag                                dump specified data during compilation
  -f [no-]flag                                set or clear compilation options
  -X ExtensionName                            enable the given language extension
  -o FILE           --output=FILE             output to FILE
  -i DIR            --include=DIR             where to look for source files
  -I DIR                                      add to preprocessor include path
  -D NAME=VALUE                               add new definitions to set in preprocessor
                    --optc=option             extra options to pass to c compiler
  -c                                          just compile the modules, caching the results.
  -C                                          compile to C code
  -E                                          preprocess the input and print result to stdout
  -k                --keepgoing               keep going on errors
                    --cross                   enable cross-compilation, choose target with the -m flag
                    --stop=parse/typecheck/c  stop after the given pass, parse/typecheck/c
                    --width=COLUMNS           width of screen for debugging output
                    --main=Main.main          main entry point
  -m arch           --arch=arch               target architecture options
                    --entry=<expr>            main entry point, showable expression
                    --show-ho=file.ho         Show ho file
                    --noauto                  Don't automatically load base and haskell98 packages
  -p package                                  Load given haskell library package
  -L path                                     Look for haskell libraries in the given directory
                    --build-hl=desc.yaml      Build hakell library from given library description file
                    --annotate-source=<dir>   Write preprocessed and annotated source code to the directory specified
                    --deps=<file.yaml>        Write dependency information to file specified
                    --interactive             run interactivly                                                             ( for debugging only)
                    --ignore-cache            Ignore existing compilation cache entries.
                    --readonly-cache          Do not write new information to the compilation cache.
                    --no-cache                Do not use or update the cache.
                    --cache-dir=AJHC_CACHE    Use a global cache located in the directory passed as an argument.
                    --stale=Module            Treat these modules as stale, even if they exist in the cache
                    --list-libraries          List of installed libraries
                    --tdir=dir/               specify the directory where all intermediate files/dumps will be placed.
                    --targetsini=targets.ini  specify the targets.ini file.

valid -d arguments: 'help' for more info
    all-types, aspats, atom, bindgroups, boxy-steps, c, class, class-summary, core, core-afterlift
    core-beforelift, core-initial, core-mangled, core-mini, core-pass, core-steps, datatable
    datatable-builtin, dcons, decls, defs, derived, e-alias, e-info, e-size, e-verbose, exports, grin
    grin-datalog, grin-final, grin-graph, grin-initial, grin-normalized, grin-posteval, grin-preeval
    imports, ini, instance, kind, kind-steps, optimization-stats, parsed, preprocessed, program
    progress, renamed, rules, rules-spec, scc-modules, sigenv, srcsigs, stats, steps, tags, the
    types, verbose, veryverbose

valid -f arguments: 'help' for more info
    bang-patterns, boehm, controlled, cpp, customthread, debug, default, defaulting, exists, ffi
    forall, full-int, glasgow-exts, global-optimize, inline-pragmas, jgc, lint, m4
    monomorphism-restriction, negate, nothread, prelude, profile, pthread, raw, rules, standalone
    type-analysis, type-families, unboxed-tuples, unboxed-values, user-kinds, wrapper

Code Options

Various options affecting how Ajhc interprets and compiles code can be controlled with the '-f' flag, the following options are availible, you can negate any particular one by prepending 'no-' to it.

Code options
bang-patterns	- bang patterns
cpp	pass haskell source through c preprocessor
exists	- exists keyword for existential types recognized
ffi	support foreign function declarations
forall	- forall keyword for rank-n types and explicit quantification
m4	pass haskell source through m4 preprocessor
prelude	implicitly import Prelude
type-families	type/data family support
unboxed-tuples	allow unboxed tuple syntax to be recognized
unboxed-values	allow unboxed value syntax
user-kinds	user defined kinds

Typechecking
defaulting	perform defaulting of ambiguous types
monomorphism-restriction	enforce monomorphism restriction

Debugging
lint	perform lots of extra type checks

Optimization Options
global-optimize	perform whole program E optimization
inline-pragmas	use inline pragmas
rules	use rules
type-analysis	perform a basic points-to analysis on types right after method generation

Code Generation
boehm	use Boehm garbage collector
customthread	use your own thread mechanism
debug	enable debugging code in generated executable
full-int	extend Int and Word to 32 bits on a 32 bit machine (rather than 30)
jgc	use the jgc garbage collector
nothread	do not support thread
profile	enable profiling code in generated executable
pthread	use pthread for concurrency
raw	just evaluate main to WHNF and nothing else.
standalone	compile to a standalone executable
wrapper	wrap main in exception handler

Default settings
default	inline-pragmas rules wrapper defaulting type-analysis monomorphism-restriction global-optimize full-int prelude
glasgow-exts	forall ffi unboxed-tuples

Dumping Debugging Information

You can have Ajhc print out a variety of things while running as Controlled by the '-d' flag. The following is a list of possible parameters you can pass to '-d'.

Front End
defs	Show all defined names in a module
derived	show generated derived instances
exports	show which names are exported from each module
imports	show in scope names for each module
ini	all ini configuration options
parsed	parsed code
preprocessed	code after preprocessing/deliting
renamed	code after uniqueness renaming
scc-modules	show strongly connected modules in dependency order

Type Checker
all-types	show unified type table, after everything has been typechecked
aspats	show as patterns
bindgroups	show bindgroups
boxy-steps	show step by step what the type inferencer is doing
class	detailed information on each class
class-summary	summary of all classes
dcons	data constructors
decls	processed declarations
instance	show instances
kind	show results of kind inference for each module
kind-steps	show steps of kind inference
program	impl expls, the whole shebang.
sigenv	initial signature environment
srcsigs	processed signatures from source code
types	display unified type table containing all defined names

Intermediate code
core	show intermediate core code
core-afterlift	show final core before writing ho file
core-beforelift	show core before lambda lifting
core-initial	show core right after E.FromHs conversion
core-mangled	de-typed core right before it is converted to grin
core-mini	show details even when optimizing individual functions
core-pass	show each iteration of code while transforming
core-steps	show what happens in each pass
datatable	show data table of constructors
datatable-builtin	show data table entries for some built in types
e-alias	show expanded aliases
e-info	show info tags on all bound variables
e-size	print the size of E after each pass
e-verbose	print very verbose version of E code always
optimization-stats	show combined stats of optimization passes
rules	show all user rules and catalysts
rules-spec	show specialization rules

Grin code
grin	dump all grin to the screen
grin-datalog	print out grin information in a format suitable for loading into a database
grin-final	final grin before conversion to C
grin-graph	print dot file of final grin code to outputname_grin.dot
grin-initial	grin right after conversion from core
grin-normalized	grin right after first normalization
grin-posteval	show grin code just before eval/apply inlining
grin-preeval	show grin code just before eval/apply inlining
steps	show interpreter go
tags	list of all tags and their types

Backend code
c	don't delete C source file after compilation

Internal
atom	dump atom table on exit

General
progress	show basic progress indicators
stats	show extra information about stuff
verbose	progress
veryverbose	progress stats

Pragmas

Pragmas are special compiler directives that change its behavior in certain ways. In general, each compiler is free to define its own pragmas however Ajhc does try to implement the same ones as other compilers when it makes sense. pragmas appear in source code as {-# PRAGMANAME ... #-}

Function Properties

These must appear in the same file as the definition of a function. To apply one to a instance or class method, you must place it in the where clause of the instance or class declaration.

Pragma
NOINLINE	Do not inline the given function during core transformations. The function may be inlined during grin transformations.
INLINE	Inline this function whenever possible
SUPERINLINE	Always inline no matter what, even if it means making a local copy of the functions body.
VCONSTRUCTOR	Treat the function as a virtual constructor. CPR analysis and the worker/wrapper transforms will treat the function application as if it were a constructor. This implies 'NOINLINE'.

Class Pragmas

Pragma
NOETA	By default, Ajhc eta-expands all class methods to help enable optimizations. This disables this behavior.

Rules/Specializations

Pragma
RULES	rewrite rules. These have the same syntax and behave similarly to GHC's rewrite rules, except 'phase' information is not allowed.
CATALYST	A special type of rewrite rule that only fires if it enables the use of another RULE, so a CATALYST may allow optimizations that require passing through a non-optimal intermediate stage.
SPECIALIZE	create a version of a function that is specialized for a given type
SUPERSPECIALIZE	has the same effect as SPECIALIZE, but also places a run-time check in the generic version of the function to determine whether to call the specialized version.

Type Pragmas

Pragma
CTYPE	Specify the external type that a data or newtype should use for foreign function interfaces.
	The type must be a newtype or unary data constructor of a type that is already foreignable.
	Example
	_~
	data {-# CTYPE "unsigned short" #-} CUShort = CUShort Word16
	_~

Header Pragmas

These pragmas are only valid in the 'head' of a file, meaning they must come before the initial 'module' definition and in the first 4096 bytes of the file and must be preceded by and contain only characters in the ASCII character set.

OPTIONS_JHC : Specify extra options to use when processing this file. The options available are equivalent to the command line options, though, not all may have meaning when applied to a single file.

LANGUAGE : Specify various language options

Extensions

Module Search Path

Modules in Ajhc are searched for based on their name as in other Haskell compilers. However in addition to searching for 'Data/Foo.hs' for the module 'Data.Foo', Ajhc will also search for 'Data.Foo.hs'.

extensions to the FFI

foreign imports with multiple return values.

foreign C imports may return multiple values. To indicate this is the case, use an unboxed tuple as the return value. The first return value will be the value the function directly returns, the rest will be passed as pointers at the end of the functions argument list. Only pure (non IO) functions may return multiple values.

-- frexp has C prototype
-- double frexp(double x, int *exp);
-- so it would normally have an import like so, requiring the IO module and
-- Storable to call what is otherwise a pure function.

foreign import ccall "math.h frexp"  c_frexp :: Double -> Ptr CInt -> IO Double

-- This extension allows it to be declared as so
foreign import ccall "math.h frexp"  c_frexp2 :: Double -> (# Double, CInt #)

-- The second return value is added as the last 'exp' parameter then read out
-- of the allocated memory. The contents of the memory passed into the function
-- is undefined.

'capi' calling convention

The 'capi' calling convention may be used instead of 'ccall' for static imports. The convention means that the foreign function may not be addressable as an address, but rather may be implemnted as a macro, builtin, or other compiler specific feature. Ajhc will ensure that the routine is never used as a pointer and the headers specified in the dependency string are included anywhere the imported function appears. This differs from 'ccall' in that ccall makes no guarentees the given header file will be in scope and that a linker symbol of the exact name is exported.

package relative dependencies

dependecies in a foreign import may be written as p:foo.c or p:foo.h, this means that the file should be interpreted as part of the internal implementation of the package. Ajhc willl ensure the files do not clash with those of other packages that may have the same name. The files should be listed in the c-files and c-headers sections of the library config file.

Explicit namespaces in import/export lists

Ajhc allows explicit namespaces in import and export lists. These may be used to restrict or modiy what is imported/exported by a declaration.

* 'type' - The name is a type, as in something defined by 'type', 'newtype',
   or 'data', or the constructors of a kind declaration.
* 'class' - Specifies that the name is that of a class.
* 'data'  - Specifies that the name is a data constructor.
* 'kind'  - specifies that the name is a user defined kind.

In addition, another extension is that classes and types are in independent namespaces, so a type and a class of the same name may be in scope and not conflict.

user defined kinds

Ajhc allows users to define custom kinds when the -fuser-kinds extension is enabled. The syntax is:

data kind Nat = Z | S Nat

This will define a new kind 'Nat' and two types 'Z' and 'S' which inhabit it. The types introduced by kind declarations represent unboxed values with no values, So they have no run-time representation (notably, not even ⊥)

Standalone deriving

Ajhc supports a standalone deriving mechanism under certain circumstances.

Rank-N Polymorphism

Ajhc supports higher ranked polymorphism. Ajhc will never infer types of higher rank, however when the context unambiguously specifies a higher ranked type, it will be used. For instance, user supplied type annotations and arguments to data constructors defined to by polymorphic will get the correct polymorphic type.

Existential types

Ajhc supports first class existential types, using the 'exists' keyword. It also supports existential data types in a similar fashion to ghc.

Unboxed Values

Unboxed values in Ajhc are specified in a similar fashion to GHC however the lexical syntax is not changed to allow # in identifiers. # is still used in the syntax for various unboxed constructs, but normal Haskell rules apply to haskell identifiers. The convention is to suffix such types with '_' to indicate their status as unboxed. All unboxed values other than unboxed tuples are enabled by the -funboxed-value flag. For compatibility with GHC, the MagicHash extension name also turns on unboxed-values.

Unboxed Tuples

Ajhc supports unboxed tuples with the same syntax as GHC, (# 2, 4 #) is an unboxed tuple of two numbers. Unboxed tuples are enabled with -funboxed-tuples. Unboxed tuples are kind-polymorphic, able to hold both boxed and unboxed values. (but not another unboxed tuple)

Unboxed Strings

Unboxed strings are enabled with the -funboxed-values flag. They are specified like a normal string but have a '#' at the end. Unboxed strings have types 'BitsPtr_'.

Unboxed Characters

Unboxed characters can be expressed by putting a hash after a normal character literal. Unboxed characters are of type Char_ which is a newtype of Bits32_ and defined in Jhc.Prim.Bits

Unboxed Numbers

Unboxed numbers are enabled with the -funboxed-values flag. They are postpended with a '#' such as in 3# or 4#. Ajhc supports a limited form of type inference for unboxed numbers, if the type is fully specified by the environment and it is a suitable unboxed numeric type then that type is used. Otherwise it defaults to Int__. Whether the type is fully specifed follows the same rules as rank-n types. Unboxed numbers do the right thing for enumerations, so 0# can be used for the unboxed False value and the appropriate type will be infered.

Operations on unboxed values

To operate on unboxed vaules you need to bring the appropriate primitive operators into scope. You can do this via the special form of FFI declaration for importing primitives. Any C-- primitive may be imported as well as a variety of utility routines. the primitive import mechanism is 'smart' in that it will dig through newtypes and take care of boxing/unboxing values as needed. So you can import a primitive on Char and it will take care of boxing the value up in the 'Char' constructor as well as the Char_ newtype for Bits32_, ultimately choosing the right Bits32_ primitive. imported primitives are normal haskell declarations so may be exported/imported from modules or passed as higher order functions like normal.

Foreign Primitives

In addition to foreign imports of external functions as described in the FFI spec. Jhc supports 'primitive' imports that let you communicate primitives directly to the compiler. In general, these should not be used other than in the implementation of the standard libraries. They generally do little error checking as it is assumed you know what you are doing if you use them. All haskell visible entities are introduced via foreign declarations in jhc.

They all have the form

foreign import primitive "specification" haskell_name :: type

where "specification" is one of the following

seq : evaluate first argument to WHNF, then return the second argument

zero,one : the values zero and one of any primitive type.

const.C_CONSTANT : the text following const is directly inserted into the resulting C file

peek.TYPE : the peek primitive for raw value TYPE

poke.TYPE : the poke primitive for raw value TYPE

sizeOf.TYPE, alignmentOf.TYPE, minBound.TYPE, maxBound.TYPE, umaxBound.TYPE : various properties of a given internal type.

error.MESSAGE : results in an error with constant message MESSAGE.

constPeekByte : peek of a constant value specialized to bytes, used internally by Jhc.String

box : take an unboxed value and box it, the shape of the box is determined by the type at which this is imported

unbox : take an boxed value and unbox it, the shape of the box is determined by the type at which this is imported

increment, decrement : increment or decrement a numerical integral primitive value

fincrement, fdecrement : increment or decrement a numerical floating point primitive value

exitFailure__ : abort the program immediately

C-- Primitive : any C-- primitive may be imported in this manner.

Differences

Differences from Haskell 98

Language Differences

Class contexts on data types are silently ignored.
Class methods are fully 'eta expanded' out to the argument count specified by the type. This is often beneficial as instances that need to share partial applications are rare. This behavior can be turned off with the NOETA pragma for specific methods.

Library Changes

In addition to a larger set of base libraries roughly modeled on GHC's base. Jhc provides a number of extensions/minor modifications to the standard libraries. These are designed to be mostly backwards compatible and most are to the class system.

Data.Bits
- Num is no longer a super class of Data.Bits. It never should have been.
- There are new methods logicalShiftR and arithmeticShiftR that do a logical and arithmetic shift respectively. shiftR will always map to one of those as appropriate.
- shiftR and shiftL do not check for negative arguments, if you might want negative arguments, use the general 'shift' routine. 'shift' also comes in logical and arithmetic varieties.

Library Additions

There are many other additional libraries provided with jhc, here I list only changes that affect modules that are defined by the haskell 98 or FFI specifications.

Data.Int and Data.Word provide WordPtr, WordMax, IntPtr, and IntMax that correspond to the C types uintptr_t, uintmax_t, intptr_t, and intmax_t respectively.
fromInt,toInt,fromDouble,toDouble have been added alongside Integer and Rational routines in their respective classes.
floating point truncation and rounding functions have varieties that don't return an integral type, but rather return something of the same type as its argument. These have the same name but end in 'f'.

Notable Differences from GHC

Jhc differs from GHC in certain ways that are allowed by Haskell 98, but might come as a surprise to some.

An Int may be only 30 bits and may not observe simple binary truncation on overflow. If you need known bit width and binary semantics for your numbers then use the types in Data.Int and Data.Word. Overflow on Int or Word has undefined results.
A Char may only preserve values within the Unicode range. Storing values greater than 0x10FFFF has undefined results.
The Int and Word types are at most 32 bits, even on 64 bit architectures.
All text based IO is performed according to the current locale. This means that Unicode works seamlessly, but older programs that assumed IO was performed by simple truncation of chars down to 8 bits will fail. Use the explicit binary routines if you need binary IO.

Differences That are Considered Misfeatures

These misfeatures will be fixed at some point.

Integer corresponds to IntMax rather than an arbitrary precision type. As soon as a suitable arbitrary precision library emerges, it will be replaced.
Ix is not derivable.

CrossCompilation

Basics

Unlike many other compilers, Ajhc is a native cross compiler. What this means is that every compile of Ajhc is able to create code for all possible target systems. This leads to many simplifications when it comes to cross compiling with Ajhc. Basically in order to cross compile, you need only pass the flag '--cross' to Ajhc, and pass an appropriate '-m' option to tell Ajhc what machine you are targetting. An example would be

; ajhc --cross -mwin32 test/HelloWorld.hs

The targets list is extensible at run-time via the targets.ini file explained below.

targets.ini

This file determines what targets are available. The format consists of entries as follows.

[targetname]
key1=value
key2=value
key3+=value
merge=targetname2

merge is a special key meaning to merge the contents of another target into the current one. The configuration file is read in order, and the final value set for a given key is the one that is used.

An example describing how to cross compile for windows is as follows:

[win32]
cc=i386-mingw32-gcc
cflags+=-mwindows -mno-cygwin
executable_extension=.exe
merge=i686

This sets the compiler to use as well as a few other options then jumps to the generic i686 routine. The special target [default] is always read before all other targets. If '--cross' is specified on the command line then this is the only implicitly included configuration, otherwise Ajhc will assume you are compiling for the current architecture and choose an appropriate target to include in addition to default.

Ajhc will attempt to read several targets.ini files in order. they are

$PREFIX/etc/ajhc-$VERSION/targets.ini : this is the targets.ini that is included with Ajhc and contains the default options.

$PREFIX/etc/ajhc-$VERSION/targets-local.ini : Ajhc will read this if it exists, it is used to specify custom system wide configuration options, such as the name of local compilers.

$HOME/.ajhc/targets.ini : this is where a users local configuration information goes.

$HOME/etc/ajhc/targets.ini : this is simply for people that prefer to not use hidden directories for configuration

The last value specified for an option is the one used, so a users local configuration overrides the system local version which overrides the built in options.

Options available

Option	Meaning
cc	what c compiler to use. generally this will be gcc for local builds and something like ARCH − HOST-gcc for cross compiles
byteorder	one of le or be for little or big endian
gc	what garbage collector to use. It should be one of static or boehm.
cflags	options to pass to the c compiler
cflags_debug	options to pass to the c compiler only when debugging is enabled
cflags_nodebug	options to pass to the c compiler only when debugging is disabled
profile	whether to include profiling code in the generated executable
autoload	what haskell libraries to autoload, seperated by commas.
executable_extension	specifies an extension that should be appended to executable files, (i.e. .EXE on windows)
merge	a special option that merges the contents of another configuration target into the currrent one.
bits	the number of bits a pointer contains on this architecture
bits_max	the number of bits in the largest integral type. should be the number of bits in the 'intmax_t' C type.
arch	what to pass to gcc as the architecture

Special defines to set cflags

Define	Meaning
_JHC_ARM_STAY_IN_THUMB_MODE	set bit0 to any function pointers, for Cortex-M*. (more detail)
_JHC_JGC_NAIVEGC	run gc when have no more blocks.
_JHC_JGC_SAVING_MALLOC_HEAP	get smaller malloc heap.
_JHC_JGC_LIMITED_NUM_GC_STACK	number of limited gc_stack entries.
_JHC_JGC_GC_STACK_SHIFT	bit shift to specify gc_stack size. Use it internally like this: (1 << (_JHC_JGC_GC_STACK_SHIFT)).
_JHC_JGC_STACKGROW	number of stack entry growed when run short of it.
_JHC_JGC_LIMITED_NUM_MEGABLOCK	number of limited megablock entries.
_JHC_JGC_BLOCK_SHIFT	bit shift to specify block size. Use it internally like this: (1 << (_JHC_JGC_BLOCK_SHIFT)).
_JHC_JGC_MEGABLOCK_SHIFT	bit shift to specify megablock size. Use it internally like this: (1 << (_JHC_JGC_MEGABLOCK_SHIFT)).
_JHC_USE_OWN_STDIO	force you should implement your own printf,fputs,fflush,getchar,getc,putchar,putc.

Internals

The Run Time System

Ajhc is very minimalist in that it does not have a precompiled run time system, but rather generates what is needed as part of the compilation process. However, back ends do have specific run-time representations of data, which can be affected by things like the choice of garbage collector. The following describes the general layout for the C based back-ends, but compiler options such as garbage collection method or whether we do full program analysis, will affect which features are used and whether certain optimized layouts are possible.

Unboxed values directly translate to values in the target language, an unboxed Int will translate directly into an 'int' as an argument and an unboxed pointer will be a raw pointer. Unboxed values have no special interpretation and are not followed by the garbage collector. If the target language does not support a feature such as multiple return values, it will have to be simulated. It would not be wrong to think of Grin code that only deals with unboxed values to be isomorphic to C-- or C augmented with multiple return values.

Boxed values have a standard representation and can be followed. Unlike some other implementation, being boxed does not imply the object is located on the heap. It may be on the stack, heap, or even embedded within the smart pointer itself. Being boxed only means that the object may be represented by a smart pointer, which may or may not actually be a pointer in the traditional sense.

A boxed value in Ajhc is represented by a 'smart pointer' of c type sptr_t. a smart pointer is the size of a native pointer, but can take on different roles depending on a pair of tag bits, called the ptype.

smart pointers take on a general form as follows:

-------------------------
|    payload        | GL|
-------------------------

  G - if set, then the garbage collector should not treat value as a pointer to be followed
  L - lazy, this bit being set means the value is potentially not in WHNF

A sptr_t on its own in the wild can only take on one of the following forms:

-------------------------
|    whnf raw value | 10|
-------------------------

-------------------------
|    whnf location  | 00|
-------------------------

WHNF stands for 'Weak Head Normal Form' and means that the value is not a suspended function and hence not a pointer to a thunk. It may be directly examined and need not be evaluated. wptr_t is an alias for sptr_t that is guarenteed to be of one of the above forms. It is used to improve safety for when we can statically know that a value is WHNF and hence we can skip the expensive 'eval'.

The difference between the raw value and the whnf location is that the first contains uninterpreted bits, while the second is a pointer to a location on the heap or stack and hence the garbage collector should follow it. The format of the memory pointed to by the whnf location is unspecified and dependent on the actual type being represented.

Partial (unsaturated) applications are normal WHNF values. Saturated applications which may be 'eval'ed and updated are called thunks and must not be pointed to by WHNF pointers. Their representation follows.

-------------------------
|   lazy location   | 01|
-------------------------

A lazy location points to either a thunk, or a redirection to a WHNF value. A lazy location is always a pointer to an allocated block of memory which always begins with a restricted smart pointer. This restricted smart pointer is represented by the C type alias 'fptr_t'. fptr_t's only occur as the first entry in a lazy location, they never are passed around as objects in their own right.

A fptr_t may be a whnf value or a code pointer. If a fptr_t is a whnf value (of one of the two forms given above) then it is called a redirection, the lazy location should be treated exactly as if it were the whnf given. This is used to redirect an evaluated thunk to its computed value.

A fptr_t may also be a 'code pointer' in which case the lazy location is called a thunk. A code pointer is a pointer to executable machine code that evaluates a closure and returns a wptr_t, the returned wptr_t is then generally written over the code pointer, turning the thunk into a redirection. It is the responsibility of the code pointed to to perform this redirection.

-------------------------
|    code pointer   | 11|
-------------------------
|     data ...          |

When debugging, the special code pointer BLACK_HOLE is also sometimes stored in a fptr_t to detect certain run-time errors.

Note that unlike other implementations, a fptr_t may not be another lazy location. you can not have chained redirections, a redirection is always a redirection to a whnf value.

sptr_t - a tagged smart pointer, may contain a whnf value or a lazy location.
wptr_t - a tagged smart pointer that contains a whnf value (either raw or a location)
fptr_t - a tagged smart pointer, may contain a whnf value indicating a redirection, or a code pointer indicating a thunk.

Ajhc Core Type System

Ajhc's core is based on a pure type system. A pure type system (also called a PTS) is actually a parameterized set of type systems. Ajhc's version is described by the following.

Sorts  = (*, !, **, #, (#), ##, □)
Axioms = (*:**, #:##, !:**, **:□, ##:□)

-- sort kind
*   is the kind of boxed values
!   is the kind of boxed strict values
#   is the kind of unboxed values
(#) is the kind of unboxed tuples
-- sort superkind
**  is the superkind of all boxed value
##  is the superkind of all unboxed values
-- sort box
□   superkinds inhabit this

in addition there exist user defined kinds, which are always of supersort ##

The following Rules table shows what sort of abstractions are allowed, a rule of the form (A,B,C) means you can have functions of things of sort A to things of sort B and the result is something of sort C. Function in this context subsumes both term and type level abstractions.

Notice that functions are always boxed, but may be strict if they take an unboxed tuple as an argument. When a function is strict it means that it is represented by a pointer to code directly, it cannot be a suspended value that evaluates to a function.

These type system rules apply to lambda abstractions. It is possible that data constructors might exist that cannot be given a type on their own with these rules, even though when fully applied it has a well formed type. An example would be unboxed tuples. This presents no difficulty as one concludes correctly that it is a type error for these constructors to ever appear when not fully saturated with arguments.

as a shortcut we will use *# to mean every combination involving * and #, and so forth.
for instance, (*#,*#,*) means the set (*,*,*) (#,*,*) (*,#,*) (#,#,*)

Rules =
   (*#!,*#!,*)  -- functions from values to values are boxed and lazy
   (*#!,(#),*)  -- functions from values to unboxed tuples are boxed and lazy
   ((#),*#!,!)  -- functions from unboxed tuples to values are boxed and strict
   ((#),(#),!)  -- functions from unboxed tuples to unboxed tuples are boxed and strict
   (**,*,*)     -- may have a function from an unboxed type to a value
   (**,#,*)
   (**,!,*)
   (**,**,**)  -- we have functions from types to types
   (**,##,##)  -- MutArray_ :: * -> #
   (##,##,##)  -- Complex_ :: # -> #

The defining feature of boxed values is

_|_ :: t iff t::*

This PTS is functional but not injective

The PTS can be considered stratified into the following levels

□                - sort box
**,##,           - sort superkind
*,#,(#),!        - sort kind
Int,Bits32_,Char - sort type
3,True,"bob"     - sort value

On boxed kinds

The boxed kinds (* and !) represent types that have a uniform run time representation. Due to this, functions may be written that are polymorphic in types of these kinds. Hence the rules of the form (**,?,?), allowing taking types of boxed kinds as arguments.

the unboxed kind # is inhabited with types that have their own specific run time representation. Hence you cannot write functions that are polymorphic in unboxed types

On sort box, the unboxed tuple, and friends

Although sort box does not appear in the code, it is useful from a theoretical point of view to talk about certain types such as the types of unboxed tuples. Unboxed tuples may have boxed and unboxed arguments, without sort box it would be impossible to express this since it must be superkind polymorphic. sort box allows one to express this as (in the case of the unboxed 2-tuple)

∀s1:□ ∀s2:□ ∀k1:s1 ∀k2:s2 ∀t1:k1 ∀t2:k2 . (# t1, t2 #)

However, although this is a valid typing of what it would mean if a unboxed tuple were not fully applied, since we do not have any rules of form (##,?,?) or (□,?,?) this type obviously does not typecheck. Which is what enforces the invarient that unboxed tuples are always fully applied, and is also why we do not need a code representation of sort box.

Do we need a superbox?

You will notice that if you look at the axioms involving the sorts, you end up with a disjoint graph

         □             - the box
        / \
      **   ##          - superkind
      /\     \
     *  !     #   (#)  - kind

This is simply due to the fact that nothing is polymorphic in unboxed tuples of kind (#) so we never need to refer to any super-sorts of them. We can add sorts (##),(□) and □□ to fill in the gaps, but since these sorts will never appear in code or discourse, we will ignore them from now on.

           □□            - sort superbox
          /  \
         □    (□)        - sort box
        / \      \
      **   ##     (##)   - sort superkind
      /\     \    |
     *  !     #   (#)    - sort kind

Ajhc User's Manual

John Meacham & Metasepi team