Skip to content

Commit d36f0f9

Browse files
broken formatting
1 parent bebe758 commit d36f0f9

File tree

2 files changed

+13
-5
lines changed

2 files changed

+13
-5
lines changed

docs/2024/pointer-soup/index.html

Lines changed: 10 additions & 4 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

markdown/pointer_soup.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -12,8 +12,10 @@ Currently, uLisp has three different representations of symbols: built-in symbol
1212

1313
1. Built-in symbols are represented as a number that is the index of the symbol in the built-in symbol table. However, to allow them to be distinguished from other kinds of symbols the index is "twisted" into a slightly different format by first adding a constant that has the top two bits set (specifically, `:::c 0xF4240000` on 32-bit uLisp), and then performing a rightwards bit rotation by 2 bits. The effect is that the bottom two bits of a built-in symbol's representation will always be 1 (mathematically, the resultant number will be congruent to 3 modulo 4).
1414

15-
As a pointer, this is almost universally invalid; processors typically choke on pointers that aren't aligned to a multiple of the processor's word size (which is either 4 or 8 bytes).
15+
As a pointer, this is almost universally invalid; processors typically choke on pointers that aren't aligned to a multiple of the processor's word size (which is either 4 or 8 bytes).
16+
1617
2. Packed symbols are a way to save memory for common symbol names by encoding them as a base-40 number instead of storing every character. If the name contains only valid characters (namely, letters, digits, `$`, `-`, and `*`) and is short enough, the name is encoded in base-40 and then the resultant number is "twisted" in the same manner as built-in indexes (just with a different constant) again to ensure that at least one of the bottom two bits is set to 1.
18+
1719
3. Long symbols are sort of the "fallback" option: if no other representation fits, the symbol is encoded in the same manner as a uLisp string, with the characters stored in a linked list of chunks inside the uLisp `Workspace`, and then the symbol object's value is a pointer to the head of the list. Since it's obviously a valid pointer, the bottom two bits of a long symbol are guaranteed to be both 0's.
1820

1921
When the symbol is printed, the determination as to which kind of symbol it is has a few steps. If the bottom two bits are 0, then it's obviously a long symbol, and it is printed in the same manner as a string (just without quotes). If either of the bottom two bits are 1 (a "misaligned" pointer), then it is either a built-in symbol or a packed symbol. The number is un-rotated, and if it is greater than the built-in symbols' scaling constant, then it must be a built-in, otherwise it's a packed symbol.

0 commit comments

Comments
 (0)