@@ -67,19 +67,63 @@ bc_type_code ;
6767// pointer[k] index[k] Name and description
6868// ---------- -------- --------------------
6969//
70- // NULL non-NULL "Index": some entries present.
70+ // NULL NULL "ELLPACK:4", because p is a simple stride [0 4 8 ...]
71+ // (a possible extension). A 2D ELLPACK:4 matrix has
72+ // exactly 4 entries in each row, in any columns.
73+ //
74+ // axis 0: nothing, name "L:4", one number: 4
75+ //
76+ // axis 1: index, [ size 40 ]
77+ //
78+ // . x . x . x . . . x
79+ // . . x . x . x x . .
80+ // x . x x . . x . . .
81+ // . x . . x . . x x .
82+ // . . . x . x . x x .
83+ // . . . . . x x x . x
84+ // . . . x x x x . . .
85+ // . x x . x x . . . .
86+ // . . . . . . x x x x
87+ // . . . x x x . x . .
88+ // x x x . . . . . x .
89+ //
90+ //
91+ // NULL non-NULL "Index": some entries present. (Erik: "Sparse", not compressed, coo)
7192// indices need not be in order, nor unique.
7293// size of index [k] array is nindex [k].
7394// in_order [k] can be true or false.
7495//
75- // non-NULL non-NULL "Hyper": some entries present.
96+ // NULL non-NULL "Hyper_ELLPACK:4", because p is a simple stride [0 4 8 ...]
97+ //
98+ // rows: 0 2 5, each has 4 entries
99+ //
100+ // . x . x . x . . . x
101+ // . . . . . . . . . .
102+ // x . x x . . x . . .
103+ // . . . . . . . . . .
104+ // . . . . . . . . . .
105+ // . . . . . x x x . x
106+ // . . . . . . . . . .
107+ // . . . . . . . . . .
108+ // . . . . . . . . . .
109+ // . . . . . . . . . .
110+ // . . . . . . . . . .
111+ //
112+ // (hyper-ELL:4, index)
113+ //
114+ // axis 0: index = [0 2 5]
115+ //
116+ // axis 1: index = [ size 12 ]
117+ // any given row is empty, or has exactly 4 entries.
118+ //
119+ // non-NULL non-NULL "Hyper": some entries present. (Erik: "DC" "doubly compressed")
76120// indices must be in order and unique.
77121// index [k] has size nindex [k]
78122// pointer [k] has size nindex [k]+1 and must be
79123// monotonically non-decreasing.
80124// in_order [k] must be true.
81125//
82- // non-NULL NULL "Sparse": all entries present.
126+ // non-NULL NULL "Sparse": all entries present. (Erik: "Compressed" or C)
83127// pointer [k] has size dim [k]+1.
84128// nindex [k] not used (or can be set to
85129// dim [k] for consistency).
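The table above tells the formats apart by which of pointer [k] and index [k] are NULL. A minimal sketch of that classification in C follows. The names `axis_format`, `classify_axis`, and `hyper_axis_ok`, and the use of `int64_t` arrays, are assumptions made for illustration and do not come from the header. NULL/NULL is treated as "Full", as in the state-machine comment; the proposed ELLPACK variants are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum; the header's real type codes are not shown here. */
typedef enum { AXIS_INDEX, AXIS_HYPER, AXIS_SPARSE, AXIS_FULL } axis_format;

/* Classify axis k from which of pointer[k] and index[k] are present. */
static axis_format classify_axis(const int64_t *pointer_k,
                                 const int64_t *index_k)
{
    if (pointer_k == NULL) {
        /* no pointer: "Index" if an index is present, else "Full" */
        return (index_k != NULL) ? AXIS_INDEX : AXIS_FULL;
    }
    /* pointer present: "Hyper" if an index is present, else "Sparse" */
    return (index_k != NULL) ? AXIS_HYPER : AXIS_SPARSE;
}

/* Check the "Hyper" constraints listed in the table:
   index in order and unique (strictly increasing), and
   pointer of size nindex+1, monotonically non-decreasing. */
static int hyper_axis_ok(const int64_t *pointer_k, const int64_t *index_k,
                         size_t nindex)
{
    assert(pointer_k != NULL && index_k != NULL);
    for (size_t i = 0; i < nindex; i++) {
        if (pointer_k[i] > pointer_k[i + 1]) return 0;
        if (i + 1 < nindex && index_k[i] >= index_k[i + 1]) return 0;
    }
    return 1;
}
```

An "Index" axis would skip `hyper_axis_ok` entirely, since its indices need not be in order nor unique.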
@@ -319,14 +363,153 @@ bc_type_code ;
319363// axis, since all objects to the right have the same size.
320364//
321365// (5) Like rule 1, once "Index" appears, the remaining formats to the right
322- // must be "Index" or "Full". This is because "Index" has no pointer so
323- // all formats to the right must have a known size, or be a list like
324- // (Index, Index, Full) where the total size is given nindex [...].
366+ // must be "Sparse", "Index" or "Full". This is because "Index" has no
367+ // pointer so all formats to the right must have a known size, or be a
368+ // list like (Index, Index, Full) where the total size is given by nindex
369+ // [...]. "Sparse" has known size: it is the entire dimension.
370+ //
371+ // (6) (..., Hyper, Sparse, ...) can be defined but is not useful.
372+ // The same can be done with (..., Index, Sparse, ...) by just deleting
373+ // the pointer for the Hyper axis, since that pointer vector only
374+ // contains a list of constant stride (see below).
375+
376+ /*
377+ 10-by-10-by-10: suppose the 1st dimension is empty except for 0,2,5
378+ suppose the axis order is 0,1,2 (all "by row")
379+
380+ axis 0: entry 0: a 2D matrix, containing 5 entries (say by row)
381+ . . . . . . . . . .
382+ . . x . . . . . . .
383+ . . . . . . . . . .
384+ . . . x x . x . . .
385+ . . . . . . . . . .
386+ . . . . . . . . . .
387+ . . . . . . x . . .
388+ . . . . . . . . . .
389+ . . . . . . . . . .
390+ . . . . . . . . . .
391+
392+ axis 0: entry 2: a 2D matrix, containing 7 entries
393+ . . . . . . . . . .
394+ . . . . . . . . . .
395+ . . . . . . . . . .
396+ . x . . . . . . x .
397+ . . . . . . . . . .
398+ . . . . . . . . . .
399+ . . . . . x x x x .
400+ . . . . . . . . . .
401+ . x . . . . . . . .
402+ . . . . . . . . . .
403+
404+ axis 0: entry 5: a 2D matrix, containing 3 entries
405+ . . . . . . . . . .
406+ x . . x . . . . . .
407+ . . . . . x . . . .
408+ . . . . . . . . . .
409+ . . . . . . . . . .
410+ . . . . . . . . . .
411+ . . . . . . . . . .
412+ . . . . . . . . . .
413+ . . . . . . . . . .
414+ . . . . . . . . . .
415+
416+ (Hyper, Sparse, Index): can be specified but has some useless info
417+ 0 2D matrix, 10-by-10, CSR, 5 entries
418+ 2 2D matrix, 10-by-10, CSR, 7 entries
419+ 5 2D matrix, 10-by-10, CSR, 3 entries
420+
421+ axis0: index(0) = [0, 2, 5], pointer(0) = [0 10 20 31=end], len = 3
422+ Note that pointer(0), an array of size 3+1, is useless since
423+ the next axis is "Sparse" so each has fixed size (of 10 each)
424+
425+ axis1: pointer(1) = an array of size 31, since there are 3 objects
426+ in the axis0 dimension. Each object is a pointer of size 10
427+ plus one end marker.
428+
429+ pointer(1) = [ 0 0 1 1 4 4 4 5 5 5 5 5 5 5 7 7 7 11 11 12 12 12 14 15 15 15 15 15 15 15 15 ]
430+
431+ 0 1 2 3 4 5 6 7 8 9 -
432+ . . . . . . . . . . 0 <= pointer for this 2D slice
433+ . . x . . . . . . . 0
434+ . . . . . . . . . . 1
435+ . . . x x . x . . . 1
436+ . . . . . . . . . . 4
437+ . . . . . . . . . . 4
438+ . . . . . . x . . . 4
439+ . . . . . . . . . . 5
440+ . . . . . . . . . . 5
441+ . . . . . . . . . . 5
442+
443+ 0 1 2 3 4 5 6 7 8 9 -
444+ . . . . . . . . . . 5
445+ . . . . . . . . . . 5
446+ . . . . . . . . . . 5
447+ . x . . . . . . x . 5
448+ . . . . . . . . . . 7
449+ . . . . . . . . . . 7
450+ . . . . . x x x x . 7
451+ . . . . . . . . . . 11
452+ . x . . . . . . . . 11
453+ . . . . . . . . . . 12
454+
455+ 0 1 2 3 4 5 6 7 8 9 -
456+ . . . . . . . . . . 12
457+ x . . x . . . . . . 12
458+ . . . . . x . . . . 14
459+ . . . . . . . . . . 15
460+ . . . . . . . . . . 15
461+ . . . . . . . . . . 15
462+ . . . . . . . . . . 15
463+ . . . . . . . . . . 15
464+ . . . . . . . . . . 15
465+ . . . . . . . . . . 15
466+ 15 <= end marker
467+
468+ axis2: index(2) = [ 2 3 4 6 6 1 8 5 6 7 8 1 0 3 5 ]
469+ an array of size 15
470+
471+ 10-by-10-by-10
472+ (Index, Sparse, Index)
473+ Erik: (S-C-S)
474+ 0 2D matrix, 10-by-10, CSR, 5 entries
475+ 2 2D matrix, 10-by-10, CSR, 7 entries
476+ 5 2D matrix, 10-by-10, CSR, 3 entries
477+
478+ same as above, but drop pointer(0) as not needed. So
479+ this is better than (Hyper, Sparse, Index).
480+
481+ Consider duplicates:
482+
483+ 10-by-10-by-10
484+ (Index, Sparse, Index): with duplicate in axis 0.
485+ Erik: (S-C-S)
486+ 0 2D matrix, 10-by-10, CSR, 5 entries
487+ 5 2D matrix, 10-by-10, CSR, 7 entries
488+ 5 2D matrix, 10-by-10, CSR, 3 entries
489+
490+ Here, the A(5,:,:) matrix is specified twice, so
491+ A(5,:,:) is the sum of both 2D matrices, with a total
492+ of 7 to 10 entries. A dup operator can be specified,
493+ or implied.
494+
495+ consider a 10-by-20-by-30-by-40 tensor:
496+
497+ (Index, Index, Hyper, Index) ugly with hack: requires look-ahead,
498+ group order to indices. Not allowed in this proposed format.
499+
500+ (Index, Sparse, Hyper, Index) fine
501+
502+ (Index, Index, Sparse, Index) fine
503+
504+ etc.
505+
506+ (Sparse, Hyper, Index, Index) fine
507+ */
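The worked 10-by-10-by-10 example can be checked mechanically. The sketch below builds pointer(1) and index(2) for the three slices drawn in the diagrams, using a count-then-prefix-sum pass. The `entry` struct, `example_entries`, and `build_csr_stack` are hypothetical names introduced for illustration; the entry list is transcribed from the three 2D diagrams, and the convention that pointer(1)[r] is the position of the first entry of row r is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 10       /* each 2D slice is N-by-N */
#define NSLICES 3  /* slices A(0,:,:), A(2,:,:), A(5,:,:) */

/* One entry: position s of its slice in index(0) = [0 2 5], row i, column j. */
typedef struct { int s, i, j; } entry;

/* The 15 entries of the example, listed slice by slice, row by row. */
static const entry example_entries[] = {
    {0,1,2}, {0,3,3}, {0,3,4}, {0,3,6}, {0,6,6},                    /* 5 */
    {1,3,1}, {1,3,8}, {1,6,5}, {1,6,6}, {1,6,7}, {1,6,8}, {1,8,1},  /* 7 */
    {2,1,0}, {2,1,3}, {2,2,5},                                      /* 3 */
};
#define NENTRIES (sizeof(example_entries) / sizeof(example_entries[0]))

/* Fill pointer1 (size NSLICES*N + 1) and index2 (size nentries).
   The entries must already be sorted by (slice, row). */
static void build_csr_stack(const entry *e, size_t nentries,
                            int64_t *pointer1, int64_t *index2)
{
    size_t nrows = (size_t) NSLICES * N;
    for (size_t r = 0; r <= nrows; r++) pointer1[r] = 0;

    /* Count the entries of global row r = s*N + i in slot r+1. */
    for (size_t k = 0; k < nentries; k++) {
        assert(e[k].s >= 0 && e[k].s < NSLICES);
        assert(e[k].i >= 0 && e[k].i < N);
        pointer1[(size_t) e[k].s * N + (size_t) e[k].i + 1]++;
        index2[k] = e[k].j;  /* column indices stay in input order */
    }

    /* Prefix sum: pointer1[r] becomes the start of row r, and
       pointer1[nrows] becomes the end marker (total entry count). */
    for (size_t r = 0; r < nrows; r++) pointer1[r + 1] += pointer1[r];
}
```

The result has 3*10+1 = 31 pointers and 15 indices, matching the sizes stated in the comment. Because every slice contributes exactly 10 rows, no pointer(0) is needed to locate a slice, which is the point of preferring (Index, Sparse, Index).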
325508
326509/*
327510LANGUAGE OF VALID FORMATS
328511
329- These 5 rules lead to a simple finite-state machine that descibes the language
512+ These 6 rules lead to a simple finite-state machine that describes the language
330513of valid formats. The starting state (0th rank) can be any of the four
331514formats. Each state has a self-loop (not shown). The end state of the
332515language must be Index or Full.
@@ -337,14 +520,14 @@ language must be Index or Full.
337520 | fixed size
338521
339522 "Sparse" "Index"
340- (pointer present - ------------------> no pointer
523+ (pointer present < ------------------> no pointer
341524 no index. index present
342525 size is size is
343- dim [k] <---\ /---> nindex[k]
344- \ \ / \
345- \ \ / \
346- \ \ / \
347- \ \ / \
526+ dim [k] /---> nindex[k]
527+ \ / \
528+ \ / \
529+ \ / \
530+ \ / \
348531 \ "Hyper" / ---> "Full"
349532 \-----> (both pointer no pointer
350533 and index. no index
@@ -356,34 +539,42 @@ language must be Index or Full.
356539 NO INDEX | INDEX IS PRESENT | NO INDEX
357540 must be | in order if axis[k].in_order | must be
358541 in order | is true, unordered if false | in order
542+ but I would say "Hyper" must
543+ be in order with no duplicates.
359544
360545
361546That is, the format can start with any mix of Sparse and/or Hyper (or none of
362547them), in any order. These formats have pointers, so the objects to the right
363548of them can vary in size.
364549
365- The Sparse and Hyper formats have a pointer, so the objects they describe to
366- the right of them in axis k+1 have variable sizes.
550+ The Sparse and Hyper formats have a pointer, so if axis k is Sparse or Hyper,
551+ the objects it describes to the right of it in axis k+1 can have variable
552+ sizes (any format, but only Hyper has variable size).
367553
368554The Index and Full formats have no pointer, so the objects they describe
369- in their axes and the axes to the right of them must have a fixed size.
555+ in their axes and the next axis to the right of them must have a fixed size
556+ (that is, Sparse, Index, or Full, but not Hyper).
370557
371558The Sparse and Full formats have no index, so their own size must be dim [k]
372559if they describe the kth axis. "Sparse" is short-hand for a dense list of
373560objects, each of variable size. "Full" is short-hand for a dense list of
374561objects of fixed size.
375562
563+ Regarding duplicates/out-of-order: I think only the Index type of axis
564+ should allow for duplicates and out-of-order indices. Duplicates are
565+ meant to be summed, in any axis.
566+
376567*/
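The adjacency constraints stated in this comment can be turned into a small checker. The sketch below encodes only two of them: an axis without a pointer (Index or Full) cannot be immediately followed by Hyper, and the last axis must be Index or Full. Rules 1 through 4 are not visible in this hunk and are not encoded, so this accepts a superset of the intended language. The names `fmt` and `format_is_valid` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enum for the four axis formats. */
typedef enum { FMT_INDEX, FMT_HYPER, FMT_SPARSE, FMT_FULL } fmt;

/* Returns 1 if the format list f[0..rank-1] satisfies the two
   constraints encoded here, and 0 otherwise. */
static int format_is_valid(const fmt *f, size_t rank)
{
    assert(f != NULL && rank > 0);

    for (size_t k = 0; k + 1 < rank; k++) {
        /* Index and Full have no pointer, so the next axis must have a
           fixed size: Sparse, Index, or Full, but not Hyper. */
        int no_pointer = (f[k] == FMT_INDEX || f[k] == FMT_FULL);
        if (no_pointer && f[k + 1] == FMT_HYPER) return 0;
    }

    /* The end state of the language must be Index or Full. */
    return f[rank - 1] == FMT_INDEX || f[rank - 1] == FMT_FULL;
}
```

Under these two constraints, (Index, Index, Hyper, Index) is rejected while (Index, Sparse, Hyper, Index), (Index, Index, Sparse, Index), and (Sparse, Hyper, Index, Index) are accepted, in line with the 10-by-20-by-30-by-40 examples. (Hyper, Sparse, Index) is also accepted, since rule 6 calls it definable but not useful.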
377568
378569// rank = 3
379570//
380- // describe some for future extensions. 12 possible formats:
571+ // possible formats:
381572
382573// (Index , Index , Index) all COO
574+ // (Index , Sparse, Index) 1D list of 2D CSR/CSC matrices
383575
384576// (Hyper , Index , Index) 1D hyperlist of 2D COO matrices
385577// (Hyper , Hyper , Index) 1D hyperlist of 2D hypersparse mtx
386- // (Hyper , Sparse, Index) 1D hyperlist of 2D CSR/CSC matrices
387578
388579// (Sparse, Index , Index) 1D dense array of 2D COO matrices
389580// (Sparse, Hyper , Index) 1D dense array of 2D hypersparse