@arteymix
Member

No description provided.

arteymix and others added 30 commits October 29, 2025 12:35
Add basic support for downloading data from CELLxGENE for datasets that
are also available in GEO.
When the option is set, no symlink is created.
Filter taxa and assays when scraping CELLxGENE.

Add an option to select specific allowed taxa.

Add more supported single-cell assays.

Detect insecure http:// links to GEO.
Increase reporting frequency to every 100 vectors for aggregation.
Reduce the number of times we compute start/end positions by caching the
value. Unfortunately, because we keep cell types contiguous, there's no
suitable workaround.

Report aggregation throughput and indicate that we are loading
single-cell vectors.

Indicate progress when loading vectors for aggregation and add support
for streaming single-cell vectors when aggregating.
Make sure that we do not report progress to the console when writing
single-cell data to the standard output.
Remove the edit button; there is no editBioMaterial.html page anymore.
Refactor CompletionGenerator subclasses to reuse some of the logic in an
abstract class.

Add a sample output for reviewing (will be removed before merging!)
Generate wiki markup for updating curator documentation
arteymix and others added 20 commits December 15, 2025 16:16
…ls-in-bulk-data-vectors

Store number of cells in raw/processed bulk vectors
display cell counts on diff exp tree
remove gene search and redirect to gembrow
Make ExpressionDataDoubleMatrix immutable.

Use a raw double[][] matrix for ExpressionDataDoubleMatrix and other
multi-assay matrices to minimize bookkeeping.

Prefer DoubleMatrix2D from Colt over DoubleMatrix when modifying a matrix
without altering the columns/rows labels.

Add asDoubleMatrix2D() to conveniently obtain a Colt matrix.

Rename getMatrix() to asDoubleMatrix() to make it clear that the matrix
is being copied to a new data format.

Rename getRawMatrix() to getMatrix().
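
The immutability and "as" naming described above can be illustrated with a minimal sketch. It is not the actual ExpressionDataDoubleMatrix code: the class name `ImmutableMatrixSketch` and its methods are hypothetical, and it only shows the general pattern of wrapping a raw double[][] with defensive copies so that callers cannot mutate the stored values.

```java
/**
 * Hypothetical illustration only; this is NOT ExpressionDataDoubleMatrix.
 * It shows one way to keep a matrix backed by a raw double[][] immutable.
 */
final class ImmutableMatrixSketch {

    private final double[][] values;

    ImmutableMatrixSketch(double[][] source) {
        // Deep copy on the way in, so later writes to the caller's array are not visible here.
        this.values = deepCopy(source);
    }

    int rows() {
        return values.length;
    }

    int columns() {
        return values.length == 0 ? 0 : values[0].length;
    }

    double get(int row, int column) {
        return values[row][column];
    }

    /** The "as" prefix signals that the data is copied into a new structure. */
    double[][] asArray() {
        return deepCopy(values);
    }

    private static double[][] deepCopy(double[][] matrix) {
        double[][] copy = new double[matrix.length][];
        for (int i = 0; i < matrix.length; i++) {
            copy[i] = matrix[i].clone();
        }
        return copy;
    }

    public static void main(String[] args) {
        double[][] source = {{1.0, 2.0}, {3.0, 4.0}};
        ImmutableMatrixSketch matrix = new ImmutableMatrixSketch(source);
        source[0][0] = 99.0; // does not affect the wrapped values
        System.out.println(matrix.get(0, 0));
    }
}
```

Copying on every `asArray()` call trades memory for safety; whether the real getMatrix() exposes the backing array directly is not stated here and is not assumed by this sketch.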

Fix ExpressionDataDoubleMatrix not being able to hold negative
infinities since those were converted to NaNs.

Fix LowVarianceTest: two zeroes were sampled and converted to negative
infinities.
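
For context on why negative infinities and NaNs need to stay distinct, here is a small sketch. It assumes the negative infinities arise from log-transforming zeroes, which this commit message does not state explicitly; the class and method names are hypothetical and not part of the codebase.

```java
import java.util.Arrays;

/** Hypothetical illustration; assumes zeroes become -Infinity through a log transform. */
final class NegativeInfinitySketch {

    /** Log2-transforms each value: 0 yields -Infinity, negative inputs yield NaN. */
    static double[] log2(double[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = Math.log(values[i]) / Math.log(2.0);
        }
        return out;
    }

    /** Counts missing values; only NaN is treated as missing, infinities are kept as data. */
    static int countMissing(double[] values) {
        int missing = 0;
        for (double v : values) {
            if (Double.isNaN(v)) {
                missing++;
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        double[] transformed = log2(new double[] {0.0, 1.0, -1.0});
        System.out.println(Arrays.toString(transformed));
        System.out.println(countMissing(transformed));
    }
}
```

If a container silently rewrites -Infinity to NaN, a legitimate "zero before transformation" becomes indistinguishable from a missing value.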
…uction

The TwoChannelMissingValues implementation was accessing the dimension
of the preferred matrix with a design element from the base channel.
This fails if the design element was filtered out.
…ject

Use data vector slicing utilities to re-arrange the preferred raw data
and mask to match the target dimension that will be used for the
processed data.

Fix #1572 since we no longer need to initialize the assays/samples from
the vectors.
Add some basic type safety to the Persister interface.
The contiguous trick cannot be used if there are no assays to slice.
Rewriting the ProcessedExpressionDataVectorCreationHelperService to not
use DoubleVectorValueObject resulted in a regression in outlier masking.
Those were actually performing the masking when being converted back to
processed data vectors.

Instead, explicitly mask outlier assays in the service.
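
A rough sketch of what explicit outlier masking could look like follows. It assumes that masking means replacing the values of outlier assays with NaN and that outliers are identified by a per-assay boolean flag; both are assumptions, and `OutlierMaskSketch` is a hypothetical name, not the service's actual code.

```java
/** Hypothetical illustration; assumes masked values are represented as NaN. */
final class OutlierMaskSketch {

    /**
     * Returns a copy of the vector in which every position flagged as an outlier
     * assay is replaced by NaN. The input array is left untouched.
     */
    static double[] maskOutliers(double[] vector, boolean[] isOutlier) {
        if (vector.length != isOutlier.length) {
            throw new IllegalArgumentException("Vector and outlier flags must have the same length.");
        }
        double[] masked = vector.clone();
        for (int i = 0; i < masked.length; i++) {
            if (isOutlier[i]) {
                masked[i] = Double.NaN;
            }
        }
        return masked;
    }

    public static void main(String[] args) {
        double[] masked = maskOutliers(new double[] {1.5, 2.5, 3.5}, new boolean[] {false, true, false});
        System.out.println(java.util.Arrays.toString(masked));
    }
}
```

Doing the masking in one explicit step avoids relying on a side effect of a conversion, which is the regression described above.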
Make sure that URLs generated by the REST API are to other resources
provided by the API.

Add missing beans in tests.
Remove plexus-io workaround; it's been updated in the affected Maven
plugins.

Update baseCode to 1.1.33.

Update gsec to 0.0.22.
@arteymix arteymix merged commit 38bc1e2 into master Jan 7, 2026
7 checks passed
3 participants