Package: scITD 1.0.4

scITD: Single-Cell Interpretable Tensor Decomposition

Single-cell Interpretable Tensor Decomposition (scITD) employs the Tucker tensor decomposition to extract multi-cell-type gene expression patterns that vary across donors/individuals. The tool is designed for single-cell RNA-sequencing datasets drawn from many source donors. The method has a wide range of potential applications, including the study of inter-individual variation at the population level, patient sub-grouping/stratification, and the analysis of sample-level batch effects. Each "multicellular process" that is extracted consists of (A) a multi-cell-type gene loadings matrix and (B) a corresponding donor scores vector indicating the level at which that loadings matrix is expressed in each donor. Additional methods help select an appropriate number of factors and evaluate the stability of the decomposition. Tools for downstream analysis are also provided, including integration of gene set enrichment analysis and ligand-receptor analysis. References: Tucker, L.R. (1966) <doi:10.1007/BF02289464>. Unkel, S., Hannachi, A., Trendafilov, N. T., & Jolliffe, I. T. (2011) <doi:10.1007/s13253-011-0055-9>. Zhou, G., & Cichocki, A. (2012) <doi:10.2478/v10175-012-0051-4>.
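The idea of a donor scores vector paired with a gene-by-cell-type loadings matrix can be illustrated with a minimal base-R sketch of a truncated Tucker decomposition (computed here by higher-order SVD) on a random toy tensor. This is not scITD's implementation: the tensor sizes, the ranks, and the helper names `unfold` and `mode_product` are assumptions made for illustration, and the factor rotation step that scITD applies is omitted.

```r
set.seed(1)

# Toy tensor: 6 donors x 8 genes x 3 cell types (sizes chosen arbitrarily)
X <- array(rnorm(6 * 8 * 3), dim = c(6, 8, 3))

# Unfold a 3-way array along one mode: rows index that mode, columns the rest
unfold <- function(arr, mode) {
  d <- dim(arr)
  matrix(aperm(arr, c(mode, setdiff(seq_along(d), mode))), nrow = d[mode])
}

# n-mode product: multiply the tensor by a matrix along one mode
mode_product <- function(arr, mat, mode) {
  d <- dim(arr)
  others <- setdiff(seq_along(d), mode)
  res <- array(mat %*% unfold(arr, mode), dim = c(nrow(mat), d[others]))
  aperm(res, order(c(mode, others)))  # restore the original mode order
}

# Truncated HOSVD: leading left singular vectors of each unfolding
ranks <- c(2, 4, 3)  # donor factors, gene factors, cell type factors
factors <- lapply(1:3, function(m) {
  svd(unfold(X, m))$u[, seq_len(ranks[m]), drop = FALSE]
})

# Core tensor: project X onto the factor matrices along every mode
core <- X
for (m in 1:3) core <- mode_product(core, t(factors[[m]]), m)

# (B) donor scores: one column per donor factor
donor_scores <- factors[[1]]

# (A) loadings: for each donor factor, a genes x cell types matrix
loadings <- core
for (m in 2:3) loadings <- mode_product(loadings, factors[[m]], m)

# Reconstruct the tensor from scores and loadings and measure the error
X_hat <- mode_product(loadings, donor_scores, 1)
rel_err <- sqrt(sum((X - X_hat)^2) / sum(X^2))
```

Here `loadings[k, , ]` is the gene-by-cell-type pattern for factor `k`, and `donor_scores[, k]` gives how strongly each donor expresses it. On random data the reconstruction error stays large; structured data with shared multicellular patterns would be captured with far fewer factors.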

Authors: Jonathan Mitchel [cre, aut], Evan Biederstedt [aut], Peter Kharchenko [aut]

scITD_1.0.4.tar.gz
scITD_1.0.4.zip (r-4.5), scITD_1.0.4.zip (r-4.4)
scITD_1.0.4.tgz (r-4.5-x86_64), scITD_1.0.4.tgz (r-4.5-arm64), scITD_1.0.4.tgz (r-4.4-x86_64), scITD_1.0.4.tgz (r-4.4-arm64)
scITD_1.0.4.tar.gz (r-4.5-noble), scITD_1.0.4.tar.gz (r-4.4-noble)
scITD_1.0.4.tgz (r-4.4-emscripten)
scITD.pdf | scITD.html
scITD/json (API)

# Install 'scITD' in R:

install.packages('scITD', repos = c('https://j-mitchel.r-universe.dev', 'https://cloud.r-project.org'))

Uses libs:

c++ – GNU Standard C++ Library v3

Datasets:

test_container - Data container for testing tensor formation steps

On CRAN:

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

cpp

1.98 score, 19 scripts, 203 downloads, 68 exports, 159 dependencies

Last updated 2 years ago from: 0f6459f706. Checks: 9 OK. Indexed: yes.

Target	Result	Latest binary
Doc / Vignettes	OK	Mar 13 2025
R-4.5-win-x86_64	OK	Mar 13 2025
R-4.5-mac-x86_64	OK	Mar 13 2025
R-4.5-mac-aarch64	OK	Mar 13 2025
R-4.5-linux-x86_64	OK	Mar 13 2025
R-4.4-win-x86_64	OK	Mar 13 2025
R-4.4-mac-x86_64	OK	Mar 13 2025
R-4.4-mac-aarch64	OK	Mar 13 2025
R-4.4-linux-x86_64	OK	Mar 13 2025

Exports: apply_combat clean_data colMeanVars compare_decompositions compute_associations compute_donor_props compute_LR_interact convert_gn determine_ranks_tucker form_tensor get_all_lds_factor_plots get_ctype_prop_associations get_ctype_subc_prop_associations get_ctype_vargenes get_donor_meta get_gene_modules get_intersecting_pathways get_lm_pvals get_meta_associations get_min_sig_genes get_normalized_variance get_num_batch_ranks get_one_factor get_one_factor_gene_pvals get_pseudobulk get_subclust_enr_dotplot get_subclust_enr_fig get_subclust_umap get_subclusters get_subtype_prop_associations get_sums identify_sex_metadata initialize_params instantiate_scMinimal make_new_container nmf_unfolded norm_var_helper normalize_counts normalize_pseudobulk parse_data_by_ctypes pca_unfolded plot_donor_matrix plot_donor_sig_genes plot_dscore_enr plot_gsea_hmap_w_similarity plot_gsea_sub plot_loadings_annot plot_mod_and_lig plot_multi_module_enr plot_scores_by_meta plot_select_sets plot_subclust_associations prep_LR_interact project_new_data reduce_dimensions reduce_to_vargenes render_multi_plots run_gsea_one_factor run_jackstraw run_stability_analysis run_tucker_ica scale_variance seurat_to_scMinimal stack_tensor subset_scMinimal tucker_ica_helper update_params vargenes_anova

Dependencies: abind annotate AnnotationDbi askpass babelgene backports BH Biobase BiocGenerics BiocManager BiocParallel Biostrings bit bit64 blob boot broom cachem car carData circlize cli clue cluster codetools colorspace ComplexHeatmap corrplot cowplot cpp11 crayon curl data.table DBI Deriv digest doBy doParallel dplyr dqrng edgeR fansi farver fastmap fastmatch fgsea FNN foreach formatR Formula futile.logger futile.options genefilter generics GenomeInfoDb GenomeInfoDbData GetoptLong ggplot2 ggpubr ggrepel ggsci ggsignif GlobalOptions glue gridBase gridExtra gtable httr ica igraph IRanges irlba isoband iterators jsonlite KEGGREST labeling lambda.r lattice lifecycle limma lme4 locfit magrittr MASS Matrix MatrixGenerics MatrixModels matrixStats memoise mgcv microbenchmark mime minqa modelr msigdbr munsell nlme nloptr NMF nnet numDeriv openssl pbkrtest pbmcapply pillar pkgconfig plogr plyr png polynom pROC purrr quantreg R6 rbibutils RColorBrewer Rcpp RcppAnnoy RcppArmadillo RcppEigen RcppProgress Rdpack reformulas registry reshape2 rjson rlang Rmisc rngtools RSpectra RSQLite rstatix rTensor S4Vectors scales sccore shape sitmo snow SparseM statmod stringi stringr survival sva sys tibble tidyr tidyselect UCSC.utils utf8 uwot vctrs viridisLite withr XML xtable XVector

Help page	Topics
Apply ComBat batch correction to pseudobulk matrices. Generally, this should be done through calling the form_tensor() wrapper function.	apply_combat
Calculate F-statistics for the association between donor scores for each factor and donor values of shuffled gene_ctype fibers	calculate_fiber_fstats
Helper function to check whether receptor is present in target cell type	check_rec_pres
Clean data to remove genes only expressed in a few cells and donors with very few cells. Generally, this should be done through calling the form_tensor() wrapper function.	clean_data
Calculates column mean and variance. Adapted from pagoda2. https://github.com/kharchenkolab/pagoda2/blob/main/src/misc2.cpp	colMeanVars
Plot a pairwise comparison of factors from two separate decompositions	compare_decompositions
Compute associations between donor proportions and factor scores	compute_associations
Get donor proportions of each cell type or subtype	compute_donor_props
Compute and plot the LR interactions for one factor	compute_LR_interact
Convert gene identifiers to gene symbols	convert_gn
count_word. From older version of simplifyEnrichment package.	count_word
Run rank determination by svd on the tensor unfolded along each mode	determine_ranks_tucker
Form the pseudobulk tensor as preparation for running the tensor decomposition.	form_tensor
Generate loadings heatmaps for all factors	get_all_lds_factor_plots
Get gene callout annotations for a loadings heatmap	get_callouts_annot
Get explained variance of the reconstructed data using one cell type from one factor	get_ctype_exp_var
Compute and plot associations between donor factor scores and donor proportions of major cell types	get_ctype_prop_associations
Compute and plot associations between donor factor scores and donor proportions of cell subtypes	get_ctype_subc_prop_associations
Partition main gene by cell matrix into per cell type matrices with significantly variable genes only. Generally, this should be done through calling the form_tensor() wrapper function.	get_ctype_vargenes
Get metadata matrix of dimensions donors by variables (not per cell)	get_donor_meta
Get the explained variance of the reconstructed data using one factor	get_factor_exp_var
Calculate adjusted p-values for gene_celltype fiber-donor score associations	get_fstats_pvals
Compute WGCNA gene modules for each cell type	get_gene_modules
Get logical vectors indicating which genes are in which pathways	get_gene_set_vectors
Compute subtype proportion-factor association p-values for all subclusters of a given major cell type	get_indv_subtype_associations
Extract the intersection of gene sets which are enriched in two or more cell types for a factor	get_intersecting_pathways
Get the leading edge genes from GSEA results	get_leading_edge_genes
Compute gene-factor associations using univariate linear models	get_lm_pvals
Computes the maximum correlation between each factor of the decomposition computed on the whole dataset and each factor computed on the subsampled/bootstrapped dataset	get_max_correlations
Get metadata associations with factor donor scores	get_meta_associations
Evaluate the minimum number of significant genes in any factor for a given number of factors extracted by the decomposition	get_min_sig_genes
Identify gene sets that are enriched within specified gene co-regulatory modules. Uses a hypergeometric test for over-representation. Used in plot_multi_module_enr().	get_module_enr
Get normalized variance for each gene, taking into account mean-variance trend	get_normalized_variance
Plot factor-batch associations for increasing number of donor factors	get_num_batch_ranks
Get the donor scores and loadings matrix for a single factor	get_one_factor
Get significant genes for a factor	get_one_factor_gene_pvals
Collapse data from cell-level to donor-level via summing counts. Generally, this should be done through calling the form_tensor() wrapper function.	get_pseudobulk
Get F-Statistics for the real (non-shuffled) gene_ctype fibers	get_real_fstats
Calculate reconstruction errors using svd approach	get_reconstruct_errors_svd
Get vectors indicating which genes are significant in which cell types for a factor of interest	get_significance_vectors
Get list of cell subtype differential expression heatmaps	get_subclust_de_hmaps
Get scatter plot for association of a cell subtype proportion with scores for a factor	get_subclust_enr_dotplot
Get a figure showing cell subtype proportion associations with each factor. Combines this plot with subtype UMAPs and differential expression heatmaps. Note that this function runs better if n.cores in the conos object stored in container$embedding is set to a relatively small value (< 10).	get_subclust_enr_fig
Get heatmap of subtype proportion associations for each celltype/subtype and each factor	get_subclust_enr_hmap
Get a figure to display subclusterings at multiple resolutions	get_subclust_umap
Perform Leiden subclustering to get cell subtypes	get_subclusters
Compute and plot associations between factor scores and cell subtype composition for various clustering resolution parameters	get_subtype_prop_associations
Calculates factor-stratified sums for each column. Adapted from pagoda2. https://github.com/kharchenkolab/pagoda2/blob/main/src/misc2.cpp	get_sums
Visualize the similarity matrix and the clustering. Adapted from simplifyEnrichment package. https://github.com/jokergoo/simplifyEnrichment/blob/master/R/ht_clusters.R	ht_clusters
Extract metadata for sex information if not provided already	identify_sex_metadata
Initialize parameters to be used throughout scITD in various functions	initialize_params
Create an scMinimal object. Generally, this should be done through calling the make_new_container() wrapper function.	instantiate_scMinimal
Check if a character string is a GO ID	is_GO_id
Create a container to store all data and results for the project. You must provide a params list as generated by initialize_params(). You also need to provide either a Seurat object or both a count_data matrix and a meta_data matrix.	make_new_container
Merge small subclusters into larger ones	merge_small_clusts
Computes non-negative matrix factorization on the tensor unfolded along the donor dimension	nmf_unfolded
Calculates the normalized variance for each gene. This is adapted from pagoda2. https://github.com/kharchenkolab/pagoda2/blob/main/R/Pagoda2.R Generally, this should be done through calling the form_tensor() wrapper function.	norm_var_helper
Helper function to normalize and log-transform count data	normalize_counts
Normalize the pseudobulked counts matrices. Generally, this should be done through calling the form_tensor() wrapper function.	normalize_pseudobulk
Parse main counts matrix into per-celltype-matrices. Generally, this should be done through calling the form_tensor() wrapper function.	parse_data_by_ctypes
Computes singular-value decomposition on the tensor unfolded along the donor dimension	pca_unfolded
Plot matrix of donor scores extracted from Tucker decomposition	plot_donor_matrix
Plot donor celltype/subtype proportions against each factor	plot_donor_props
Generate a gene by donor heatmap showing scaled expression of top loading genes for a given factor	plot_donor_sig_genes
Compute enrichment of donor metadata categorical variables at high/low factor scores	plot_dscore_enr
Plot enriched gene sets from all cell types in a heatmap	plot_gsea_hmap
Plot already computed enriched gene sets to show semantic similarity between sets	plot_gsea_hmap_w_similarity
Look at enriched gene sets from a cluster of semantically similar gene sets. Uses the results from previous run of plot_gsea_hmap_w_similarity()	plot_gsea_sub
Plot the gene by celltype loadings for a factor	plot_loadings_annot
Plot trio of associations between ligand expression, module eigengenes, and factor scores	plot_mod_and_lig
Generate gene set x ct_module heatmap showing co-expression module gene set enrichment results	plot_multi_module_enr
Plot reconstruction errors as bar plot for svd method	plot_rec_errors_bar_svd
Plot reconstruction errors as line plot for svd method	plot_rec_errors_line_svd
Plot dotplots for each factor to compare donor scores between metadata groups	plot_scores_by_meta
Plot enrichment results for hand picked gene sets	plot_select_sets
Generate a plot for either the donor scores or loadings stability test	plot_stability_results
Plot association significances for varying clustering resolutions	plot_subclust_associations
Plot a heatmap of differential genes. Code is adapted from Conos package. https://github.com/kharchenkolab/conos/blob/master/R/plot.R	plotDEheatmap_conos
Prepare data for LR analysis and get soft thresholds to use for gene modules	prep_LR_interact
Project multicellular patterns to get scores on new data	project_new_data
Gets a conos object of the data, aligning datasets across a specified variable such as batch or donors. This can be run independently or through get_subtype_prop_associations().	reduce_dimensions
Reduce each cell type's expression matrix to just the significantly variable genes. Generally, this should be done through calling the form_tensor() wrapper function.	reduce_to_vargenes
Create a figure of all loadings plots arranged	render_multi_plots
Reshape loadings for a factor from linearized to matrix form	reshape_loadings
Run fgsea for one cell type of one factor	run_fgsea
Run gsea separately for all cell types of one specified factor and plot results	run_gsea_one_factor
Compute enriched gene sets among significant genes in a cell type for a factor using hypergeometric test	run_hypergeometric_gsea
Run jackstraw to get genes that are significantly associated with donor scores for factors extracted by Tucker decomposition	run_jackstraw
Test stability of a decomposition by subsampling or bootstrapping donors. Note that running this function will replace the decomposition in the project container with one resulting from the tucker parameters entered here.	run_stability_analysis
Run the Tucker decomposition and rotate the factors	run_tucker_ica
Get a list of tensor fibers to shuffle	sample_fibers
Scale font size. From simplifyEnrichment package. https://github.com/jokergoo/simplifyEnrichment/blob/master/R/ht_clusters.R	scale_fontsize
Scale variance across donors for each gene within each cell type. Generally, this should be done through calling the form_tensor() wrapper function.	scale_variance
Convert Seurat object to scMinimal object. Generally, this should be done through calling the make_new_container() wrapper function.	seurat_to_scMinimal
Shuffle elements within the selected fibers	shuffle_fibers
Create the tensor object by stacking each pseudobulk cell type matrix. Generally, this should be done through calling the form_tensor() wrapper function.	stack_tensor
Helper function from simplifyEnrichment package. https://github.com/jokergoo/simplifyEnrichment/blob/master/R/utils.R	stop_wrap
Subset an scMinimal object by specified genes, donors, cells, or cell types	subset_scMinimal
Data container for testing tensor formation steps	test_container
Helper function for running the decomposition. Use the run_tucker_ica() wrapper function instead.	tucker_ica_helper
Update any of the experiment-wide parameters	update_params
Compute significantly variable genes via anova. Generally, this should be done through calling the form_tensor() wrapper function.	vargenes_anova
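The wrapper functions in the table above suggest a short workflow: form the pseudobulk tensor with form_tensor(), then decompose it with run_tucker_ica(). The sketch below runs that sequence on the bundled test_container dataset. The argument names and values, and the `tucker_results` slot used to read the donor scores, are assumptions recalled from the package manual rather than taken from this page, so check them against ?form_tensor and ?run_tucker_ica. The block is skipped when scITD is not installed.

```r
# Sketch only: argument names/values and the tucker_results slot are
# assumptions; verify them with ?form_tensor and ?run_tucker_ica.
scores <- NULL

if (requireNamespace("scITD", quietly = TRUE)) {
  # Load the bundled test container listed under "Datasets"
  data("test_container", package = "scITD", envir = environment())

  # Build the pseudobulk tensor (normalization, variable gene selection,
  # variance scaling) through the form_tensor() wrapper
  container <- scITD::form_tensor(
    test_container,
    donor_min_cells = 0,
    norm_method = "trim",
    scale_factor = 10000,
    vargenes_method = "norm_var",
    vargenes_thresh = 500,
    scale_var = TRUE,
    var_scale_power = 0.5
  )

  # Run the Tucker decomposition and rotate the factors
  container <- scITD::run_tucker_ica(
    container,
    ranks = c(2, 4),
    tucker_type = "regular",
    rotation_type = "hybrid"
  )

  # Assumed location of the donor scores matrix (donors x factors)
  scores <- container$tucker_results[[1]]
}
```

For a real dataset, the container would first be created with initialize_params() and make_new_container(), which per the table needs either a Seurat object or both a count_data matrix and a meta_data matrix.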

