Package: scITD 1.0.4

scITD: Single-Cell Interpretable Tensor Decomposition

Single-cell Interpretable Tensor Decomposition (scITD) employs the Tucker tensor decomposition to extract multi-cell-type gene expression patterns that vary across donors/individuals. The tool is intended for single-cell RNA-sequencing datasets that include many source donors. Potential applications include the study of inter-individual variation at the population level, patient sub-grouping/stratification, and the analysis of sample-level batch effects. Each "multicellular process" that is extracted consists of (A) a multi-cell-type gene loadings matrix and (B) a corresponding donor scores vector indicating the level at which that loadings matrix is expressed in each donor. Additional methods help select an appropriate number of factors and evaluate the stability of the decomposition. Tools for downstream analysis are also provided, including integration of gene set enrichment analysis and ligand-receptor analysis. References: Tucker, L.R. (1966) <doi:10.1007/BF02289464>. Unkel, S., Hannachi, A., Trendafilov, N. T., & Jolliffe, I. T. (2011) <doi:10.1007/s13253-011-0055-9>. Zhou, G., & Cichocki, A. (2012) <doi:10.2478/v10175-012-0051-4>.
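
To make the "loadings matrix plus donor scores vector" structure concrete, here is a minimal base-R sketch of a truncated higher-order SVD, one common way to compute a Tucker decomposition. It is an illustration under stated assumptions, not scITD's implementation: the toy tensor is random, the ranks are arbitrary, the helper names (`unfold`, `mode_product`, `loadings_for`) are hypothetical, and the factor rotation that scITD applies after the decomposition is omitted.

```r
# Toy tensor: 6 donors x 5 genes x 3 cell types (random values, illustration only)
set.seed(1)
n_donors <- 6; n_genes <- 5; n_ctypes <- 3
tnsr <- array(rnorm(n_donors * n_genes * n_ctypes),
              dim = c(n_donors, n_genes, n_ctypes))

# Unfold a 3-way array along one mode: rows index that mode,
# columns index every combination of the other two modes.
unfold <- function(x, mode) {
  d <- dim(x)
  other <- setdiff(seq_along(d), mode)
  matrix(aperm(x, c(mode, other)), nrow = d[mode])
}

# Mode-n product: multiply matrix m into the given mode of array x.
mode_product <- function(x, m, mode) {
  d <- dim(x)
  other <- setdiff(seq_along(d), mode)
  res <- m %*% unfold(x, mode)
  folded <- array(res, dim = c(nrow(m), d[other]))
  aperm(folded, order(c(mode, other)))  # restore the original mode order
}

# Truncated HOSVD: leading left singular vectors of each unfolding.
ranks <- c(2, 3, 2)  # assumed ranks: donor, gene, and cell type factors
factors <- lapply(1:3, function(k) {
  svd(unfold(tnsr, k))$u[, seq_len(ranks[k]), drop = FALSE]
})

# Core tensor: project the data onto the factor matrices.
core <- tnsr
for (k in 1:3) core <- mode_product(core, t(factors[[k]]), k)

# (B) donor scores: one column per donor factor.
donor_scores <- factors[[1]]

# (A) gene x cell type loadings matrix belonging to donor factor r.
loadings_for <- function(r) {
  slice <- core[r, , , drop = FALSE]
  full <- mode_product(mode_product(slice, factors[[2]], 2), factors[[3]], 3)
  matrix(full, nrow = n_genes, ncol = n_ctypes)
}

# Low-rank reconstruction of the tensor from core and factors.
tnsr_hat <- core
for (k in 1:3) tnsr_hat <- mode_product(tnsr_hat, factors[[k]], k)
```

In this sketch, the reconstruction equals the sum over donor factors of the outer product of a donor scores column with its loadings matrix, which is the sense in which each factor pairs a scores vector with a multi-cell-type expression pattern.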

Authors: Jonathan Mitchel [cre, aut], Evan Biederstedt [aut], Peter Kharchenko [aut]

scITD_1.0.4.tar.gz
scITD_1.0.4.zip (r-4.5), scITD_1.0.4.zip (r-4.4), scITD_1.0.4.zip (r-4.3)
scITD_1.0.4.tgz (r-4.4-x86_64), scITD_1.0.4.tgz (r-4.4-arm64), scITD_1.0.4.tgz (r-4.3-x86_64), scITD_1.0.4.tgz (r-4.3-arm64)
scITD_1.0.4.tar.gz (r-4.5-noble), scITD_1.0.4.tar.gz (r-4.4-noble)
scITD_1.0.4.tgz (r-4.4-emscripten), scITD_1.0.4.tgz (r-4.3-emscripten)
scITD.pdf | scITD.html
scITD/json (API)

# Install 'scITD' in R:
install.packages('scITD', repos = c('https://j-mitchel.r-universe.dev', 'https://cloud.r-project.org'))

Uses libs:
  • c++: GNU Standard C++ Library v3

This package does not link to any GitHub/GitLab/R-Forge repository. No issue tracker or development information is available.

1.93 score, 17 scripts, 192 downloads, 68 exports, 157 dependencies

Last updated 1 year ago from: 0f6459f706. Checks: OK: 8, WARNING: 1. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 13 2024
R-4.5-win-x86_64 | OK | Nov 13 2024
R-4.5-linux-x86_64 | WARNING | Nov 13 2024
R-4.4-win-x86_64 | OK | Nov 13 2024
R-4.4-mac-x86_64 | OK | Nov 13 2024
R-4.4-mac-aarch64 | OK | Nov 13 2024
R-4.3-win-x86_64 | OK | Nov 13 2024
R-4.3-mac-x86_64 | OK | Nov 13 2024
R-4.3-mac-aarch64 | OK | Nov 13 2024

Exports: apply_combat, clean_data, colMeanVars, compare_decompositions, compute_associations, compute_donor_props, compute_LR_interact, convert_gn, determine_ranks_tucker, form_tensor, get_all_lds_factor_plots, get_ctype_prop_associations, get_ctype_subc_prop_associations, get_ctype_vargenes, get_donor_meta, get_gene_modules, get_intersecting_pathways, get_lm_pvals, get_meta_associations, get_min_sig_genes, get_normalized_variance, get_num_batch_ranks, get_one_factor, get_one_factor_gene_pvals, get_pseudobulk, get_subclust_enr_dotplot, get_subclust_enr_fig, get_subclust_umap, get_subclusters, get_subtype_prop_associations, get_sums, identify_sex_metadata, initialize_params, instantiate_scMinimal, make_new_container, nmf_unfolded, norm_var_helper, normalize_counts, normalize_pseudobulk, parse_data_by_ctypes, pca_unfolded, plot_donor_matrix, plot_donor_sig_genes, plot_dscore_enr, plot_gsea_hmap_w_similarity, plot_gsea_sub, plot_loadings_annot, plot_mod_and_lig, plot_multi_module_enr, plot_scores_by_meta, plot_select_sets, plot_subclust_associations, prep_LR_interact, project_new_data, reduce_dimensions, reduce_to_vargenes, render_multi_plots, run_gsea_one_factor, run_jackstraw, run_stability_analysis, run_tucker_ica, scale_variance, seurat_to_scMinimal, stack_tensor, subset_scMinimal, tucker_ica_helper, update_params, vargenes_anova

Dependencies: abind, annotate, AnnotationDbi, askpass, babelgene, backports, BH, Biobase, BiocGenerics, BiocManager, BiocParallel, Biostrings, bit, bit64, blob, boot, broom, cachem, car, carData, circlize, cli, clue, cluster, codetools, colorspace, ComplexHeatmap, corrplot, cowplot, cpp11, crayon, curl, data.table, DBI, Deriv, digest, doBy, doParallel, dplyr, dqrng, edgeR, fansi, farver, fastmap, fastmatch, fgsea, FNN, foreach, formatR, Formula, futile.logger, futile.options, genefilter, generics, GenomeInfoDb, GenomeInfoDbData, GetoptLong, ggplot2, ggpubr, ggrepel, ggsci, ggsignif, GlobalOptions, glue, gridBase, gridExtra, gtable, httr, ica, igraph, IRanges, irlba, isoband, iterators, jsonlite, KEGGREST, labeling, lambda.r, lattice, lifecycle, limma, lme4, locfit, magrittr, MASS, Matrix, MatrixGenerics, MatrixModels, matrixStats, memoise, mgcv, microbenchmark, mime, minqa, modelr, msigdbr, munsell, nlme, nloptr, NMF, nnet, numDeriv, openssl, pbkrtest, pbmcapply, pillar, pkgconfig, plogr, plyr, png, polynom, pROC, purrr, quantreg, R6, RColorBrewer, Rcpp, RcppAnnoy, RcppArmadillo, RcppEigen, RcppProgress, registry, reshape2, rjson, rlang, Rmisc, rngtools, RSpectra, RSQLite, rstatix, rTensor, S4Vectors, scales, sccore, shape, sitmo, snow, SparseM, statmod, stringi, stringr, survival, sva, sys, tibble, tidyr, tidyselect, UCSC.utils, utf8, uwot, vctrs, viridisLite, withr, XML, xtable, XVector, zlibbioc

Readme and manuals

Help Manual

Help page | Topics
Apply ComBat batch correction to pseudobulk matrices. Generally, this should be done through calling the form_tensor() wrapper function. | apply_combat
Calculate F-statistics for the association between donor scores for each factor and donor values of shuffled gene_ctype fibers | calculate_fiber_fstats
Helper function to check whether receptor is present in target cell type | check_rec_pres
Clean data to remove genes only expressed in a few cells and donors with very few cells. Generally, this should be done through calling the form_tensor() wrapper function. | clean_data
Calculates column mean and variance. Adapted from pagoda2. https://github.com/kharchenkolab/pagoda2/blob/main/src/misc2.cpp | colMeanVars
Plot a pairwise comparison of factors from two separate decompositions | compare_decompositions
Compute associations between donor proportions and factor scores | compute_associations
Get donor proportions of each cell type or subtype | compute_donor_props
Compute and plot the LR interactions for one factor | compute_LR_interact
Convert gene identifiers to gene symbols | convert_gn
count_word. From an older version of the simplifyEnrichment package. | count_word
Run rank determination by SVD on the tensor unfolded along each mode | determine_ranks_tucker
Form the pseudobulk tensor as preparation for running the tensor decomposition. | form_tensor
Generate loadings heatmaps for all factors | get_all_lds_factor_plots
Get gene callout annotations for a loadings heatmap | get_callouts_annot
Get explained variance of the reconstructed data using one cell type from one factor | get_ctype_exp_var
Compute and plot associations between donor factor scores and donor proportions of major cell types | get_ctype_prop_associations
Compute and plot associations between donor factor scores and donor proportions of cell subtypes | get_ctype_subc_prop_associations
Partition main gene by cell matrix into per cell type matrices with significantly variable genes only. Generally, this should be done through calling the form_tensor() wrapper function. | get_ctype_vargenes
Get metadata matrix of dimensions donors by variables (not per cell) | get_donor_meta
Get the explained variance of the reconstructed data using one factor | get_factor_exp_var
Calculate adjusted p-values for gene_celltype fiber-donor score associations | get_fstats_pvals
Compute WGCNA gene modules for each cell type | get_gene_modules
Get logical vectors indicating which genes are in which pathways | get_gene_set_vectors
Compute subtype proportion-factor association p-values for all subclusters of a given major cell type | get_indv_subtype_associations
Extract the intersection of gene sets which are enriched in two or more cell types for a factor | get_intersecting_pathways
Get the leading edge genes from GSEA results | get_leading_edge_genes
Compute gene-factor associations using univariate linear models | get_lm_pvals
Computes the max correlation between each factor of the decomposition done using the whole dataset and each factor computed using the subsampled/bootstrapped dataset | get_max_correlations
Get metadata associations with factor donor scores | get_meta_associations
Evaluate the minimum number of significant genes in any factor for a given number of factors extracted by the decomposition | get_min_sig_genes
Identify gene sets that are enriched within specified gene co-regulatory modules. Uses a hypergeometric test for over-representation. Used in plot_multi_module_enr(). | get_module_enr
Get normalized variance for each gene, taking into account mean-variance trend | get_normalized_variance
Plot factor-batch associations for increasing number of donor factors | get_num_batch_ranks
Get the donor scores and loadings matrix for a single factor | get_one_factor
Get significant genes for a factor | get_one_factor_gene_pvals
Collapse data from cell-level to donor-level via summing counts. Generally, this should be done through calling the form_tensor() wrapper function. | get_pseudobulk
Get F-statistics for the real (non-shuffled) gene_ctype fibers | get_real_fstats
Calculate reconstruction errors using SVD approach | get_reconstruct_errors_svd
Get vectors indicating which genes are significant in which cell types for a factor of interest | get_significance_vectors
Get list of cell subtype differential expression heatmaps | get_subclust_de_hmaps
Get scatter plot for association of a cell subtype proportion with scores for a factor | get_subclust_enr_dotplot
Get a figure showing cell subtype proportion associations with each factor. Combines this plot with subtype UMAPs and differential expression heatmaps. Note that this function runs better if n.cores in the conos object in container$embedding is set to a relatively small value (< 10). | get_subclust_enr_fig
Get heatmap of subtype proportion associations for each celltype/subtype and each factor | get_subclust_enr_hmap
Get a figure to display subclusterings at multiple resolutions | get_subclust_umap
Perform Leiden subclustering to get cell subtypes | get_subclusters
Compute and plot associations between factor scores and cell subtype composition for various clustering resolution parameters | get_subtype_prop_associations
Calculates factor-stratified sums for each column. Adapted from pagoda2. https://github.com/kharchenkolab/pagoda2/blob/main/src/misc2.cpp | get_sums
Visualize the similarity matrix and the clustering. Adapted from simplifyEnrichment package. https://github.com/jokergoo/simplifyEnrichment/blob/master/R/ht_clusters.R | ht_clusters
Extract metadata for sex information if not provided already | identify_sex_metadata
Initialize parameters to be used throughout scITD in various functions | initialize_params
Create an scMinimal object. Generally, this should be done through calling the make_new_container() wrapper function. | instantiate_scMinimal
Check if a character is a GO ID | is_GO_id
Create a container to store all data and results for the project. You must provide a params list as generated by initialize_params(). You also need to provide either a Seurat object or both a count_data matrix and a meta_data matrix. | make_new_container
Merge small subclusters into larger ones | merge_small_clusts
Computes non-negative matrix factorization on the tensor unfolded along the donor dimension | nmf_unfolded
Calculates the normalized variance for each gene. This is adapted from pagoda2. https://github.com/kharchenkolab/pagoda2/blob/main/R/Pagoda2.R Generally, this should be done through calling the form_tensor() wrapper function. | norm_var_helper
Helper function to normalize and log-transform count data | normalize_counts
Normalize the pseudobulked counts matrices. Generally, this should be done through calling the form_tensor() wrapper function. | normalize_pseudobulk
Parse main counts matrix into per-celltype-matrices. Generally, this should be done through calling the form_tensor() wrapper function. | parse_data_by_ctypes
Computes singular-value decomposition on the tensor unfolded along the donor dimension | pca_unfolded
Plot matrix of donor scores extracted from Tucker decomposition | plot_donor_matrix
Plot donor celltype/subtype proportions against each factor | plot_donor_props
Generate a gene by donor heatmap showing scaled expression of top loading genes for a given factor | plot_donor_sig_genes
Compute enrichment of donor metadata categorical variables at high/low factor scores | plot_dscore_enr
Plot enriched gene sets from all cell types in a heatmap | plot_gsea_hmap
Plot already computed enriched gene sets to show semantic similarity between sets | plot_gsea_hmap_w_similarity
Look at enriched gene sets from a cluster of semantically similar gene sets. Uses the results from a previous run of plot_gsea_hmap_w_similarity() | plot_gsea_sub
Plot the gene by celltype loadings for a factor | plot_loadings_annot
Plot trio of associations between ligand expression, module eigengenes, and factor scores | plot_mod_and_lig
Generate gene set x ct_module heatmap showing co-expression module gene set enrichment results | plot_multi_module_enr
Plot reconstruction errors as bar plot for SVD method | plot_rec_errors_bar_svd
Plot reconstruction errors as line plot for SVD method | plot_rec_errors_line_svd
Plot dotplots for each factor to compare donor scores between metadata groups | plot_scores_by_meta
Plot enrichment results for hand-picked gene sets | plot_select_sets
Generate a plot for either the donor scores or loadings stability test | plot_stability_results
Plot association significances for varying clustering resolutions | plot_subclust_associations
Plot a heatmap of differential genes. Code is adapted from Conos package. https://github.com/kharchenkolab/conos/blob/master/R/plot.R | plotDEheatmap_conos
Prepare data for LR analysis and get soft thresholds to use for gene modules | prep_LR_interact
Project multicellular patterns to get scores on new data | project_new_data
Gets a conos object of the data, aligning datasets across a specified variable such as batch or donors. This can be run independently or through get_subtype_prop_associations(). | reduce_dimensions
Reduce each cell type's expression matrix to just the significantly variable genes. Generally, this should be done through calling the form_tensor() wrapper function. | reduce_to_vargenes
Create a figure of all loadings plots arranged | render_multi_plots
Reshape loadings for a factor from linearized to matrix form | reshape_loadings
Run fgsea for one cell type of one factor | run_fgsea
Run GSEA separately for all cell types of one specified factor and plot results | run_gsea_one_factor
Compute enriched gene sets among significant genes in a cell type for a factor using a hypergeometric test | run_hypergeometric_gsea
Run jackstraw to get genes that are significantly associated with donor scores for factors extracted by Tucker decomposition | run_jackstraw
Test stability of a decomposition by subsampling or bootstrapping donors. Note that running this function will replace the decomposition in the project container with one resulting from the Tucker parameters entered here. | run_stability_analysis
Run the Tucker decomposition and rotate the factors | run_tucker_ica
Get a list of tensor fibers to shuffle | sample_fibers
Scale font size. From simplifyEnrichment package. https://github.com/jokergoo/simplifyEnrichment/blob/master/R/ht_clusters.R | scale_fontsize
Scale variance across donors for each gene within each cell type. Generally, this should be done through calling the form_tensor() wrapper function. | scale_variance
Convert Seurat object to scMinimal object. Generally, this should be done through calling the make_new_container() wrapper function. | seurat_to_scMinimal
Shuffle elements within the selected fibers | shuffle_fibers
Create the tensor object by stacking each pseudobulk cell type matrix. Generally, this should be done through calling the form_tensor() wrapper function. | stack_tensor
Helper function from simplifyEnrichment package. https://github.com/jokergoo/simplifyEnrichment/blob/master/R/utils.R | stop_wrap
Subset an scMinimal object by specified genes, donors, cells, or cell types | subset_scMinimal
Data container for testing tensor formation steps | test_container
Helper function for running the decomposition. Use the run_tucker_ica() wrapper function instead. | tucker_ica_helper
Update any of the experiment-wide parameters | update_params
Compute significantly variable genes via ANOVA. Generally, this should be done through calling the form_tensor() wrapper function. | vargenes_anova
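
Several help topics above describe steps that the form_tensor() wrapper performs: splitting the counts matrix by cell type, summing cell-level counts to donor level, and stacking the per-cell-type matrices into a tensor. The following base-R sketch illustrates only that general idea on made-up data. It does not call scITD and is not its implementation; the object names (`counts`, `meta`, `pseudobulk`) and the toy donors and cell types are assumptions, and the cleaning, normalization, variable-gene selection, and variance scaling steps are left out.

```r
# Toy cell-level data: 4 genes x 12 cells (random counts, illustration only)
set.seed(42)
genes <- paste0("gene", 1:4)
n_cells <- 12
counts <- matrix(rpois(length(genes) * n_cells, lambda = 3),
                 nrow = length(genes),
                 dimnames = list(genes, paste0("cell", seq_len(n_cells))))

# Per-cell metadata: each of 3 donors contributes 2 "T" and 2 "B" cells
meta <- data.frame(
  donor = rep(c("d1", "d2", "d3"), each = 4),
  ctype = rep(c("T", "B"), times = 6),
  row.names = colnames(counts)
)

ctypes <- sort(unique(meta$ctype))
donors <- sort(unique(meta$donor))

# For each cell type, sum counts over the cells of each donor.
pseudobulk <- lapply(ctypes, function(ct) {
  keep <- meta$ctype == ct
  # rowsum() sums rows by group, so transpose to put cells in rows
  pb <- rowsum(t(counts[, keep, drop = FALSE]), group = meta$donor[keep])
  pb[donors, , drop = FALSE]  # donors x genes, in a fixed donor order
})
names(pseudobulk) <- ctypes

# Stack the donors x genes matrices into a donors x genes x cell types array.
tnsr <- array(unlist(pseudobulk),
              dim = c(length(donors), length(genes), length(ctypes)),
              dimnames = list(donors, genes, ctypes))
```

Each entry `tnsr[donor, gene, ctype]` is the summed count of that gene over the donor's cells of that type, so the tensor total equals the total of the original counts matrix. Keeping a fixed donor order before stacking matters because every cell type slice must index the same donors in the same rows.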