gencfu - Linux


Overview

gencfu is a tool used primarily in bioinformatics to generate clustered features (CFUs) from single-cell sequencing data. CFUs are groups of cells that share similar expression profiles and are believed to belong to the same cell type or subpopulation.


Syntax

gencfu [options] <input.tsv> <output.tsv>


Options

  • -c, --clusters: Number of clusters to generate. Default: 10
  • -m, --metric: Distance metric to use for clustering. Default: euclidean. Options include:
    • euclidean
    • cosine
    • pearson
  • -t, --threshold: Minimum distance threshold for merging clusters. Default: 0.5
  • -i, --iterations: Number of iterations to run the clustering algorithm. Default: 100
  • -s, --seed: Random seed for the clustering algorithm. If not specified, a random seed is used.
  • -o, --output-stats: Write additional statistics to the output file.
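
The three metric names correspond to standard distance definitions, which the sketch below spells out in plain Python. This is only an illustration of the textbook formulas; how gencfu computes them internally is not documented here, and the function names and sample vectors are made up for the example.

```python
import math

def euclidean_distance(a, b):
    """Straight-line distance between two expression vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity; insensitive to overall magnitude.

    Undefined (division by zero) if either vector is all zeros.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def pearson_distance(a, b):
    """1 - Pearson correlation, i.e. cosine distance after mean-centering.

    Undefined if either vector is constant.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    centered_a = [x - mean_a for x in a]
    centered_b = [y - mean_b for y in b]
    return cosine_distance(centered_a, centered_b)

# Hypothetical expression profiles: same shape, twice the magnitude.
cell_1 = [1.0, 2.0, 3.0]
cell_2 = [2.0, 4.0, 6.0]

print(euclidean_distance(cell_1, cell_2))  # about 3.74: magnitudes differ
print(cosine_distance(cell_1, cell_2))     # about 0: same direction
print(pearson_distance(cell_1, cell_2))    # about 0: perfectly correlated
```

The example shows why the choice matters: two cells with the same relative expression pattern but different total counts look far apart under euclidean and nearly identical under cosine or pearson.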


Examples

  • Generate 15 CFUs from a single-cell RNA-seq dataset, leaving all other settings at their defaults:
gencfu -c 15 input.tsv output.tsv
  • Generate 20 CFUs using the cosine distance metric with a threshold of 0.8:
gencfu -c 20 -m cosine -t 0.8 input.tsv output.tsv
  • Run the clustering algorithm for 200 iterations with a fixed seed of 1234 and write additional statistics to the output file:
gencfu -i 200 -s 1234 -o input.tsv output.tsv
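
When invocations like these are generated from a script, it can help to assemble the argument list in one place. The sketch below is a hypothetical helper (build_gencfu_command is not part of gencfu) that assumes the options and argument order listed on this page. It only builds the command and does not run anything.

```python
import shlex

def build_gencfu_command(input_path, output_path, clusters=None, metric=None,
                         threshold=None, iterations=None, seed=None,
                         output_stats=False):
    """Return a gencfu argument list; options left as None are omitted."""
    cmd = ["gencfu"]
    if clusters is not None:
        cmd += ["-c", str(clusters)]
    if metric is not None:
        cmd += ["-m", metric]
    if threshold is not None:
        cmd += ["-t", str(threshold)]
    if iterations is not None:
        cmd += ["-i", str(iterations)]
    if seed is not None:
        cmd += ["-s", str(seed)]
    if output_stats:
        cmd.append("-o")  # flag without a value, per the option list
    # Positional arguments: input first, then output.
    cmd += [input_path, output_path]
    return cmd

cmd = build_gencfu_command("input.tsv", "output.tsv",
                           clusters=20, metric="cosine", threshold=0.8)
print(shlex.join(cmd))  # gencfu -c 20 -m cosine -t 0.8 input.tsv output.tsv
```

The returned list can be passed to subprocess.run(cmd, check=True) to execute it. Passing a fixed seed makes it possible to record the exact command that produced a given result.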

Common Issues

  • If the clustering algorithm does not converge, try increasing the number of iterations (-i).
  • If the resulting CFUs are too broad or too specific, adjust the merge threshold (-t) or the number of clusters (-c) and rerun.
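
To build intuition for how a merge threshold can change cluster granularity, here is a toy sketch. It is not gencfu's algorithm, which this page does not describe. It assumes a simple rule in which clusters are merged whenever their centroids lie closer together than the threshold, and the function name and coordinates are invented for the example.

```python
import math

def merge_close_centroids(centroids, threshold):
    """Group centroids that are linked by distances below `threshold`.

    Returns a list of groups, each a list of indices into `centroids`.
    """
    groups = []
    for index, point in enumerate(centroids):
        near, far = [], []
        for group in groups:
            # A group is "near" if any of its members is within the threshold.
            is_near = any(math.dist(point, centroids[j]) < threshold
                          for j in group)
            (near if is_near else far).append(group)
        # The new point joins, and thereby fuses, every group it is near to.
        merged = [j for group in near for j in group] + [index]
        groups = far + [merged]
    return groups

# Hypothetical 2-D cluster centroids.
centroids = [(0.0, 0.0), (0.3, 0.0), (2.0, 0.0), (2.4, 0.0)]

for threshold in (0.35, 0.5, 2.0):
    groups = merge_close_centroids(centroids, threshold)
    print(threshold, len(groups), groups)
```

Under this assumed rule, a small threshold leaves three groups, a moderate one leaves two, and a large one collapses everything into a single group. If gencfu behaves similarly, raising -t would give broader CFUs and lowering it more specific ones, but that should be checked against actual output.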


Integration

gencfu can be integrated into analysis pipelines for single-cell RNA-seq data, typically as an early step that identifies cell types and subpopulations. The resulting groups can then be passed to downstream analyses, such as gene expression analysis or cell trajectory reconstruction.
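
As a rough illustration of a downstream step, the sketch below summarizes cluster sizes from a results table. The output layout is not specified on this page, so the example assumes a hypothetical tab-separated file with a header row and the columns cell_id and cluster. The sample data is invented, and the column names should be adjusted to match the real output.

```python
import csv
import io
from collections import Counter

def count_cells_per_cluster(tsv_file, cluster_column="cluster"):
    """Count rows per cluster label in a tab-separated file with a header."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return Counter(row[cluster_column] for row in reader)

# In-memory stand-in for an output file; the columns are assumed, not documented.
sample = io.StringIO(
    "cell_id\tcluster\n"
    "cell_a\t0\n"
    "cell_b\t1\n"
    "cell_c\t0\n"
)

sizes = count_cells_per_cluster(sample)
print(sizes.most_common())  # [('0', 2), ('1', 1)]
```

With a real file, open("output.tsv", newline="") can be used in place of the in-memory sample. Very small or very large clusters in this summary are often a first hint that the clustering parameters need another look.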

Related Commands

  • Seurat: An R package for single-cell RNA-seq analysis that includes clustering functionality.
  • Scanpy: A Python package for single-cell RNA-seq analysis that provides several clustering methods.