11. beta_UMAP.py

11.1. Description

This program performs UMAP (Uniform Manifold Approximation and Projection) non-linear dimension reduction.

Example of input data file

ID     Sample_01       Sample_02       Sample_03       Sample_04
cg_001 0.831035        0.878022        0.794427        0.880911
cg_002 0.249544        0.209949        0.234294        0.236680
cg_003 0.845065        0.843957        0.840184        0.824286

Example of input group file



  • Rows with missing values will be removed
  • Beta values will be standardized into z scores
  • Only the first two components will be visualized
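
The first two notes can be illustrated with a minimal, dependency-free sketch. It is not the program's own code. It assumes that "missing" means an empty or NA/NaN field, that each CpG is standardized across samples, and that the population standard deviation is used. The function names and the cg_004 row are hypothetical and exist only to show a row being dropped.

```python
import csv
import io
import statistics

# Tokens treated as missing values (an assumption; the manual does not list them).
MISSING = {"", "NA", "NaN", "nan"}

# The example table from above, plus one hypothetical row (cg_004) with a gap.
EXAMPLE = (
    "ID\tSample_01\tSample_02\tSample_03\tSample_04\n"
    "cg_001\t0.831035\t0.878022\t0.794427\t0.880911\n"
    "cg_002\t0.249544\t0.209949\t0.234294\t0.236680\n"
    "cg_003\t0.845065\t0.843957\t0.840184\t0.824286\n"
    "cg_004\t0.512000\tNA\t0.498000\t0.530000\n"
)


def load_beta_table(handle):
    """Read a tab-separated table: 1st row = sample IDs, 1st column = CpG IDs."""
    reader = csv.reader(handle, delimiter="\t")
    samples = next(reader)[1:]
    rows = {fields[0]: fields[1:] for fields in reader if fields}
    return samples, rows


def drop_missing(rows, n_samples):
    """Keep only CpGs that have a value for every sample."""
    kept = {}
    for cpg, values in rows.items():
        if len(values) != n_samples or any(v.strip() in MISSING for v in values):
            continue
        kept[cpg] = [float(v) for v in values]
    return kept


def zscore(values):
    """Return (x - mean) / sd using the population sd; constant rows map to 0."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


samples, rows = load_beta_table(io.StringIO(EXAMPLE))
clean = drop_missing(rows, len(samples))
z = {cpg: zscore(values) for cpg, values in clean.items()}
```

After this step each remaining CpG has mean 0 and standard deviation 1 across samples, and cg_004 is gone because one of its values is missing.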

11.2. Options

--version show program’s version number and exit
-h, --help show this help message and exit
-i INPUT_FILE, --input_file=INPUT_FILE
 Tab-separated data frame file containing beta values with the 1st row containing sample IDs and the 1st column containing CpG IDs.
 Comma-separated group file defining the biological groups of each sample. Different groups will be colored differently in the 2-dimensional plot. Supports a maximum of 20 groups.
 Number of components. default=2
 This parameter controls the size of the local neighborhood UMAP will look at when attempting to learn the manifold structure of the data. Low values of '--nneighbors' will force UMAP to concentrate on local structure, while large values will push UMAP to look at larger neighborhoods of each point when estimating the manifold structure of the data. Choose a value from [2, 200]. default=15
 This parameter controls how tightly UMAP is allowed to pack points together. Choose a value from [0, 1). default=0.2
-l, --label If True, sample IDs will be added underneath the data points. default=False
 Plotting character: 1 = ‘dot’, 2 = ‘circle’. default=1
 Opacity of dots. default=0.5
 Location of legend panel: 1 = ‘topright’, 2 = ‘bottomright’, 3 = ‘bottomleft’, 4 = ‘topleft’. default=1
-o OUT_FILE, --output=OUT_FILE
 The prefix of the output file.
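
The documented ranges can be checked before any embedding is computed. The sketch below is an illustration rather than the program's implementation. The function names are hypothetical, the two-group mapping is made-up example data, and the embed() helper assumes the third-party umap-learn package is installed, whose UMAP class accepts n_components, n_neighbors and min_dist.

```python
def umap_settings(n_components=2, n_neighbors=15, min_dist=0.2):
    """Validate the documented ranges and return keyword arguments for UMAP."""
    if n_components < 1:
        raise ValueError("number of components must be at least 1")
    if not 2 <= n_neighbors <= 200:
        raise ValueError("number of neighbors must be in [2, 200]")
    if not 0 <= min_dist < 1:
        raise ValueError("minimum distance must be in [0, 1)")
    return {
        "n_components": n_components,
        "n_neighbors": n_neighbors,
        "min_dist": min_dist,
    }


def check_groups(sample_to_group, max_groups=20):
    """Return the sorted group names, enforcing the 20-group limit."""
    groups = sorted(set(sample_to_group.values()))
    if len(groups) > max_groups:
        raise ValueError(f"at most {max_groups} groups are supported")
    return groups


def embed(matrix, **settings):
    """Embed a samples-by-CpGs matrix; assumes umap-learn is available."""
    import umap  # third-party package "umap-learn", imported lazily

    return umap.UMAP(**umap_settings(**settings)).fit_transform(matrix)


defaults = umap_settings()
groups = check_groups({"Sample_01": "control", "Sample_02": "case"})
```

A small neighborhood size combined with a small minimum distance emphasizes tight local clusters, while larger values of both give a smoother, more global layout.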

11.3. Command

$ beta_UMAP.py -i cirrHCV_vs_normal.data.tsv -g cirrHCV_vs_normal.grp.csv -o cirrHCV_vs_normal -l

11.4. Output files

  • cirrHCV_vs_normal.UMAP.r
  • cirrHCV_vs_normal.UMAP.tsv
  • cirrHCV_vs_normal.UMAP.pdf