scButterfly.butterfly.Butterfly.data_preprocessing
- Butterfly.data_preprocessing(normalize_total=True, log1p=True, use_hvg=True, n_top_genes=3000, binary_data=True, filter_features=True, fpeaks=0.005, tfidf=True, normalize=True, save_data=False, file_path=None, logging_path=None)
Preprocess the RNA data and ATAC data in Butterfly.
- Parameters:
normalize_total (bool) – whether to apply normalization, default True.
log1p (bool) – whether to apply the log transformation, default True.
use_hvg (bool) – whether to use highly variable genes, default True.
n_top_genes (int) – the number of highly variable genes to keep; if highly variable genes are not used, set use_hvg = False and n_top_genes = None, default 3000.
binary_data (bool) – whether to binarize the ATAC data, default True.
filter_features (bool) – whether to apply peak filtering, default True.
fpeaks (float) – filter out the peaks expressed in fewer than fpeaks*n_cells cells; to skip peak filtering, set it to None, default 0.005.
tfidf (bool) – whether to apply the TF-IDF transform, default True.
normalize (bool) – whether to scale the data to [0, 1], default True.
save_data (bool) – whether to save the processed data, default False.
file_path (str) – the path for saving the processed data, only used if save_data is True, default None.
logging_path (str) – the path for the process logging output; to disable log saving, set it to None, default None.
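The effect of binary_data and fpeaks can be illustrated with a small standalone sketch. It is not the scButterfly implementation: the helper names binarize and filter_peaks are hypothetical, the matrix is a toy cells-by-peaks list of lists, and rounding the threshold fpeaks*n_cells up to a whole number of cells is an assumption.

```python
import math


def binarize(counts):
    """Return a 0/1 copy of a cells x peaks count matrix (hypothetical helper)."""
    return [[1 if value > 0 else 0 for value in row] for row in counts]


def filter_peaks(binary, fpeaks):
    """Keep peaks that are nonzero in at least fpeaks * n_cells cells.

    Rounding the threshold up with math.ceil is an assumption of this sketch.
    Returns the filtered matrix and the indices of the kept peaks.
    """
    n_cells = len(binary)
    n_peaks = len(binary[0])
    min_cells = math.ceil(fpeaks * n_cells)
    # Number of cells in which each peak is open.
    cells_per_peak = [sum(row[j] for row in binary) for j in range(n_peaks)]
    kept = [j for j, n in enumerate(cells_per_peak) if n >= min_cells]
    return [[row[j] for j in kept] for row in binary], kept


# Toy ATAC count matrix: 4 cells (rows) x 4 peaks (columns).
counts = [
    [0, 3, 1, 0],
    [2, 1, 0, 0],
    [2, 4, 0, 0],
    [0, 1, 0, 0],
]

binary = binarize(counts)
# fpeaks=0.5 is used only to make the toy example visible; the documented default is 0.005.
filtered, kept = filter_peaks(binary, fpeaks=0.5)
print(kept)
print(filtered)
```

With four cells and fpeaks=0.5, a peak must be open in at least two cells to survive, so the last two peaks are dropped. Under the same assumption, the default fpeaks=0.005 on a dataset of 10,000 cells would correspond to a threshold of roughly 50 cells.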