scButterfly.butterfly.Butterfly.data_preprocessing

Butterfly.data_preprocessing(normalize_total=True, log1p=True, use_hvg=True, n_top_genes=3000, binary_data=True, filter_features=True, fpeaks=0.005, tfidf=True, normalize=True, save_data=False, file_path=None, logging_path=None)

Preprocesses the RNA data and ATAC data in Butterfly.

Parameters:
  • normalize_total (bool) – whether to apply total-count normalization, default True.

  • log1p (bool) – whether to apply the log transformation, default True.

  • use_hvg (bool) – whether to select highly variable genes, default True.

  • n_top_genes (int) – the number of highly variable genes to select; to skip highly variable gene selection, set use_hvg = False and n_top_genes = None, default 3000.

  • binary_data (bool) – whether to binarize the ATAC data, default True.

  • filter_features (bool) – whether to filter peaks, default True.

  • fpeaks (float) – filter out peaks detected in fewer than fpeaks*n_cells cells; to skip peak filtering, set it to None, default 0.005.

  • tfidf (bool) – whether to apply the TF-IDF transformation, default True.

  • normalize (bool) – whether to scale the data to [0, 1], default True.

  • save_data (bool) – whether to save the processed data, default False.

  • file_path (str) – the path for saving the processed data, only used if save_data is True, default None.

  • logging_path (str) – the path for writing the processing log; to skip saving the log, set it to None, default None.
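
Example:

The flags above can be pictured with a small NumPy sketch. This is not scButterfly's implementation. The helper names preprocess_rna and preprocess_atac are hypothetical, and the inputs are assumed to be dense cells-by-features count matrices. Total-count normalization is assumed to target the median library size, highly variable gene selection is approximated by a plain variance ranking, and the TF-IDF formula is one common variant chosen for illustration.

```python
import numpy as np


def preprocess_rna(counts, normalize_total=True, log1p=True,
                   use_hvg=True, n_top_genes=3000):
    """Hypothetical helper: illustrative RNA steps on a cells x genes matrix."""
    x = np.asarray(counts, dtype=float)
    if normalize_total:
        # Assumption: rescale each cell to the median total count across cells.
        totals = x.sum(axis=1, keepdims=True)
        target = np.median(totals[totals > 0]) if np.any(totals > 0) else 1.0
        x = x / np.where(totals == 0, 1.0, totals) * target
    if log1p:
        # log(1 + x), which keeps zeros at zero.
        x = np.log1p(x)
    if use_hvg and n_top_genes is not None:
        # Stand-in for HVG selection: keep the genes with the largest variance.
        k = min(n_top_genes, x.shape[1])
        keep = np.sort(np.argsort(x.var(axis=0))[::-1][:k])
        x = x[:, keep]
    return x


def preprocess_atac(counts, binary_data=True, filter_features=True,
                    fpeaks=0.005, tfidf=True, normalize=True):
    """Hypothetical helper: illustrative ATAC steps on a cells x peaks matrix."""
    x = np.asarray(counts, dtype=float)
    if binary_data:
        # Any non-zero count becomes 1.
        x = (x > 0).astype(float)
    if filter_features and fpeaks is not None:
        # Keep peaks that are non-zero in at least fpeaks * n_cells cells.
        min_cells = fpeaks * x.shape[0]
        x = x[:, (x > 0).sum(axis=0) >= min_cells]
    if tfidf:
        # Assumed variant: TF = value / cell total, IDF = n_cells / peak total.
        cell_totals = x.sum(axis=1, keepdims=True)
        peak_totals = x.sum(axis=0, keepdims=True)
        tf = x / np.where(cell_totals == 0, 1.0, cell_totals)
        idf = x.shape[0] / np.where(peak_totals == 0, 1.0, peak_totals)
        x = tf * idf
    if normalize and x.size and x.max() > 0:
        # Divide by the global maximum so every value lies in [0, 1].
        x = x / x.max()
    return x


# Toy data standing in for real count matrices.
rng = np.random.default_rng(0)
rna = rng.poisson(1.0, size=(200, 50))
atac = rng.poisson(0.05, size=(200, 80))

rna_out = preprocess_rna(rna, n_top_genes=20)
atac_out = preprocess_atac(atac)
print(rna_out.shape)  # (200, 20)
```

With the default fpeaks = 0.005 and 200 cells, a peak must be non-zero in at least one cell to be kept. Passing fpeaks=None, or filter_features=False, skips that step in this sketch, mirroring the parameter descriptions above.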