upgrade package to account for more than 2 omics

  1. create a branch dev_morethan2
  2. replace:
  • omic1, omic2 by omics (a list of data frames: omics[0] will correspond to omic1)
  • eta1, eta2 by etas (a vector of eta values)
  • H1_0, H2_0 (same as for omics)
  • Beta1_0, Beta2_0 (same as for omics)
  • H1, H2 (same as for omics)
  • Beta1, Beta2 (same as for omics)
  • margH1, margH2 by margH (they were already lists of data frames, so they will become lists of lists [i][j]: i is the omic and j the iteration)
  • gradH1, gradH2 (same as for margH)
  • distort1, distort2 (same as for margH)
  • sparsity1, sparsity2 (same as for margH)
  • pred1, pred2 (same as for margH)
  • R2adj_1, R2adj_2 by R2adj (a list of vectors)
  • BIC_1, BIC_2 (same as for R2adj)
  • AIC_1, AIC_2 (same as for R2adj)
  • F_pval_1, F_pval_2 (same as for R2adj)
  • bet1_1, bet1_2, bet2_1, bet2_2 by bet (a list of lists [i][j]: i is the omic and j the iteration)
  • proj1, proj2 by a list of data frames
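The container changes listed above can be sketched as follows. This is only an illustration under stated assumptions: NumPy arrays stand in for the data frames, the shapes, the number of iterations and the placeholder values are invented for the example, and the length of each R2adj vector is assumed to be the number of iterations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two original inputs (samples x features).
omic1 = rng.random((6, 4))
omic2 = rng.random((6, 3))
eta1, eta2 = 0.1, 0.2

# New layout: one entry per omic, so omics[0] plays the role of omic1.
omics = [omic1, omic2]
etas = np.array([eta1, eta2])

n_omics = len(omics)
n_iter = 3  # illustrative number of iterations

# Per-iteration quantities become lists of lists indexed [i][j]:
# i is the omic, j is the iteration.
margH = [[] for _ in range(n_omics)]
for j in range(n_iter):
    for i, omic in enumerate(omics):
        # Placeholder array; the real code would append the marginal of H.
        margH[i].append(np.zeros((2, omic.shape[1])))

# Per-omic summaries become a list of vectors, one vector per omic
# (vector length assumed here to be the number of iterations).
R2adj = [np.zeros(n_iter) for _ in range(n_omics)]
```

Building the outer list with a comprehension (rather than `[[]] * n_omics`) matters: the multiplied form would make every omic share the same inner list.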
  3. this impacts:
  • _update_W (omics, H): for B, loop over H and sum; for C, loop over H and omics and sum
  • _computeF (omics, H, Beta): for distort, sparse and pred, loop, save in a list and sum (same loop, with an option detail to keep the individual components if needed)
  • _analytic_solver (omics, H, Beta, Hloss, Hgrad): update H with a loop, _computeMargH with a loop as well, updateBeta with a loop
  • __init__ (omics, etas): nothing to change apart from the arguments
  • __str__: generalize the description and adapt the arguments
  • _preprocess_data (omics): add an outer loop over omics
  • _initialize_w_h_beta: add an outer loop over H0 and Beta0
  • fit (all except for proj): unchanged for the initialization of H and Beta; loop for the deep copies; loop for the initialization of the lists; unchanged for loss_init, the solver and computeF; loops for the corresponding appends of loss and params; loop for sm.OLS and the corresponding appends; loop for the optimal values of H and Beta; loop for the gradient computation (np.hstack could probably be handled within the loop); loop for the error terms, LDA perf and etas
  • predict: no unstack; loop for the proj safe_sparse_dot
  • barplot_error: unclear to me (but needs an update)
  • evolplot: might work from scratch (unsure)
  • heatmap: add an argument omic_number
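The "loop, save in a list and sum" pattern described for _computeF could look roughly like the sketch below. The function name `compute_loss`, its signature and the two loss terms (a squared Frobenius distortion and an eta-weighted L1 sparsity penalty) are assumptions made for the illustration, not the package's actual objective; the prediction term is left out.

```python
import numpy as np

def compute_loss(omics, W, H, etas, detail=False):
    """Sum per-omic loss terms; optionally return the individual components.

    Hypothetical helper: the distortion and sparsity terms below are
    illustrative forms, not taken from nmfprofiler.
    """
    distort, sparsity = [], []
    for X, H_i, eta in zip(omics, H, etas):
        # Reconstruction error of omic i from the shared W and its own H_i.
        distort.append(0.5 * np.linalg.norm(X - W @ H_i, "fro") ** 2)
        # Sparsity penalty on H_i, weighted by the omic-specific eta.
        sparsity.append(eta * np.abs(H_i).sum())
    total = sum(distort) + sum(sparsity)
    if detail:
        return total, distort, sparsity
    return total
```

Because the loop runs over `zip(omics, H, etas)`, the same code handles two, three or more omics without further changes.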
  4. test on a single example to check that you recover exactly the results of the previous implementation
  5. update documentation (in nmfprofiler.py), toy_example, test_nmfprofiler.py and README
  6. publish
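The check against the previous implementation could be sketched as below. The helper `assert_matches_previous` and the dictionary layout are hypothetical: it assumes the old results are stored under suffixed names (H1, H2, ...) and the new ones as lists indexed by omic.

```python
import numpy as np

def assert_matches_previous(old, new, names=("H", "Beta")):
    """Compare new list-based results with old suffixed results.

    Hypothetical layout: old["H1"], old["H2"] versus new["H"][0], new["H"][1].
    """
    for name in names:
        for i, new_value in enumerate(new[name]):
            old_value = old[f"{name}{i + 1}"]
            # Exact comparison, as the results should be recovered exactly.
            np.testing.assert_array_equal(new_value, old_value)
```

Exact equality is only achievable if both runs use the same random seed and the same order of floating-point operations; if the refactoring reorders sums, `np.testing.assert_allclose` with a tight tolerance may be the more realistic check.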
Edited by MERCADIE AURELIE