Probabilistic Integrative Networks

A new integrative network model that infers multifaceted gene regulation in terms of both mRNAs and non-coding RNAs, with a particular focus on microRNA-gene binding, and infers the functional implications for cancer development and progression.

Download

You can find the Python and R source code files here

You can find all the data and intermediate result files here

In this manual, we explain what each script does, and how to run each script.

Disclaimer: Newer versions of third-party packages can cause problems. At the time of this release, we used the latest version of each package, and none of them broke the scripts.
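Because package versions matter, it can help to record what is installed before running the scripts. The following is a minimal sketch using only the Python standard library. The package names in PACKAGES are assumptions chosen for illustration and are not taken from the scripts themselves; replace them with whatever the scripts actually import.

```python
from importlib import metadata

# Hypothetical package list (an assumption, not read from the scripts).
PACKAGES = ["numpy", "networkx", "torch"]


def installed_versions(packages):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Keeping this output next to your results makes it easier to tell whether a later failure comes from a package upgrade.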

  • add_mirna_bindings.py
  • It adds miRNA--gene bindings of special interest to the network models. It expects two parameters. The first parameter is the path to a tab-separated edge list of miRNA--gene interactions. The second parameter is the root path where the network models are stored.

  • classification.py
  • It contains more functionality than we used in our analysis. It calculates Gromov-Wasserstein distances from the data and makes predictions using this information. It takes five parameters. The first parameter is the genes of interest with phenotype information. The second parameter is a ranks file in JSON format, which can be calculated with the "do_prob_similarity" function. The third parameter is the cut-off value, which is also serialized in JSON files. The fourth parameter is the network model in "graphml" format. The last parameter is an external evidence matrix, which can contain miRNA--gene interactions of particular interest.

  • cvae.py
  • Here we perform data augmentation using a CVAE model. If you want to use a pretrained model, we provide our serialized model in the supplementary data files. It can run on multiple GPUs in parallel if the num_workers parameter is increased. Its only parameter is an expression matrix with group information in the last two columns. We used the final column as the class information.

  • degreeplot.py
  • It contains utility functions to plot node degrees in the network. It takes two parameters. The first parameter is the network in "graphml" format. The second parameter is a csv file in which the node type information is in the "group" column.

  • graphical_model.py
  • Contains utility functions for calculating independence relationships and for model serialization.

  • mirna_degree_plot.py
  • Utility script to plot degree distribution of specific miRNAs.

  • plot_cluster_bic.py
  • Utility script to plot clustering profile and assessment.

  • startbase.py
  • Utility script to download information from starBase database.

  • utils.py
  • Contains utility functions for graph operations.

  • venn.py
  • It contains functions to plot Venn diagrams.

    Original resource: https://github.com/jiaweiM/PlotNotes/blob/431dd51c682741610d5cc8679f1f6fc7f16e9ce6/omicsplot/venn.py

  • func_analysis.R
  • It contains functions for GO/KEGG functional analysis with dot plot comparative analysis.

  • glasso.R
  • It contains functionality for learning network structure from data with different regularizers.

  • tcga-download.R
  • Contains functions for downloading expression data from the TCGA database and for downstream RNA-seq analysis (normalization, differential gene expression calling, and ensembling of results).

  • spia.R
  • Utility script to perform structure-based KEGG functional analysis for comparison purposes.
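To illustrate the two inputs described for degreeplot.py (a network in "graphml" format and a csv file with a "group" column), here is a minimal sketch that computes node degrees and groups them by node type. It is not the code from degreeplot.py: it uses only the Python standard library, the functions node_degrees and degrees_by_group are hypothetical, the "node" identifier column is an assumption, and the toy network is made up for the example. Plotting is left out.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

GRAPHML_NS = "{http://graphml.graphdrawing.org/xmlns}"


def node_degrees(graphml_text):
    """Count how many edge endpoints touch each node in a GraphML document."""
    root = ET.fromstring(graphml_text)
    degrees = Counter()
    for node in root.iter(GRAPHML_NS + "node"):
        degrees[node.get("id")] += 0  # keep isolated nodes with degree 0
    for edge in root.iter(GRAPHML_NS + "edge"):
        degrees[edge.get("source")] += 1
        degrees[edge.get("target")] += 1
    return degrees


def degrees_by_group(degrees, csv_text, id_column="node"):
    """Collect degrees per node type, read from the "group" column.

    The name of the identifier column ("node") is an assumption.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row["group"]].append(degrees.get(row[id_column], 0))
    return dict(grouped)


# Toy inputs, invented for this example only.
TOY_GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="undirected">
    <node id="mir1"/>
    <node id="geneA"/>
    <node id="geneB"/>
    <edge source="mir1" target="geneA"/>
    <edge source="mir1" target="geneB"/>
  </graph>
</graphml>"""

TOY_CSV = "node,group\nmir1,mirna\ngeneA,gene\ngeneB,gene\n"

if __name__ == "__main__":
    print(degrees_by_group(node_degrees(TOY_GRAPHML), TOY_CSV))
```

The per-group degree lists returned here are the kind of data a degree plot would be drawn from; for real files, read their contents as text and pass them to the same two functions.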