Discovering combinatorial regulation is key to understanding complex gene regulatory machinery. Combining these scripts (chip2lamp) with the statistical method LAMP makes it possible to find statistically significant combinations of transcription factors by integrating ChIP-seq and RNA-seq data. chip2lamp accepts MACS1/MACS2 peak-calling results for ChIP-seq and Cuffdiff results for RNA-seq.

Install

  • Requirements
    • Python (>= 2.7)
    • MACS1 or MACS2 results (bed files) from ChIP-seq experiments
    • A cuffdiff result (gene_exp.diff) from RNA-seq experiments
    • Genome annotation file in GTF or GFF3 format
  • Install
    • Download and uncompress the archive
    • % wget http://seselab.org/chip2lamp/chip2lamp-1.0.zip
      % unzip chip2lamp-1.0.zip
    • Install the statistical method LAMP
    • % cd chip2lamp
      % wget https://github.com/a-terada/lamp/archive/2.0.3.zip
      % unzip 2.0.3.zip
      % cd lamp-2.0.3
      % make
      % cd ..

Usage

  • "eg" directory contains example files:
    • Three bed files (SP1.bed, USF1.bed, MXI1.bed) generated by MACS2 from ENCODE data
    • Cuffdiff result (gene_exp.diff) generated by the TopHat-Cufflinks-Cuffdiff procedure from this paper, which compared expression profiles before and after differentiation of human ES cells into mesoderm.
    • Human genome annotation (genes.gtf) from iGenomes
  • The following procedure finds the statistically significant combinations:
    • Suppose that you are in the chip2lamp directory.
    • Run chip2lamp. This generates three files: test_dist.txt, test_exp.txt and test_peak.txt.
    • % python chip2lamp.py --peak eg/SP1.bed eg/USF1.bed eg/MXI1.bed \
      --label SP1,USF1,MXI1 --gene eg/genes.gtf  --diff eg/gene_exp.diff  \
      --macs2 --out test
    • Run LAMP and store the result in test_lamp_res.txt.
    • % python ./lamp-2.0.3/lamp.py test_peak.txt test_exp.txt 0.05 \
      -p "fisher" > test_lamp_res.txt
  • The output file "test_lamp_res.txt" includes the following result. Ranks from 2nd to 5th suggest the combinatorial regulations.
  • Rank  Raw p-value  Adjusted p-value  Combination    Arity  # of target rows  # of positives in the targets
    1     2.2878e-170  1.6015e-169       SP1            1      7361              3173
    2     2.1993e-70   1.5395e-69        SP1,USF1       2      3446              1502
    3     7.2873e-43   5.1011e-42        USF1,MXI1      2      2552              1084
    4     2.8048e-40   1.9634e-39        SP1,MXI1       2      2754              1145
    5     4.5366e-25   3.1756e-24        SP1,USF1,MXI1  3      1606              676
  • You may consider "SP1 and USF1" (ranked 2nd) a minor result: SP1 alone is ranked 1st with an extremely small p-value, so the combination with USF1 adds little. The 4th and 5th results are likewise side effects of the significance of SP1. The following command removes such results.
    % python ./lamp-2.0.3/eliminate_comb.py test_lamp_res.txt > test_lamp_res.sig.txt
    test_lamp_res.sig.txt then contains the following two results, with the minor combinations removed.
    Rank  Raw p-value  Adjusted p-value  Combination  Arity  # of target rows  # of positives in the targets
    1     2.2878e-170  1.6015e-169       SP1          1      7361              3173
    3     7.2873e-43   5.1011e-42        USF1,MXI1    2      2552              1084
    If needed, report_lamp.py produces a report of the associations between the TFs and genes.
    % python report_lamp.py --lamp test_lamp_res.txt --dist test_peak.txt \
    --exp test_peak.txt --out test_report.txt
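
The elimination step above can be illustrated with a small Python sketch. The rule used here is an assumption inferred from the example output, not taken from eliminate_comb.py itself: a combination is dropped when it is a subset or superset of a better-ranked combination that has already been kept. The function name eliminate_redundant and the (p-value, combination) input layout are hypothetical.

```python
# Sketch only: the filtering rule is an assumption inferred from the
# example output, not the actual logic of eliminate_comb.py.

def eliminate_redundant(results):
    """Drop combinations that overlap with a better-ranked kept one.

    `results` is a list of (raw_p_value, combination) pairs, where
    `combination` is a comma-separated string such as "SP1,USF1".
    A combination is dropped if it is a subset or superset of a
    combination with a smaller p-value that was already kept.
    """
    kept = []  # (p_value, combination string, set of factors)
    for p_value, combination in sorted(results, key=lambda r: r[0]):
        factors = frozenset(combination.split(","))
        redundant = any(
            factors <= other or factors >= other for _, _, other in kept
        )
        if not redundant:
            kept.append((p_value, combination, factors))
    return [(p_value, combination) for p_value, combination, _ in kept]


# Raw p-values and combinations from the example LAMP result.
lamp_rows = [
    (2.2878e-170, "SP1"),
    (2.1993e-70, "SP1,USF1"),
    (7.2873e-43, "USF1,MXI1"),
    (2.8048e-40, "SP1,MXI1"),
    (4.5366e-25, "SP1,USF1,MXI1"),
]

significant = eliminate_redundant(lamp_rows)
for p_value, combination in significant:
    print(combination, p_value)
```

With the five example rows, this rule keeps SP1 and USF1,MXI1, which matches the two rows shown for test_lamp_res.sig.txt.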

Detailed Usage

chip2lamp.py has the following options:
Option                        Description
--gene [gene.gtf]             Gene position file in GTF/GFF3 format (required)
--diff [gene_exp.diff]        Differentially expressed genes generated by cuffdiff (required)
--out [prefix]                Prefix of output files. Three files (_dist.txt, _exp.txt, _peak.txt) are generated. (required)
--peak [bed1 bed2 ...]        Peak files from MACS1/2 in BED format (required)
--label [label1,label2,...]   Names of peaks. If this option is omitted, file names are used as peak names.
--qval [val]                  q-value threshold for differentially expressed genes (DEGs) in the cuffdiff result (default: 0.05)
--exp [val]                   Minimum expression level of genes to be considered (default: 0.0)
--up [val]                    Maximum distance from the TSS, on the upstream side, for a peak to be associated with a gene (default: 2000)
--down [val]                  Maximum distance from the TSS, on the downstream side, for a peak to be associated with a gene (default: 300)
--macs2                       Use this option when the peak files are MACS2 results (default: False)
--nm-ignore                   Ignore genes associated with no peaks (default: False)
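
As an illustration of how the --up and --down windows could work, here is a minimal Python sketch. It is not the chip2lamp.py implementation. It assumes that each peak is represented by a single position (for example its summit or midpoint), that upstream and downstream are interpreted relative to the gene strand, and that the window bounds are inclusive. The function name peak_hits_gene is hypothetical.

```python
# Sketch only: strand handling, single-position peaks and inclusive
# bounds are assumptions, not the documented behaviour of chip2lamp.py.

def peak_hits_gene(peak_pos, tss, strand, up=2000, down=300):
    """Return True if a peak position lies within the TSS window.

    The window spans `up` bases upstream and `down` bases downstream
    of the TSS, measured along the direction of transcription.
    """
    if strand == "+":
        offset = peak_pos - tss   # positive = downstream on the + strand
    elif strand == "-":
        offset = tss - peak_pos   # coordinates are mirrored on the - strand
    else:
        raise ValueError("strand must be '+' or '-'")
    return -up <= offset <= down


# A peak 1000 bp upstream of a + strand gene is inside the default window.
print(peak_hits_gene(9000, 10000, "+"))
# A peak 500 bp downstream of the same TSS is outside it.
print(peak_hits_gene(10500, 10000, "+"))
# On the - strand, a larger coordinate is upstream of the TSS.
print(peak_hits_gene(10500, 10000, "-"))
```

The asymmetric defaults (2000 upstream, 300 downstream) keep the window focused on the promoter region rather than the gene body.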

Contributors and Contact

  • ChIP2LAMP was developed by Aika Terada (U. Tokyo and JSPS) and Jun Sese (AIST) through discussions with Dr. Mariko Morita (AIST) and Prof. Koji Tsuda (U. Tokyo).
  • Amelieff corporation supported the development of this software.
  • ChIP-seq analysis data were provided by Dr. Shinya Oki (Kyushu U.).
  • If you have any questions, please contact lamp_staff(AT)googlegroups.com

© 2015 SESE Lab.