$ cd <path-to-mosdi-jar>
$ ln -s <path-to-jopt-simple>/jopt-simple-<version>.jar jopt-simple.jar
$ java -jar mosdi_r468.jar iupac_abelian_gen -M 8,2,0,2 8 > abelian_patterns
$ java -jar mosdi_r468.jar discovery -t 1e-15 -F example-sequences abelian_patterns |grep '>>' > results

The switch "-t 1e-15" tells the algorithm to look for motifs with a p-value below 1e-15. Grepping for ">>" saves us from the software's rather detailed output. Note that, in order to parallelize the computation, we may split the file abelian_patterns into chunks and process them on different cores or machines.
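The chunking idea mentioned above could be sketched as follows. This is only a rough sketch, not part of MoSDi: it assumes that abelian_patterns holds one pattern per line (so that it can be cut at any line boundary), and the helper names split_patterns and run_chunks, the chunk size, and the file prefixes are made up for illustration.

```shell
# split_patterns FILE LINES_PER_CHUNK PREFIX
# Cut FILE into pieces of at most LINES_PER_CHUNK lines each, named
# PREFIXaa, PREFIXab, ... (assumption: one pattern per line).
split_patterns() {
    split -l "$2" "$1" "$3"
}

# run_chunks PREFIX
# Hypothetical helper: start one discovery run per chunk in the background,
# using the same command line as in the walkthrough, then wait for all runs.
run_chunks() {
    for chunk in "$1"*; do
        java -jar mosdi_r468.jar discovery -t 1e-15 -F example-sequences "$chunk" \
            | grep '>>' > "results.$(basename "$chunk")" &
    done
    wait
}
```

One possible way to use it on a single machine (the chunk size 1000 is an arbitrary example; a size that yields roughly as many chunks as there are cores avoids oversubscription):

    $ split_patterns abelian_patterns 1000 chunk_
    $ run_chunks chunk_
    $ cat results.chunk_* > results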
Each line of the file results has the following form:

>>p_value>> 3.617957e-37 LIN >>stats>> TAARASGA 1 9 9 17 5.170981e-02 >>poisson>> 5.170981e-02 1.000000e+00 >>runtimes>> 0.000000e+00 0.000000e+00 0.000000e+00
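To make the layout of such a line concrete, here is a small sketch that pulls two fields out of the sample line with awk. It assumes that the value following ">>p_value>>" (field 2) is the p-value and that the motif is field 5. The meaning of the remaining fields is not covered here.

```shell
# Sample line copied verbatim from the results file shown above.
line='>>p_value>> 3.617957e-37 LIN >>stats>> TAARASGA 1 9 9 17 5.170981e-02 >>poisson>> 5.170981e-02 1.000000e+00 >>runtimes>> 0.000000e+00 0.000000e+00 0.000000e+00'

# awk splits on runs of whitespace: field 2 follows the >>p_value>> tag
# (assumed to be the p-value), field 5 follows the >>stats>> tag (the motif).
pvalue=$(printf '%s\n' "$line" | awk '{print $2}')
motif=$(printf '%s\n' "$line" | awk '{print $5}')

# Print motif and p-value separated by a tab.
printf '%s\t%s\n' "$motif" "$pvalue"
```

Unlike cut with a single-space delimiter, awk treats any run of whitespace as one separator, which may be more forgiving if the spacing of the output ever varies.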
Next, we extract the motifs, which are the fifth space-separated field of each line:

$ grep '>>p_value>' results | cut -d ' ' -f 5 > motifs

Then, we re-evaluate these motifs w.r.t. a third-order Markov model (M3):
$ java -jar mosdi_r468.jar calc_scores -M3 -F example-sequences motifs |grep '>>' > results_M3
[1] Tobias Marschall and Sven Rahmann. Probabilistic arithmetic automata and their application to pattern matching statistics. In Paolo Ferragina and Gad Landau, editors, Combinatorial Pattern Matching (CPM'08), volume 5029 of LNCS, pages 95-106. Springer, 2008.
[2] Tobias Marschall and Sven Rahmann. Efficient Exact Motif Discovery. Submitted.