[Intermediate] Summarizing and reporting results across multiple files (MW)


Bioinformatics programs often produce highly verbose, unstructured output that is very time-consuming to read manually. Quite often, only a small subset of the results is needed to evaluate the evidence.


  1. In this exercise you will work with a huge results archive that you can download from here (1st half) and here (2nd half). The data were produced by 500 bootstrap replications, and each replicate directory contains a single file of interest: psn.lst. From this file you should extract and report the following: i) Minimization: whether it was successful or not; ii) the OFV value (called OBJV in the file) and the parameter estimates for Thetas and Etas. The aim is to produce a single table in the form of a CSV file where the columns are:

    The pairwise Eta matrix can be extracted and dumped into a separate file, one for each bootstrap replicate.
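
One possible starting point for the main table is sketched below in Python. It only covers the minimization status and the OBJV value, and it rests on assumptions that should be checked against a real psn.lst: that a successful run contains the text "MINIMIZATION SUCCESSFUL", and that the objective function value sits on a line starting with "#OBJV:" padded with asterisks. The function names `summarize_lst` and `collect`, the column names and the use of the directory name as replicate label are hypothetical choices made for illustration. Parsing the Theta and Eta blocks is left out because it depends on the exact layout of the file.

```python
import csv
import re
from pathlib import Path

# Assumed marker strings; verify them against a real psn.lst before relying on them.
MIN_OK = "MINIMIZATION SUCCESSFUL"
OBJV_RE = re.compile(r"#OBJV:\**\s*(-?\d+(?:\.\d+)?(?:[Ee][+-]?\d+)?)")


def summarize_lst(text):
    """Return the minimization status and OBJV found in the text of one psn.lst."""
    match = OBJV_RE.search(text)
    return {
        "minimization_successful": MIN_OK in text,
        # None marks a file in which no OBJV line was found.
        "ofv": float(match.group(1)) if match else None,
    }


def collect(root, out_csv):
    """Find every psn.lst below root and write one CSV row per replicate."""
    fields = ["replicate", "minimization_successful", "ofv"]
    rows = []
    for lst in sorted(Path(root).rglob("psn.lst")):
        row = summarize_lst(lst.read_text(errors="replace"))
        # The replicate is labelled by the name of the directory holding the file.
        row["replicate"] = lst.parent.name
        rows.append(row)
    with open(out_csv, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `collect("bootstrap_results", "summary.csv")`, where both names are placeholders, would write one row per replicate directory. Further columns for the Thetas and Etas can be added by extending the dictionary returned by `summarize_lst` and the `fields` list.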

Bonus points

  1. Linearize the pairwise Eta matrix so that you print additional columns in the main CSV-file, one column per Eta pair (ETA1vsETA2, ETA1vsETA3 and so on).
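
The linearization step can be sketched as follows, assuming the Eta matrix has already been parsed into a lower-triangular list of rows (row i holds the values of ETA(i+1) against ETA1 up to itself). That input layout and the function name `linearize_eta_matrix` are assumptions for illustration, not something taken from psn.lst. The diagonal is skipped so that only pairs of distinct Etas become columns.

```python
def linearize_eta_matrix(rows):
    """Flatten a lower-triangular matrix into a dict of pair columns.

    rows[i] is assumed to hold the values for ETA(i+1) against
    ETA1..ETA(i+1); diagonal entries are not emitted.
    """
    columns = {}
    for i, row in enumerate(rows):
        for j in range(i):  # j < i keeps off-diagonal entries only
            columns[f"ETA{j + 1}vsETA{i + 1}"] = row[j]
    return columns


# Made-up 3x3 example, lower triangle only.
example = [
    [0.10],
    [0.02, 0.20],
    [0.03, 0.04, 0.30],
]
print(linearize_eta_matrix(example))
# {'ETA1vsETA2': 0.02, 'ETA1vsETA3': 0.03, 'ETA2vsETA3': 0.04}
```

The resulting dictionary can be merged into the per-replicate row before it is written, with its keys appended to the CSV header.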