2. Scalar data

1 – Basic arithmetic

A scalar in Perl is a variable that can only store a single piece of information, such as the value 42 or the text string "themeaningoflife". Perl can perform mathematical operations on numeric information stored in scalars, such as adding the values in two scalars together, subtracting one from the other, multiplying them and so on.
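As a minimal sketch of the operations described above, the snippet below stores two numbers in scalars and combines them with the standard arithmetic operators. The variable names and values ($x, $y and so on) are arbitrary illustrations and are not taken from the exercise program.

```perl
use strict;
use warnings;

# Two scalars holding numbers (arbitrary example values).
my $x = 42;
my $y = 8;

# The basic arithmetic operators work directly on scalars.
my $sum        = $x + $y;
my $difference = $x - $y;
my $product    = $x * $y;
my $quotient   = $x / $y;

print "sum:        $sum\n";
print "difference: $difference\n";
print "product:    $product\n";
print "quotient:   $quotient\n";

# A scalar can just as well hold a text string.
my $text = "themeaningoflife";
print "text:       $text\n";
```

Note that Perl does not distinguish between integer and decimal scalars, so the division above gives a decimal result rather than being rounded down.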

Exercise

Start from this simple interactive program and modify it to solve the tasks below. Compared to many other examples among the exercises, this program will actually produce something useful when finished :-)

Arithmetic

  1. Run the program!
  2. Compute the mean of the two allele frequencies and put it in a new scalar called $pt. Print the contents of that scalar.
  3. You can actually print the result of your equation without storing it in a scalar. Print the mean directly without putting it in $pt.
  4. Print the contents of $pt only if it is larger than 0.5. Change the values of $p1 and $p2 to verify that it works.
  5. A common application of allele frequencies is to use them to estimate how different two populations are. One statistic that captures differences among frequencies is Fst. There is a plethora of approaches and equations for calculating Fst. Insert the formula below into your program and print the Fst values you get for different values of $p1 and $p2. When do you get an Fst value of 0 (no differentiation) and when do you get an Fst value of 1 (maximum differentiation)?
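One possible way to approach tasks 2 to 4 is sketched here. The frequencies assigned to $p1 and $p2 are made-up values chosen for illustration, and the program you start from may set them differently (for example by reading them from the user). The Fst calculation in task 5 is deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical allele frequencies; the exercise program may obtain
# these in another way.
my $p1 = 0.75;
my $p2 = 0.5;

# Task 2: store the mean in a new scalar and print it.
my $pt = ($p1 + $p2) / 2;
print "Mean allele frequency: $pt\n";

# Task 3: print the result of the expression directly,
# without storing it in a scalar first.
print "Mean allele frequency: ", ($p1 + $p2) / 2, "\n";

# Task 4: print $pt only if it is larger than 0.5.
if ($pt > 0.5) {
    print "The mean $pt is larger than 0.5\n";
}
```

The parentheses around $p1 + $p2 matter: division binds tighter than addition, so without them only $p2 would be divided by 2.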

2 – Length of a string

A frequently recurring need in processing biological data is to compute and act on the length of a data string, such as the number of nucleotides in a DNA sequence or amino acids in a protein sequence. Common use cases include calculating the mean or distribution of fragment lengths from a collection of sequences or using the length of the sequence to set some upper limit for other calculations.

Perl has a built-in function called length that returns the number of characters in a string of text, i.e. the length of the string. It is not introduced until Chapter 6 in LP, but it is such a useful function in bioinformatics that we will highlight it early.
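A minimal illustration of length applied to a string stored in a scalar follows. The DNA sequence and the name $dna are invented for this example.

```perl
use strict;
use warnings;

# A made-up DNA sequence stored in a scalar.
my $dna = "ATGGCCATTGTAATGGGCCGC";

# length returns the number of characters in the string.
my $n = length($dna);
print "The sequence is $n nucleotides long\n";
```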

Exercise

Start from this interactive program that reads from standard input until you hit Return. The program stores the input in a scalar simply called $input. Modify it (if needed) to solve the tasks below.

Length

  1. Does the program print what you expected?
  2. Modify it to Do The Right Thing™ 😉
  3. The program is not very user friendly at this point. Add a dialog (or message) that tells the user what she or he needs to do when the program starts.
  4. Add an additional opportunity to input a second string and store that input in a second scalar. Remember to inform the user. Print both scalars.
  5. Concatenate the two strings and print the length of the resulting string.
  6. Add a control structure that prints the length of the two concatenated strings only if it is greater than or equal to a number of your own choice.
  7. Make the program print a sad message if the concatenated string is shorter than your limit.
  8. Modify the program to prompt the user for input an arbitrary number of times (e.g. 10). Concatenate the strings and evaluate the length of the final string as before. Your solution should be able to switch between different numbers of input prompts by just changing the value in a single scalar. Hint: use a while loop and keep track of the number of times you have prompted!
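The later tasks can be sketched roughly as follows. This is one possible approach, not a reference solution. So that the example runs without anyone typing, it reads from an in-memory filehandle ($in) holding made-up input. In an interactive program you would read from STDIN instead. The names $max_prompts, $min_length and $combined and their values are assumptions made for this sketch.

```perl
use strict;
use warnings;

# Simulated user input, one line per prompt (made-up data).
# In an interactive program, read from <STDIN> instead of <$in>.
my $typed = "ATG\nGCC\nTAA\n";
open(my $in, '<', \$typed) or die "Cannot open in-memory input: $!";

my $max_prompts = 3;   # change this single value to alter the number of prompts
my $min_length  = 8;   # hypothetical limit for the combined length
my $count       = 0;   # number of prompts answered so far
my $combined    = "";  # all input strings joined together

while ($count < $max_prompts) {
    print "Type a sequence and press Return: ";
    my $input = <$in>;
    last unless defined $input;  # stop early if the input runs out
    chomp($input);               # drop the trailing newline so it is not counted
    $combined .= $input;         # append to what has been collected so far
    $count++;
}
print "\n";

my $total = length($combined);
if ($total >= $min_length) {
    print "Length of the combined string: $total\n";
} else {
    print "Sorry, only $total characters, fewer than $min_length.\n";
}

close($in);
```

The chomp call is the important detail: a line read from input still ends with the newline produced by the Return key, and without chomp that character would be included in the length.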