Searching the local neighborhood

  1. In this section, we will use the swap command, which performs a local search using either SPR (subtree pruning and regrafting) or TBR (tree bisection and reconnection) as the tree neighborhood. As we learned in the previous sections, however, we must first read in the data and build a set of initial trees:
                read ("course.fasta")            
                build ()
  2. The importance of understanding the commands used to create the initial set of trees for an analysis cannot be overemphasized. A search strategy usually proceeds by building an initial set of trees, each of which is then improved through a local heuristic search. Because each starting tree leads the local search into a different region of tree space, it is easy to see why a search from a single initial tree is likely to be drastically inferior to a search from 200 initial trees.
  3. The build () command above made 10 trees, which is a rather limited number of replicates. We can ask the program to build a larger number of trees by passing an integer as the build argument. So let us now build 20 trees:
                build (20)
  4. Observe that a new build command does not add to the trees already in memory: every build replaces the previous trees.
  5. Now we can store those trees in parenthetical notation for later use, using the report command:
                report ("built_trees.tree", trees)
  6. To verify that we have successfully accomplished this task, we can read the tree file back into POY:
                read ("built_trees.tree")
  7. Notice that reading an input tree file does not delete the trees already in memory. This makes it possible to add new results to a set of current results. This is always the behavior of the read command: it adds data or trees to the characters and trees currently available for analysis.
  8. We should now verify that the input data were correctly loaded, as well as the trees:
                report (data)            
                report (treestats)
  9. In this exercise, we want to compare the effect of different swap options. To do this, we must always start from the very same trees and original data. So, at this point, we introduce the wipe command, which allows us to eliminate all the contents of the program's memory. Let us use that command and verify that there are indeed neither data nor trees currently loaded:
                wipe ()            
                report (data)            
                report (treestats)
    Observe that no data are now held in memory.
  10. Now we read the data and the trees again and run the default swap () command. Then we check the cost of the trees found:
                read ("course.fasta", "built_trees.tree")            
                swap ()            
                report (treestats)
  11. Write down the tree cost for later comparison. Question: what kind of neighborhood did POY use in this search, TBR or SPR?
  12. If we want an SPR or a TBR search, we can select it by passing the corresponding argument to swap. Let's give SPR a try now:
                wipe ()
                read ("course.fasta", "built_trees.tree")            
                swap (spr)            
                report (treestats)
  13. Compare these results with those of the default neighborhood. What do you observe? Does this match your expectations? Which worked better? Which took longer?
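
The comparison in steps 9 to 13 can be collected into a single sequence of commands. The following is only a sketch, built from the commands introduced above. Because read adds to whatever is already in memory, each run begins with wipe () so that all three searches start from the same data and the same stored trees. The file names are the ones used in this exercise. The (* ... *) comments are an assumption about the comment syntax your POY version accepts; remove them if they cause an error.

```
(* Run 1: default neighborhood *)
wipe ()
read ("course.fasta", "built_trees.tree")
swap ()
report (treestats)

(* Run 2: SPR neighborhood, same starting trees *)
wipe ()
read ("course.fasta", "built_trees.tree")
swap (spr)
report (treestats)

(* Run 3: TBR neighborhood, same starting trees *)
wipe ()
read ("course.fasta", "built_trees.tree")
swap (tbr)
report (treestats)
```

Comparing the three treestats reports side by side should also help answer the question in step 11: the explicit run whose costs match those of the default run suggests which neighborhood the default used.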