muscle/g" 23S.muscle > 23S.muscle.renamed
This creates a new file with a ".renamed" suffix. In this modified file, which contains the Muscle alignment, every FASTA sequence name has "muscle" prepended to it. Sed is the UNIX stream editor; type "man sed" for details. Here it matches a ">" at the beginning of a line ("^" matches the beginning of a line) and substitutes ">muscle" for it. The "g" flag tells sed to replace every match on a line rather than only the first one; because the pattern is anchored with "^" it can match only once per line, so the flag is harmless but not required. Sed applies the command to every line of the file in any case.
Move the aligned sequence files to your local computer. Start ClustalX. From the menu, select File... Load Sequences and choose the ClustalW alignment (.aln suffix).
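The renaming step can be tried on a small test file first. The following is a minimal sketch, assuming a toy FASTA file: the file name toy.muscle and the sequence names and residues are made up for illustration and are not part of the exercise data. The "g" flag is left out here because a pattern anchored with "^" can match at most once per line.

```shell
# Toy FASTA file standing in for a real alignment (hypothetical names and sequences).
printf '>seqA\nACGU-ACGU\n>seqB\nACGUUACGU\n' > toy.muscle

# Prefix every FASTA header with "muscle".
# Sequence lines do not start with ">" and are copied unchanged.
sed 's/^>/>muscle/' toy.muscle > toy.muscle.renamed

# List the headers of the new file to confirm the change.
grep '^>' toy.muscle.renamed
```

Checking the headers with grep before loading the file into ClustalX is a quick way to confirm that the two alignments will carry distinguishable names.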
Now we add the muscle alignment (with modified names) to the ClustalW alignment we previously loaded into ClustalX. Go back to the ClustalX screen (the ClustalW alignment should be already on the screen), and select File... Append Sequences. Choose the "23S.muscle.renamed" alignment.
You should now see both alignments on the screen. Scroll across the screen, through the alignment, and look for differences between them.
Are there any differences between the alignments these programs generate?
If there are differences, then which program appears to be doing a better job of reflecting homologous columns?
Did you succeed in creating an alignment in which the borders of the self-splicing introns are recognizable?
Can you find settings that improve the alignment around the self-splicing intron? (for clustalw2, you might want to use the menu driven version)
Perform alignments for the other 3 datasets (keep the results!).
If you have never used clustalw/clustalx before, save the aligned sequences in the different available formats and look at them in MS Word (select a non-proportional font) to get familiar with the different multiple sequence file formats.
In the aln files, you can delete whole blocks using Microsoft Word (e.g., those corresponding to the inteins/introns) by holding down the alt/option key while selecting with the mouse. After you have saved the aln file from Word, you can re-open the file in clustalx and save it in any other format you like.
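Deleting a block of columns can also be done on the command line. The following is a minimal sketch under stated assumptions, not a replacement for the Word procedure: it assumes an aligned FASTA file in which each sequence sits on a single line (it does not handle the interleaved .aln format), and the file names, sequences, and column numbers are hypothetical.

```shell
# Toy aligned FASTA, one line per sequence (hypothetical data).
# Columns 5-8 play the role of the intron/intein block.
printf '>seqA\nACGUGGGGACGU\n>seqB\nACGU----ACGU\n' > toy.fasta

start=5   # first column to delete (1-based, assumed)
end=8     # last column to delete (1-based, assumed)

# Print header lines unchanged; for sequence lines, join the text
# before the block with the text after it.
awk -v s="$start" -v e="$end" '
  /^>/ { print; next }
  { print substr($0, 1, s - 1) substr($0, e + 1) }
' toy.fasta > toy.trimmed.fasta

cat toy.trimmed.fasta
```

Because every sequence in an alignment has the same length, removing the same column range from each line keeps the remaining columns aligned.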
To illustrate the advantages of the seaview GUI, load the 23S dataset or the archaeal ATPase dataset into seaview (on your computer). (The 23S dataset contains an intron in at least 4 of the sequences from the Thermotogales; the ATPase dataset contains sequences of archaeal ATPase catalytic subunits that were invaded by an intein, plus a few ATPase subunits without an intein for reference.)
Using the Align menu -> options, select muscle and then align the sequences. Using the Sites menu, create a set called "all sites", then duplicate the set twice and call the duplicates "intron/intein" and "exon/extein". Select the intron/intein set and remove the x's under the exon/extein parts of the alignment, and vice versa for the exon/extein set*. (If you save the file in mase format, the site selections will be saved as well. If you want to analyze selected sites in a different program, you can select "save selection" in the File menu.)
Although we have not yet talked about phylogenetic trees, select all species (their names need to be black), and select a distance method in the Tree menu. (HKY distance and 100 bootstrap samples work well for the RNA sequences. For the protein sequences, it is best to use phyml with the default options, but if this takes too long, a distance analysis with Poisson correction is a reasonable alternative.) Then select the intron/intein sites and the intron/intein-containing species, and repeat the tree building. If you place a checkmark in the window that shows the tree, the program displays support values, in this case either bootstrap support values or approximate Likelihood Ratio Test derived probabilities (the latter are less conservative than the former).
Are the two trees compatible? What are the possible explanations for intron/intein gain and loss?
When you are done, you might want to repeat the exercise for datasets that you consider using for your student project.
If you want to do more, consider doing the dotlet exercises from here (but do not submit the form :) ).
* from the help files of seaview:
A good strategy is to unmark the first and the last x to remove and then to shift-click in the middle of the block.