MCB 3421 Computer Lab 4: Databank Search Exercise A

Your name:
Your email address:

We will focus on searches of literature databanks today. If you write scientific or other academic articles, you frequently need to cite the literature. It is a good idea to get used to using a bibliography program as soon as possible. Such a program makes it easier to incorporate citations into an article, to reformat the bibliography, and to download citations from the internet. Popular choices are EndNote, RefWorks, Zotero, and Mendeley.
EndNote is popular but expensive, and old versions frequently stop working when the operating system is updated. Also, citations incorporated into a text document cannot be used by other citation programs. RefWorks is also commercial software, but UConn has a subscription.
Mendeley and Zotero are compatible (i.e., if you write an article together and your co-author uses Zotero, you can read and modify the text and update the bibliography in Mendeley). Mendeley is popular, it can be used to keep track of PDF versions of articles, and it is updated within a reasonable time when Microsoft Office or the operating system becomes incompatible with an older version.
Your instructor currently uses Mendeley, maintained by Elsevier. The software can be downloaded here. After you create a free account, your database of references is stored online, you can share folders with others, and you can use your references from different computers. The software comes with three important features: A) a bookmark for your browser that automatically downloads the citation (and the PDF, if available) from the site you are visiting (e.g., PubMed, a journal page, Scopus, ...); B) a plug-in for Microsoft Word that allows you to insert citations into the text you are writing; and C) Mendeley Desktop, which gives access to your references and to different bibliography style sheets.

 

1. (less than 20 minutes)
Use PubMed in NCBI's Entrez to find an article written by Carl R. Woese (famous scientist, co-discoverer of the Archaea), published in the journal Proceedings of the National Academy of Sciences of the United States of America, with the words "primary kingdoms" in the title of the paper. Try to use Boolean operators (AND, OR, NOT) and field tags; if you cannot recall the tags, use the Preview/Index tool under advanced search (link below the search text window).
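
As an illustration of how field tags and Boolean operators combine, here is a minimal Python sketch that assembles a query string. The author, journal, and title words are placeholders chosen for illustration and are deliberately unrelated to the exercise; the helper names fielded and all_of are hypothetical. The tags [Author], [Journal], and [Title] are standard PubMed field tags.

```python
def fielded(term: str, tag: str) -> str:
    """Attach a PubMed field tag to a term, e.g. 'Doe J' + 'Author' -> 'Doe J[Author]'."""
    return f"{term}[{tag}]"


def all_of(*clauses: str) -> str:
    """Join clauses with AND (Boolean operators are conventionally written in upper case)."""
    return " AND ".join(clauses)


# Placeholder values for illustration only; substitute your own search terms.
query = all_of(
    fielded("Doe J", "Author"),
    fielded("Nature", "Journal"),
    fielded("membrane transport", "Title"),
)
print(query)
# Doe J[Author] AND Nature[Journal] AND membrane transport[Title]
```

The resulting string can be pasted directly into the PubMed search box.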

What query found the 1977 article?


How many similar articles (link in the right-hand bar) are linked to this article?
(In the window that gives you the title, authors, and abstract, click on the link labeled "similar articles ... see all".)
When was the most recent one published? (Hint: In the Display settings pull-down menu, set the "Sort by" option to Pub date.)
How many PubMed Central articles that cite the Woese 1977 article are listed on the page for that article?

2. (ca. 7 minutes) (Note: If Entrez's pull-down menus do not work well, use Firefox.)
In NCBI's Entrez/PubMed, find the earliest paper co-authored by Senejani and Gogarten. What is the topic of the paper?

To learn about inteins, select Books as the target database to search (pull-down menu to the left of the search bar) and search for intein homing; the image in the right column is somewhat informative. For more information, check Wikipedia on inteins.

3. (ca. 3 minutes)
Dr. Johann Peter Gogarten seems obsessed by an important protein called ATP synthase. Is he interested in anything else? How many articles has he published that are NOT related to the ATP synthase OR ATPase? (Note: Many authors share the same family name.)
What query did you assemble?
How many articles did you find?
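
A query with NOT usually needs parentheses so that the exclusion applies to every unwanted term. The sketch below shows the pattern and, as an optional extra, builds a URL for NCBI's E-utilities esearch endpoint, which reports the number of matching records. The search terms are placeholders unrelated to the exercise, and the helper names excluding and esearch_url are hypothetical; the code only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities esearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def excluding(base: str, *unwanted: str) -> str:
    """Return 'base NOT (a OR b ...)' so that NOT covers every unwanted term."""
    return f"{base} NOT ({' OR '.join(unwanted)})"


def esearch_url(term: str, db: str = "pubmed") -> str:
    """Build an esearch URL; rettype=count asks only for the number of hits."""
    params = {"db": db, "term": term, "rettype": "count"}
    return f"{ESEARCH}?{urlencode(params)}"


# Placeholder terms for illustration only.
term = excluding("Doe J[Author]", "kinase", "phosphatase")
print(term)
# Doe J[Author] NOT (kinase OR phosphatase)
print(esearch_url(term))
```

Opening the printed URL in a browser returns a short XML document containing the hit count for the query.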

4. (13 minutes)
Comparing search engines and databases (you will want to open PubMed, Google Scholar, and Scopus (UConn has a subscription to this service) in different tabs in your browser):
For a scientist of your choice (e.g., your advisor, or someone who publishes in your field of interest), use PubMed, Google Scholar, and Scopus to search for articles by this author.
Which scientist did you choose?

How many articles did you find in PubMed, Google Scholar, and Scopus? (Comma separated; if your author of choice does not have a Google Scholar profile, enter --.)

How often was your author cited according to the author's Google Scholar profile?

What is the H-index for the author of your choice according to the author's Google Scholar profile? (At the top of the right column of the Google Scholar profile.)

What is the H-index?

In Scopus, search for your author, click on the name of your author, and wait ...
How many articles did Scopus find?
How often were these articles cited according to Scopus?
What is the H-index for the author of your choice according to Scopus?
Click on view citation overview. Click on the number of citations on the right hand side of the header of the table.

Did you find any interesting recent article? (If yes, give the citation.)
Was this article available online?
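
The H-index reported by Google Scholar and Scopus in this question is the largest number h such that the author has h articles with at least h citations each. Because the two services index different sets of articles and citations, they often report different values for the same author. A minimal Python sketch of the calculation follows; the citation counts are made-up numbers for illustration, not data for any real author.

```python
def h_index(citations):
    """Return the largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited article first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` articles all have >= rank citations
        else:
            break
    return h


# Made-up citation counts for five articles.
example_counts = [10, 8, 5, 4, 3]
print(h_index(example_counts))
# 4
```

Here four articles have at least 4 citations each, but there are not five articles with at least 5 citations, so the H-index is 4.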

 

5. (5 minutes)
Using PubMed, search for articles co-authored by Taiz and Gogarten.

a) How many articles did you retrieve?

b) Using the "find related data" pull-down menu in the right-hand bar, display all Nucleotide Links and all Protein Links.
How many did you find?

c) Do all the different protein sequences really refer to different sequences?

What might explain your finding?

Go to the nucleotide entry for gi|167559.
Click on Run BLAST in the column on the right. In the search form, select the nr (nonredundant) database.
Under organism, type (then select) "flowering plants (taxid:3398)". Under algorithm options, increase the number of matches to 20000, and decrease the expect threshold to 0.001.
(If this takes too long, click here)
How many matches are reported?

Go to the protein encoded by gi|167559 (167560). Repeat the above BLAST search, this time with a protein sequence as the query. Under organism, type (then select) "flowering plants (taxid:3398)". Under algorithm options, increase the number of matches to 20000, and decrease the expect threshold to 0.001.
(If this takes too long, click here)
How many matches are reported (the number of subject sequences above the graphic)?

How do you explain the difference?


6. (10 minutes)
Using Entrez, search Protein (use the drop-down box to select the Protein database) for 19888400 (this is a gi number; see historical note).

Run a BLAST search (same parameters as above), except select the taxon Archaea. Or click here for the results (stored for 2 days).

Do you notice anything interesting about the alignments?

Click on the link to the Taxonomy report in the top frame of the BLAST results. Scroll through the report, focusing on the Taxonomy Report (at the end of the page). In which phylum of Archaea are most homologs to 19888400 reported?

7. (8 minutes)
To what domain (superkingdom), phylum (kingdom), and family does Thermoplasma belong? (Use the Taxonomy Search. In the line labeled lineage, if you hover the mouse pointer over the names, it tells you which taxonomic category you are pointing at.)

How many protein and genome sequences are available for Thermoplasma acidophilum, and how many are available for the genus Thermoplasma? (In the taxonomy browser, go to Thermoplasma, check protein and genome in the header, then click on <Display>.)


Finished?

Check the appropriate radio button below before pressing the submit button:

Send email to your instructor (and yourself) upon submit
Send email to yourself only upon submit (as a backup)
Show summary upon submit but do not send email to anyone