BIOL 6312: Proteins
Spring 2018

Lesson 1: January 30

First Computer Lesson: the Protein DataBank (PDB) and introduction to Jmol (the applet)

1. Click the link to the PDB above. First we will take a tour of that site.

RCSB Protein Data Bank: Sustaining a living digital data resource that enables breakthroughs in scientific research and biomedical education.
Burley SK, Berman HM, Christie C, Duarte JM, Feng Z, Westbrook J, Young J, Zardecki C.
Protein Sci. 2018 Jan;27(1):316-330.

2. The Protein Data Bank stores the xyz coordinates of atoms for proteins of known 3-dimensional structure.

3. First take a look at the upper left icon, "PDB 101". You can explore this on your own, to learn more about the site.

4. Another thing to explore is the "Molecule of the Month". Click on the molecule to enter this tutorial. They have archived these monthly summaries.

5. You can open an account to save links to files of interest.

6. Tabs at the top are interesting to explore: e.g., "Visualize" and "Analyze"

7. To find a protein, use the search box at the top of the page. You can search by keyword, author name, or PDB ID. For example, type "1xme". That is a Protein Data Bank file name, which is always 4 characters long and is not case-sensitive.
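If you ever handle PDB IDs in a script, the two rules in step 7 (4 characters, not case-sensitive) can be checked as in the sketch below. The helper name normalize_pdb_id is made up for this example. The requirement that the first character be a digit from 1 to 9 is an assumption based on the usual convention for 4-character IDs; it is not stated in this lesson.

```python
import re

# Assumption: a 4-character PDB ID is a digit (1-9) followed by three
# letters or digits. normalize_pdb_id is a hypothetical helper name.
_PDB_ID = re.compile(r"[1-9][A-Za-z0-9]{3}")


def normalize_pdb_id(text):
    """Return the ID in upper case, or raise ValueError if it is not a valid 4-character ID."""
    candidate = text.strip()
    if not _PDB_ID.fullmatch(candidate):
        raise ValueError(f"not a 4-character PDB ID: {text!r}")
    return candidate.upper()


print(normalize_pdb_id("1xme"))  # 1XME
print(normalize_pdb_id("1XME"))  # 1XME
```

Because both spellings normalize to the same string, "1xme" and "1XME" can be compared or used as file names interchangeably.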

8. There are several tabs at the top. Initially we are at "Structure Summary".

9. At the top is information about this structure file. Scroll down a bit. Next is the link to the "Literature", where the structure was first published. Under "Macromolecules" it tells you that there are 3 polypeptide chains in this molecule, called A, B, and C. There are also various links and controls.

10. Below that, "Small Molecules" lists the ligands or prosthetic groups. There are 2 coppers (CU and CUA) and 2 hemes (called HEM and HAS). In addition there is one detergent molecule (BNG) and one glycerol (GOL), from the crystallization conditions. Importantly for us, it gives their IDs. We need to know what to call them when we are using Jmol.

11. Back to the top: the second tab is "3D view". Click that to go to the visualization. The default is NGL. Near the lower right corner of the image you can switch to JSMol. The molecule is first shown in cartoon view, colored by structure: alpha-helices magenta and beta-strands gold. Other helices, 3-10, are purple, and loops are white. Now click the button that says "Subunit". Each of the 3 subunits will get a different color.

12. Learn to control the molecule with your mouse: if you hold the shift key down and hold the mouse button down (left click), sliding the mouse up and down will zoom the image. Sliding it side to side will rotate the image around the axis pointing at you. If you hold the shift key down while clicking the mouse and holding it down, the image can be moved.

13. For further interaction with the molecule move the cursor to the upper part of the Jmol image and right-click (control-click). You can explore the menu that pops up. If you select "Console" the console for typing commands will open.

14. Look down the page to the "Scripting Options", find HAS under Ligands, and click "View 1XME - HAS Pocket Interaction". You are seeing a surface coloring according to potential, and a slab view when it is rotated.

15. The "Annotations" tab at the top provides links to information in other databases.

16. The "Sequence" tab shows the amino acid sequences of the subunits and their secondary structure.

17. The "Sequence Similarity" tab shows links to identical or similar proteins in the Protein Data Bank.

18. The "Structure Similarity" tab will show if there are any other proteins of similar structure, without significant sequence similarity.

19. The "Experiment" tab provides information about how the sample was crystallized and how the structure was determined.

20. Go back to the "Structure Summary" tab. At the right is the "Download Files" popup menu.

21. The first choice is FASTA, which will give you the amino acid sequence of the protein.
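A FASTA file is plain text: each sequence starts with a header line beginning with ">", followed by one or more lines of one-letter amino acid codes. The sketch below shows one way to read such text in Python. The function name parse_fasta and the sample records are invented for illustration; the sequences are not those of 1XME.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up two-record example; these are NOT the real 1XME sequences.
SAMPLE_FASTA = """>example_1|Chains A
MAVRAS
EISRVY
>example_2|Chains B
MVDEHK
"""

for header, sequence in parse_fasta(SAMPLE_FASTA):
    print(header, len(sequence))
```

A sequence that is wrapped over several lines is joined back into a single string, so its length is the number of residues.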

22. The second choice is the PDB file in text format. The third is a compressed (gzipped) version. It will download a file called "1XME.pdb.gz".

23. Below is the "Biological Assembly". Click this one and a PDB file will download to your computer. Try to find it (in your Downloads folder?). It will be named 1XME.pdb1.gz.

24. The 1XME.pdb1.gz file can be expanded by a double-click, to become 1XME.pdb1.

25. In that case it can be opened with a text editor (TextEdit on a Mac, or Notepad on Windows).
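If double-clicking does not expand the file on your computer, a .gz file can also be expanded with a few lines of Python, using only the standard library. This is a minimal sketch: the function name gunzip is made up here, and the demonstration creates its own small stand-in file in a temporary folder rather than assuming where your download landed.

```python
import gzip
import os
import shutil
import tempfile


def gunzip(gz_path):
    """Expand gz_path (for example 'x.pdb1.gz') next to itself and return the new path."""
    if not gz_path.endswith(".gz"):
        raise ValueError("expected a file name ending in .gz")
    out_path = gz_path[:-3]  # drop the trailing ".gz"
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


# Demonstration on a stand-in file, so nothing has to be downloaded first.
workdir = tempfile.mkdtemp()
gz_file = os.path.join(workdir, "demo.pdb1.gz")
with gzip.open(gz_file, "wt") as handle:
    handle.write("REMARK placeholder for a downloaded file\n")

expanded = gunzip(gz_file)
with open(expanded) as handle:
    first_line = handle.readline().rstrip()

print(os.path.basename(expanded))  # demo.pdb1
print(first_line)
```

To use it on a real download, call gunzip with the full path to your 1XME.pdb1.gz file.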

26. Generally we do not need to look at the PDB files. We will use them with Jmol.

27. To quickly see what's in a PDB file, click the "Display Files" button and choose "PDB File".

28. A PDB file is divided into 2 sections: the "Header" and the atomic coordinates.

The Header contains various information, including the sequence of the protein, the amino acids missing from the crystal structure, the secondary structure assignments, and the so-called "heteroatoms", i.e., atoms that are not part of amino acids.

29. You can find the xyz coordinates for the 6180 atoms present in this structure.

30. This information is presented in a twelve-column format:

31. ATOM 1 N SER A 6 67.564 56.958 35.406 1.00 87.45 N

32. This is atom number 1, a nitrogen, in serine 6 of polypeptide chain A.

33. Its x coordinate is 67.564, y is 56.958, and z is 35.406.

34. It has an occupancy of 1.0, meaning that it is fully present.

35. It has a "B-factor" of 87.45, which reflects its degree of motion in the crystal.

36. Finally, the N for nitrogen is repeated as the element symbol.
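The twelve fields described in steps 31 to 36 can be pulled apart in Python as sketched below. The field names are chosen here for illustration and are not official PDB terms. The sketch splits the line on spaces, which works for the line as shown in step 31. Real PDB files use fixed-width columns, so neighboring fields can run together with no space between them, and a more careful reader would slice by column position instead.

```python
from collections import namedtuple

# Field names are chosen for this example; they are not official PDB terms.
AtomRecord = namedtuple(
    "AtomRecord",
    "record serial atom_name residue chain residue_number "
    "x y z occupancy b_factor element",
)


def parse_atom_line(line):
    """Split a 12-field coordinate line on whitespace and convert the numbers."""
    fields = line.split()
    if len(fields) != 12:
        raise ValueError(f"expected 12 fields, found {len(fields)}")
    (record, serial, atom_name, residue, chain, residue_number,
     x, y, z, occupancy, b_factor, element) = fields
    return AtomRecord(
        record, int(serial), atom_name, residue, chain, int(residue_number),
        float(x), float(y), float(z), float(occupancy), float(b_factor), element,
    )


atom = parse_atom_line("ATOM 1 N SER A 6 67.564 56.958 35.406 1.00 87.45 N")
print(atom.residue, atom.residue_number, atom.chain)
print(atom.x, atom.y, atom.z)
print(atom.occupancy, atom.b_factor)
```

Converting the numeric fields to int and float means the coordinates can be used directly in calculations, such as the distance between two atoms.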

37. Atoms not in amino acids, such as copper, are listed in a similar way at the end.
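In PDB files the coordinate lines for amino acid atoms begin with ATOM, while those for heteroatoms such as copper begin with HETATM. A short script can use that first word to count each kind, as sketched below. The function name and the sample text are made up for this example. Only the first ATOM line is the one quoted in step 31; the other lines are invented stand-ins, not data from 1XME.

```python
from collections import Counter


def count_coordinate_records(pdb_text):
    """Count the lines that begin with ATOM and the lines that begin with HETATM."""
    counts = Counter()
    for line in pdb_text.splitlines():
        if line.startswith("HETATM"):
            counts["HETATM"] += 1
        elif line.startswith("ATOM"):
            counts["ATOM"] += 1
    return counts


# Tiny stand-in for a file. The first ATOM line is the one quoted in step 31;
# every other line is invented for illustration and is NOT taken from 1XME.
SAMPLE = """HEADER    PLACEHOLDER
ATOM 1 N SER A 6 67.564 56.958 35.406 1.00 87.45 N
ATOM 2 CA SER A 6 1.000 2.000 3.000 1.00 50.00 C
HETATM 3 CU CU A 601 4.000 5.000 6.000 1.00 40.00 CU
END
"""

counts = count_coordinate_records(SAMPLE)
print(counts["ATOM"], counts["HETATM"])  # 2 1
```

Header lines and the final END line are ignored because they start with neither word.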

38. Next time we will look at Jmol.


Comments/questions: email me

Copyright 2018, Steven B. Vik, Southern Methodist University

Last modified February 13, 2018