Here, we present links to information about homologs of pre-mRNA processing factors in the Drosophila, human, yeast (Saccharomyces) and Arabidopsis genomes. In many cases, functional data are not available, and our assignments are based on the best match among genomes. Components of the RNA processing machinery are highly conserved. Very few factors identified in yeast are absent from the Drosophila or Arabidopsis genomes. In general, proteins involved in the mechanics of RNA processing are even more conserved than proteins involved in the interpretation of RNA processing signals.

An early form of this table appears in an article by Steve Mount and Helen Salz that was published in the Journal of Cell Biology in a special issue on the Drosophila genome. The reference is:

        Stephen M. Mount and Helen K. Salz. Pre-messenger RNA Processing Factors in the Drosophila Genome. J. Cell Biol. 2000 150: F37-F44. Published online Jul 24 2000.
This web page differs from the table in the paper in that it has additional content and will be continuously updated. Most significantly, we have added links to homologs in Arabidopsis.

This page is perpetually under revision. Our original selection and organization of spliceosomal genes was based on that presented by Burge et al. in the second edition of The RNA World (1998. Gesteland, Cech and Atkins eds., CSHL Press, p. 525). We have completed the incorporation of links to Arabidopsis for those proteins in the original analysis. We then reviewed specific genes that have multiple paralogs in some species (some of which are noted with an asterisk below). For each of these, a tree icon in the left-most column presents an unrooted tree generated using ClustalW at the EBI. There are also families of related splicing factors (e.g. the Sm core proteins or the SR proteins) that are conserved across species; for these, we present trees of related genes within a single species (tree icon at the top of a column). We also plan to incorporate additional proteins identified using mass spectrometry by Zhou et al. 2002 (Comprehensive proteomic analysis of the human spliceosome. Nature 419:182-5). Those using this site may want to refer to that work (particularly the supplemental information).

Comments and suggestions are very much appreciated and should be sent to Steve Mount.

We thank Jonathan Roberts, Jason Martineau, Wei Xian and Chau Nguyen for help creating this page.

Links in the following table are to FlyBase and GenBank accessions. In the case of Arabidopsis, links to TIGR and MIPS are for the single gene with the best BLAST score. In some cases, there are multiple genes with significant scores. BLAST output from TAIR (The Arabidopsis Information Resource) is provided to allow readers to make their own evaluations of other matches.
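The selection rule described above (link the single best-scoring Arabidopsis gene, but note that other genes may also score significantly) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used to build the table: the gene names are hypothetical placeholders, the significance threshold of 1e-5 is arbitrary, and the reading of the table's shorthand (a bare `E-38` taken as 1e-38, `E=0` taken as 0) is our interpretation of the notation.

```python
def parse_evalue(text):
    """Convert an E value as written in the table to a float.

    Assumed conventions (not stated on the page): 'E-38' means 1e-38,
    'E=0' and '0.0' mean 0, and '7e-027' is ordinary scientific notation.
    """
    s = text.strip().upper().replace(" ", "")
    if s.startswith("E="):                        # e.g. 'E=0' or 'E=0.05'
        s = s[2:]
    if s.startswith("E-") or s.startswith("E+"):  # bare exponent, e.g. 'E-38'
        s = "1" + s
    return float(s)


def rank_hits(hits, threshold=1e-5):
    """Return (best_gene, other_significant_genes).

    hits is a list of (gene_name, evalue_text) pairs; a lower E value is a
    better match. The threshold is a hypothetical cutoff for "significant".
    """
    scored = sorted((parse_evalue(e), gene) for gene, e in hits)
    best = scored[0][1]
    others = [gene for e, gene in scored[1:] if e <= threshold]
    return best, others
```

For example, `rank_hits([("geneA", "3e-020"), ("geneB", "E-38"), ("geneC", "0.5")])` picks `geneB` as the linked gene and reports `geneA` as a further significant match that a reader may want to inspect in the BLAST output.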



Table: Genes for Proteins Involved in pre-mRNA Processing

 
Protein  Drosophila (Flybase | NCBI)  Human homologs (E value)  Yeast (S.c.) (E value)  Arabidopsis
 
snRNP proteins
 
Core proteins
snRNP core protein SmB SmB | NP_476921 E-38; 6E-38 Smb1p, 2E-10 TIGR | MIPS | TAIR:7e-027
TIGR | MIPS
Sm core protein LSM8 (LSM1-like) LSM8 | NP_647660 LSM8, E-34; LSM1, 8E-05 Lsm8p, E-05; Lsm1p, 2E-06 TIGR | MIPS | TAIR:1e-029
snRNP core protein SmD1 SmD1 | NP_524774 E-36 Smd1p, 5E-11 TIGR | MIPS | TAIR:5e-031
TIGR | MIPS
LSM2 LSM2 | NP_648570 3E-46 Lsm2p, 8E-28 TIGR | MIPS | TAIR:3e-035
snRNP core protein SmD2 SmD2 | NP_649645 9E-46 Smd2p, 5E-25 TIGR | MIPS | TAIR:5e-039
TIGR | MIPS
LSM3 LSM3 | NP_651203 6E-29 Lsm3p, E-05 TIGR | MIPS | TAIR:3e-017
TIGR | MIPS
snRNP core protein SmD3 SmD3 | NP_725106 E-37 Smd3p, E-20 TIGR | MIPS | TAIR:2e-027
TIGR | MIPS
LSM4, U6 snRNA-associated Sm-like LSM4 | NP_572211 3E-70 Lsm4p, E-10 TIGR | MIPS | TAIR:3e-041
snRNP core protein SmE SmE | NP_609162 8E-33 Sme1p, E-17 TIGR | MIPS | TAIR:7e-029
TIGR | MIPS
LSM5 LSM5 | NP_648022 5E-35 Lsm5p, E-13 TIGR | MIPS | TAIR:3e-028
snRNP core protein SmF SmF | NP_523708 6E-37 Smx3p, 3E-16 TIGR | MIPS | TAIR:6e-030
LSM6 LSM6 | NP_611528 8E-33 Lsm6p, 7E-09 TIGR | MIPS | TAIR:2e-028
TIGR | MIPS
TIGR | MIPS
snRNP core protein SmG SmG | NP_573139 7E-27 Smx2p, 7E-16 TIGR | MIPS | TAIR:9e-021
TIGR | MIPS
LSM7 LSM7 | NP_609807 8E-38 Lsm7p, 3E-16 TIGR | MIPS | TAIR:1e-029
LSM1 | NP_611559 LSM1, E-43; LSM8, E-10 Lsm1p, 7E-19 TIGR | MIPS | TAIR:1e-021
TIGR | MIPS
 
U1 snRNP
U1-70k Subunit U1-70K | NP_477205 2E-62 Snp1p, 7E-31 TIGR | MIPS | TAIR:3e-041
TIGR | MIPS
U1C Subunit U1C | NP_650767 8E-28 Yhc1p, 4E-06 TIGR | MIPS | TAIR:9e-018
U1A Subunit  U1A | NP_511045 7E-32 Mud1p,  6E-04 TIGR | MIPS | TAIR:2e-051
TIGR | MIPS
TIGR | MIPS
Luc7-like protein, unnamed Luc7-like | NP_648991 4E-84 Luc7p, 5E-24 TIGR | MIPS | TAIR:4e-024
TIGR | MIPS
TIGR | MIPS
related to Luc7-like protein, CG7564 Luc7-like | NP_572337 3E-06 Luc7p, 3E-06 TIGR | MIPS | TAIR:9e-032
Prp39/Prp42 Prp39/Prp42 | NP_651634 BAA92024, 2E-30; BAA91318, 9E-23 Prp39p, 2E-13; Prp42p, 1E-06 TIGR | MIPS | TAIR:7e-031
TIGR | MIPS
Prp40-like protein Prp40-like | NP_722868 2E-98 (aka mouse Huntington yeast partner C, E-100) Prp40p, 3E-14 TIGR | MIPS | TAIR:3e-054
TIGR | MIPS
 
U2 snRNP
U2B" Subunit  U2B | NP_511045 7E-68 Msl1p, 3E-05 TIGR | MIPS | TAIR:2e-051
TIGR | MIPS
TIGR | MIPS
U2A' Subunit U2A | NP_610315 7E-54 Lea1p, 5E-05 TIGR | MIPS | TAIR:3e-039
SF3a60/SAP61 Subunit SF3a60/SAP61 | NP_477114 E-180 Prp9p, 3E-31 TIGR | MIPS | TAIR:1e-103
SF3a66/SAP62 Subunit SF3a66/SAP62 | NP_648603 E-100 Prp11, 1E-13 TIGR | MIPS | TAIR:6e-073
SF3a120/SAP114 Subunit SF3a120/SAP114 | NP_650583 E-155 Prp21p, 2E-11 TIGR | MIPS | TAIR:2e-068
TIGR | MIPS
TIGR | MIPS
SF3b53/SAP49 Subunit SF3b53/SAP49 | NP_511058 E-109 Hsh49p, 1E-28 TIGR | MIPS | TAIR:1e-088
TIGR | MIPS
SF3b150/SAP145 Subunit SF3b150/SAP145 | NP_608739 E-178 Cus1p, 5E-38 TIGR | MIPS | TAIR:7e-089
TIGR | MIPS
SF3b120/SAP130 Subunit SF3b120/SAP130 | NP_728546 E=0 Rse1p, 8E-78 TIGR | MIPS | TAIR:0.0
TIGR | MIPS
TIGR | MIPS
SF3b160/SAP155 Subunit SF3b160/SAP155 | NP_608534 E=0 Hsh155p, E=0 TIGR | MIPS | TAIR:0.0
FBP21-like protein that associates with U2 snRNPs FBP21-like | NP_608551 7E-28 Prp40p, 0.078 TIGR | MIPS | TAIR:2e-014
 
U5 snRNP
220kD Subunit 220KD Subunit | NP_610735 E=0 Prp8p, E=0 TIGR | MIPS | TAIR:0.0
TIGR | MIPS
200kD Subunit 200KD Subunit | NP_650472 E=0 Slh1p, E=0; Brr2, E=0 TIGR | MIPS | TAIR:0.0
TIGR | MIPS
40kD Subunit 40KD Subunit | NP_608501 E-120; SPF38, 7E-99 unknown TIGR | MIPS | TAIR:1e-106
TIGR | MIPS
116 kD Subunit 116KD Subunit | NP_651605 E=0; E=0, mouse unknown TIGR | MIPS | TAIR:0.0
TIGR | MIPS
Prp6-like Prp6-like | NP_649073 E=0 Prp6p, 6E-74 TIGR | MIPS | TAIR:0.0
TIGR | MIPS
100kD Subunit 100KD Subunit | NP_609888 E=0 Prp28p, 2E-92 TIGR | MIPS | TAIR:1e-166
15kD Subunit 15KD Subunit | NP_608830 5E-78 Dib1p, 3E-48 TIGR | MIPS | TAIR:2e-069
TIGR | MIPS
 
Tri-snRNP
SAP90 SAP90 | NP_649156 E-144 Prp3p, 1E-23 TIGR | MIPS | TAIR:3e-069
TIGR | MIPS
TIGR | MIPS
SAP60 SAP60 | NP_648990 E-138 Prp4p, 9E-56 TIGR | MIPS | TAIR:1e-079
TIGR | MIPS
15.5 kD Subunit 15.5KD Subunit | NP_524714 4E-39 Snu13p, 9E-32 TIGR | MIPS | TAIR:3e-035
TIGR | MIPS
TIGR | MIPS
20 kD cyclophilin  20KD Cyclophilin | AAF57375 4E-80 unknown TIGR | MIPS | TAIR:7e-069
 
Pre-spliceosome splicing factors
Small subunit of U2AF  U2AF Subunit | NP_477208 5E-88 unknown TIGR | MIPS | TAIR:9e-060
TIGR | MIPS
TIGR | MIPS
Related to small subunit of U2AF (Urp) U2AF(Urp) Subunit | NP_608857 1E-20 unknown TIGR | MIPS | TAIR:5e-025
TIGR | MIPS
TIGR | MIPS
Large subunit of U2AF U2AF Subunit | NP_476891 E-146 Mud2p, 0.22 TIGR | MIPS | TAIR:8e-056
TIGR | MIPS
SF1/BBP  SF1/BBP | NP_524654 E-106 Msl5p, 4E-62 TIGR | MIPS | TAIR:3e-060
small subunit of nuclear cap binding protein Nuclear cap | NP_524396 4E-63 Cbc2p, 6E-37 TIGR | MIPS | TAIR:7e-052
large subunit of nuclear cap binding protein Nuclear cap | NP_726938 E=0 Sto1p, 3E-23 TIGR | MIPS | TAIR:4e-072
CBP80-like CBP80-like | NP_650969 1E-96 Sto1p, 1E-10 TIGR | MIPS | TAIR:3e-049 *
UAP56 UAP56 | NP_723089 E=0 Sub2p, E-141 TIGR | MIPS | TAIR:1e-159
TIGR | MIPS
Prp5-like Prp5-like | NP_573020 E=0 Prp5p, 4E-94 TIGR | MIPS | TAIR:1e-169 *
 
Catalytically active spliceosome splicing factors
Prp31 Prp31 | NP_648756 E-141 Prp31p, 3E-22 TIGR | MIPS | TAIR:9e-082
TIGR | MIPS
pombe, cdc5-related protein Pombe | NP_612033 E=0 Cef1p, 4E-49 TIGR | MIPS | TAIR:1e-169
Prp2  Prp2 | NP_609946 E=0 (64%) Prp2p, E-172 TIGR | MIPS | TAIR:0.0
TIGR | MIPS
TIGR | MIPS
TIGR | MIPS
Prp22  Prp22 | NP_610928 E=0 Prp22p, E=0 TIGR | MIPS | TAIR:0.0
TIGR | MIPS
Prp16  Prp16 | NP_727764 E=0 (85%) Prp16p, E-160 TIGR | MIPS | TAIR:0.0 *
Prp17 Prp17 | NP_651005 E=0 Cdc40p, 1E-76 TIGR | MIPS | TAIR:1e-151
TIGR | MIPS
TIGR | MIPS
Prp43  Prp43 | NP_610269 E=0 Prp43p, E=0 TIGR | MIPS | TAIR:0.0
TIGR | MIPS
Slu7 Slu7 | NP_651659 E-128 Slu7p, 1E-12 TIGR | MIPS | TAIR:1e-097
TIGR | MIPS
TIGR | MIPS
Prp18 Prp18 | NP_650776 4E-93 Prp18p, 4E-05 TIGR | MIPS | TAIR:2e-047
TIGR | MIPS
Srm160 Srm160 | NP_648627 2E-43 unknown TIGR | MIPS | TAIR:9e-028
Srm300 Srm300 | NP_647642 2E-37 unknown TIGR | MIPS | TAIR:0.036 *
 
SR proteins
SC35 SC35 | NP_652612 5E-53 Npl3, 3E-12 TIGR | MIPS | TAIR:3e-020
TIGR | MIPS
TIGR | MIPS
TIGR | MIPS
TIGR | MIPS
TIGR | MIPS
ASF/SF2 ASF/SF2| NP_652611 2E-80 Npl3, 7E-15 TIGR | MIPS | TAIR:6e-041
TIGR | MIPS
TIGR | MIPS
TIGR | MIPS
B52 B52 | NP_788668 SRp55, 2E-96; SRp75, 3E-96; Srp40/HRS, E-78 Npl3, 3E-14 TIGR | MIPS | TAIR:8e-026 *
9G8   | NP_723226 4E-42 Npl3, 7E-05 TIGR | MIPS | TAIR:7e-019
TIGR | MIPS
RBP1  RBP1 | NP_524307 2E-38 Nsr1p, E-07 TIGR | MIPS | TAIR:4e-015
TIGR | MIPS
RBP1-like RBP1-like | NP_572880 4E-31 Nsr1p, 4E-14 TIGR | MIPS | TAIR:1e-014 *
SRp54 SRp54 | NP_477347 2E-41 none significant TIGR | MIPS | TAIR:7e-004
 
SR protein kinase homologs
SR protein kinase (e-values relative to SRPK1) SR Protein Kinase 1 | NP_725458; SR Protein Kinase 2 | NP_649387; SR Protein Kinase 3 | NP_573080 SRPK1, 9E-76; SRPK2, 3E-77 Sky1p, 5E-44 TIGR | MIPS | TAIR:1e-52
TIGR | MIPS
TIGR | MIPS
TIGR | MIPS
TIGR | MIPS
Lammer, CLK kinase Doa | NP_477275 CLK2, E-151 Kns1p, 5E-61 TIGR | MIPS | TAIR:4e-92
 
Miscellaneous proteins
crooked neck/CLF1  crn | NP_477118 unknown Clf1p, E-123 TIGR | MIPS | TAIR:0.0
poly-U binding protein  Poly-U binding Protein | NP_525123 1E-81 unknown TIGR | MIPS | TAIR:2e-017 *
poly-pyrimidine tract associated protein Poly-pyrimidine | NP_536740 5E-62 unknown TIGR | MIPS | TAIR:2e-010 *
poly-pyrimidine tract binding protein (PTB), hnRNP I hnRNP I | NP_733461 E-150 none < E-4 TIGR | MIPS | TAIR:8e-027
TIGR | MIPS
TIGR | MIPS
Most similar to Survival of Motor Neuron Protein (SMN) SMN | NP_524112 1E-05   TIGR | MIPS | TAIR:1.9
SPF30, SMN-related  SPF30 | AAF45352 3E-37   TIGR | MIPS | TAIR:3e-009 *
survival of motor neuron interacting protein (SIP1) SIP1 | NP_649092 2E-18 Brr1p, E=0.05 TIGR | MIPS | TAIR:0.013
debranching enzyme Debranching Enzyme | NP_648175 E-113 Dbr1p, 2E-62 TIGR | MIPS | TAIR:1e-101
SR protein-like RRM RSF1 | NP_477001 SR-like RRM matches SR proteins. Acts as a repressor of splicing. See Bourbon et al. RRM domain matches only. TIGR | MIPS | TAIR:1e-001
 
Cleavage and polyadenylation
 
General
 
PolyA polymerase PolyA Polymerase | NP_536790 E-173 Pap1p, E-102 TIGR | MIPS | TAIR:1e-104
TIGR | MIPS
TIGR | MIPS
TIGR | MIPS
PolyA binding protein  pAbp | NP_725753 E-176 , mouse Pab1p, E-106 TIGR | MIPS | TAIR:1e-123
TIGR | MIPS
PolyA binding protein II Pabp2 | NP_476902 2E-54 Sgn1p, 2E-18 TIGR | MIPS | TAIR:5e-035
TIGR | MIPS
TIGR | MIPS
possible polyA binding protein PolyA binding Protein | NP_611924 9E-48, mouse; 8E-46, human Pab1p, 2E-24 TIGR | MIPS | TAIR:1e-045 *
 
CPSF (cleavage and polyadenylation specificity factor)
CPSF-160 kd. Subunit CPSF-160KD Subunit | NP_725397 E=0 Cft1p, 4E-37 TIGR | MIPS | TAIR:1e-155
CPSF-100 kd. Subunit CPSF-100KD Subunit | NP_651658 E=0 , bovine Ysh1p, E-32 TIGR | MIPS | TAIR:1e-139
CPSF-73 kd. Subunit CPSF-73KD Subunit | NP_650738 0 Ysh1p, E-146 TIGR | MIPS | TAIR:0.0
possible variant CPSF-73 kd. Subunit CPSF-73KD Subunit | NP_651721 7E-93 Ysh1p, E-73 TIGR | MIPS | TAIR:1e-155
related to CPSF-100 and -73 Related to CPSF | NP_648838 0; E-113 Ysh1p, 6E-14 TIGR | MIPS | TAIR:1e-021
CPSF-30 kd. Subunit  Clp | NP_477156 8E-89 Yth1p, 1E-34 TIGR | MIPS | TAIR:7e-021
TIGR | MIPS
 
CstF (cleavage-polyadenylation stimulation factor)
CstF-77 kd. Subunit  | AAF45314 E=0 Rna14p, 2E-47 TIGR | MIPS | TAIR:4e-092
CstF-64 kd. Subunit CstF-64KD Subunit | NP_477453 E-71 Rna15p, 2E-18 TIGR | MIPS | TAIR:4e-040
CstF-50 kd. Subunit CstF-50KD Subunit | NP_651883 E-129 no known homolog TIGR | MIPS | TAIR:1e-066
symplekin Symplekin | NP_649580 E-173 Pta1p, 5E-04 TIGR | MIPS | TAIR:1e-052
TIGR | MIPS
 
Cleavage factors I and II
cleavage factor Im - 25 kd. Subunit Im-25KD Subunit | NP_648308 E-102 unknown TIGR | MIPS | TAIR:2e-052
TIGR | MIPS
cleavage factor Im - 68 kd. Subunit Im-68KD Subunit | NP_648206 3E-34 unknown TIGR | MIPS | TAIR:7e-009
TIGR | MIPS
Yeast cleavage and polyA factor IA homolog IA Homolog | NP_611565 7E-41 Pcf11p, 3E-9 TIGR | MIPS | TAIR:7e-013 *
 
 
Miscellaneous proteins not in published tables (under development)
Splicing factor domains BcDNA:GH01073 | NP_651291 E-39 unknown TIGR | MIPS | TAIR:5e-085
Splicing factor (CC1.3) CC1.3 | NP_723243 E-123; PUF60, 3E-18 unknown TIGR | MIPS | TAIR:4e-066
TIGR | MIPS
PSI protein (P somatic inhibitor) PSI | NP_477123 unknown   unknown TIGR | MIPS | TAIR:2e-016
 
hnRNP proteins (under development)
 
Genes identified in Drosophila with roles in splicing (under development)
 
missing genes (under development)

* indicates Arabidopsis proteins that are the best match in the Arabidopsis genome to the Drosophila query but whose best match in Drosophila is not the query. These cases are currently being resolved by phylogenetic analysis, the results of which will soon be accessible from the first column via a tree icon.
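The asterisk criterion amounts to a failed reciprocal best-match test, which can be sketched as follows. This is a minimal illustration, not the code used to annotate the table: the function names and the nested-dictionary layout (query name mapped to a dictionary of subject names and E values) are assumptions made for the example, and any gene names passed in are hypothetical.

```python
def best_match(scores):
    """Return the subject with the lowest E value from a {subject: evalue} dict."""
    return min(scores, key=scores.get)


def needs_asterisk(query, fly_vs_arabidopsis, arabidopsis_vs_fly):
    """Return True when the reciprocal best match fails for a Drosophila query.

    fly_vs_arabidopsis maps each Drosophila protein to its Arabidopsis hits;
    arabidopsis_vs_fly maps each Arabidopsis protein to its Drosophila hits.
    Both map names to {subject: evalue} dicts (hypothetical layout).
    """
    # Best Arabidopsis match to the Drosophila query.
    arabidopsis_best = best_match(fly_vs_arabidopsis[query])
    # Search back: which Drosophila protein does that gene match best?
    drosophila_best = best_match(arabidopsis_vs_fly[arabidopsis_best])
    return drosophila_best != query
```

A row would carry an asterisk when `needs_asterisk` returns True, that is, when the Arabidopsis gene linked for a Drosophila protein is itself more similar to some other Drosophila protein.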

