

Annotated Information


  • A novel SET-domain-containing gene, OsSET1, was isolated from rice (Oryza sativa L.). Its deduced protein consists of 895 amino acids. OsSET1 shows a high degree of structural similarity to other SET-domain-containing genes such as CLF in higher plants and E(z) in animals[1].
  • SET domains are conserved amino acid sequences present in chromosomal proteins that contribute to the epigenetic control of gene expression by altering the regional organization of chromatin structure. The SET domain proteins are divided into four subgroups, named after their Drosophila members: enhancer of zeste (E(Z)), trithorax (TRX), absent small or homeotic 1 (ASH1) and suppressor of variegation (SU(VAR)3–9). Homologs of all four classes have been characterized in yeast, mammals and plants. We report here the isolation and characterization of a rice (Oryza sativa L. subspecies indica) cDNA, OsiEZ1, as a monocot member of this family. The OsiEZ1 cDNA is 3133 bp long with an ORF of 2799 bp, and the predicted amino acid sequence (895 residues) corresponds to a protein of ca. 98 kDa. All the characteristic domains known to be conserved in E(Z) homologs (subgroup I) of SET-domain-containing proteins are present in OsiEZ1. In the rice genome, a 7499 bp long OsiEZ1 sequence is split into 17 exons interrupted by 16 introns. Southern analysis indicates that OsiEZ1 is represented as a single copy in the rice genome. Expression studies revealed that the OsiEZ1 transcript level was highest in rice flowers, almost undetectable in developing seeds 1–2 days post-fertilization, but increased significantly in young seeds 3–5 days post-fertilization. The OsiEZ1 transcript was barely detectable in mature zygotic embryos, but its levels were significantly higher in callus derived from rice scutellum, somatic embryos and young seedlings. The OsiEZ1/GUS recombinant protein was confined to the nucleus in living cells of particle-bombarded onion peels. The expression of OsiEZ1 complemented a set1Δ Saccharomyces cerevisiae mutant that is impaired in telomeric silencing.
We suggest that the nuclear-localized OsiEZ1 has a role in regulating various aspects of plant development, and this control is most likely brought about by repressing the activity of downstream regulatory genes [2].
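
The two reports quote slightly different masses for the 895-residue protein (ca. 98 kDa in [2], 99.8 kDa in [1]). As a rough sanity check, a minimal sketch is given below. It assumes the common rule of thumb of about 110 Da per amino acid residue; that average and the function name are assumptions for illustration and do not come from either reference. A sequence-based calculation would give a more precise figure.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This value is an assumption for illustration, not taken from the cited papers.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues, avg_residue_mass_da=AVG_RESIDUE_MASS_DA):
    """Return a rough protein mass in kDa from the residue count alone."""
    if n_residues <= 0:
        raise ValueError("residue count must be positive")
    return n_residues * avg_residue_mass_da / 1000.0


if __name__ == "__main__":
    # 895 residues is the protein length reported in both references.
    print(f"{approx_mass_kda(895):.2f} kDa")
```

The estimate falls close to both published values, which is about as much as a residue-count approximation can show.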


Fig. 1. Comparison of conserved regions in the SET‐domain between OsSET1 and selected representative proteins. A sequence alignment of SET‐N and SET‐C of the SET domain is shown, together with the lengths of the variable insert regions (SET‐I). Identical residues are shaded in black and conservative changes are shaded in grey. The putative conserved secondary structural elements are roughly indicated above the sequence alignment: arrows refer to β strands and bold bars refer to α helices. Regions putatively involved in binding to the cofactor product AdoHcy are indicated in green, and three other highly conserved sequence regions (the last two of which form the unusual knot structure) are indicated with a blue bar below the aligned sequences. The invariant tyrosine residue implicated as a general base for catalysis is indicated with a black triangle below the alignment. No CLC and CRC conserved sequences were found in the Post‐SET region of OsSET1. DDBJ/EMBL/GenBank accession numbers: DIM‐5_Nc (AAL35215), SET7/9_Hs (AAL56579), Clr4_Sp (CAA07709), SET1_Os (AAK28975), OsSET1_Os (AAN01115), RubiscoLs (AAA69903) (from reference [1]).
Fig. 2. Subcellular localization of the OsSET1 protein. Green fluorescence was observed in nuclei of transformed onion epidermis (arrowheads, A and B), but not in nuclei of negative controls (arrowheads, C and D). Bars: 60 μm (from reference [1]).
  • So far, 14 rice SET-domain-containing genes can be found in the SMART database. Apart from the two putative OsCLFs (AP005813; AP003044), the other 11 putative SET-containing genes have low sequence similarity to OsSET1, even in the SET domain. Detailed sequence analysis revealed that the OsSET1 gene has all known conserved regions, e.g. SET-N and SET-C in the SET domain, but lacks post-SET (Fig. 1). Similar to other plant SET-domain-containing genes such as CLF and MEZ1-3, OsSET1 has only a cysteine-rich region and no pre-SET domain. Based on these sequence characteristics, OsSET1 could be grouped into the SET1 family. The expression pattern of OsSET1 was similar to that of the SET-domain-containing genes investigated in Arabidopsis and maize in that it lacked organ specificity (data not shown). A transient expression assay revealed that the fusion protein of OsSET1 and green fluorescent protein (GFP) was located in the nuclei (Fig. 2), as is the case for other SET-containing proteins such as E(z) and CLF. To investigate the function of the OsSET1 gene, a series of transgenic Arabidopsis and rice lines were constructed. Among them, about 53.8% of the transgenic Arabidopsis plants over-expressing the SET domain showed altered shoot development (Fig. 3B) as well as large cotyledons (Fig. 3A, B). No tunica-corpus structure was observed in sections of the shoot apex of transgenic plants with abnormal shoots (Fig. 3D)[1].
Fig. 3. Abnormal shoot development of transgenic plants over‐expressing the SET domain of OsSET1 in Arabidopsis. Seedlings of wild-type and transgenic Arabidopsis on day 11 after germination are shown in (A) and (B) at the same magnification. Six true leaves were observed in the wild-type plant, but no true leaf was observed in the transgenic plant at this stage. The cotyledons of the transgenic plant were bigger than those of the wild-type plant. Longitudinal sections of the shoot apices of wild-type and transgenic Arabidopsis seedlings on day 11 after germination are shown in (C) and (D). Bar: 60 µm (from reference [1]).
  • To isolate SET‐domain genes from rice, a conserved SET‐domain sequence was first amplified by RT‐PCR using degenerate primers designed from the published sequences of CLF, E(z) and MEA (CEM1: 5′‐TCTGA(TC)T(TC)(TCG)(AC)(TC)GG(TAC)TGGGGTGC‐3′; CEM2: 5′‐GC(AT)(TC)C(TAC)TCTGG(TC)(CT)C(AG)TA(GCT)C(AGT)GTA‐3′). A 344 bp PCR product was cloned into the pGEM‐T Easy vector (Promega). After the PCR product was confirmed by sequencing to be a SET domain, it was used as a probe to screen a cDNA library constructed from young panicles. A full-length cDNA containing the SET domain was obtained by 5′‐RACE after the library screening. This cDNA is 2957 bp long and contains an ORF that encodes a putative protein of 895 amino acids with a calculated molecular mass of 99.8 kDa. The gene was designated OsSET1 (GenBank accession number AF407010). It is located on chromosome 3 of the rice genome, in contig 1300 (http://www.softberry.com/berry.phtml?topic=gfind&prg=FGENESH; GenBank accession number AAAA01003815). Interestingly, five genes were predicted in this contig, and the OsSET1 cDNA sequence was predicted as genes four and five by FGENESH1.1. The OsSET1 sequence data now correct this prediction. According to the rice genome sequence data, OsSET1 contains 17 exons (data not shown)[1].
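
The degenerate primers CEM1 and CEM2 are written with the alternative bases at each mixed position in parentheses. The sketch below shows one way to read that notation and count how many distinct oligonucleotides each primer represents. The function names are hypothetical, and the primer strings are copied from the text above with internal whitespace ignored.

```python
import re
from itertools import product


def parse_degenerate(primer):
    """Split a primer into per-position base choices.

    A bare letter is a fixed base; a parenthesised group such as (TC)
    is one position where any of the listed bases may occur.
    """
    primer = "".join(primer.split())  # ignore spaces and line breaks
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGT])", primer)
    return [alternatives or base for alternatives, base in tokens]


def degeneracy(primer):
    """Number of distinct sequences represented by the primer."""
    count = 1
    for choices in parse_degenerate(primer):
        count *= len(choices)
    return count


def expand(primer):
    """Lazily yield every concrete sequence covered by the primer."""
    for combo in product(*parse_degenerate(primer)):
        yield "".join(combo)


CEM1 = "TCTGA(TC)T(TC)(TCG)(AC)(TC)GG(TAC)TGGGGTGC"
CEM2 = "GC(AT)(TC)C(TAC)TCTGG(TC)(CT)C(AG)TA(GCT)C(AGT)GTA"

if __name__ == "__main__":
    for name, primer in (("CEM1", CEM1), ("CEM2", CEM2)):
        length = len(parse_degenerate(primer))
        print(name, "length:", length, "degeneracy:", degeneracy(primer))
```

Under this reading CEM1 is a 22-mer covering 144 sequences and CEM2 is a 23-mer covering 864 sequences. The count is simply the product of the number of alternatives at each mixed position.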


Fig. 4. Phylogenetic tree of SET-domain-containing proteins from different organisms, based on the alignment of the amino acid sequences of the SET domain. The proteins are assigned to subgroups I–IV according to the earlier classification by Jenuwein et al. (1998), which depends on homology in the SET domain (from reference [2]).
  • Phylogenetic analysis using DNASTAR MegAlign 4.03, based on homology in the SET domain, classifies SET-domain-containing proteins into different subgroups and shows that OsiEZ1 belongs to subgroup I, which also includes AtMEA, AtCLF, OsCLF, AtEZA1 and AtPcG (Fig. 4). The multiple sequence alignment of these proteins in conserved regions was done using Gene Runner version 3.04 and DNASTAR MegAlign 4.03[2].

  • Key to sequence designations: AAB80647, AAK28967, CAB41104, AAC61820, AAD15582, AAK28966, AAD55657, AAD26896, AtEZA, AtCLF1, AtPCG, AtCLF, CAA71599, AtMEDEA, AtMEALIKE, AAC23419, CAB75815, AAC23419, AAC34358 and AAF04434, all from Arabidopsis thaliana; NtSET1, from Nicotiana tabacum; CLR4, from Schizosaccharomyces pombe; G9a and ENX1, both from Homo sapiens; SET1 and SET2, both from S. cerevisiae; TRX, DME(Z), DMEZ2MM, DMEZA2 and DMEZH2, all from Drosophila melanogaster; CEMES2, from C. elegans[2].
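
The grouping above rests on how similar the aligned SET-domain sequences are to one another. As an illustration only, the sketch below computes a simple pairwise distance (the fraction of differing residues over ungapped alignment columns), which is one common starting point for distance-based trees. It is not necessarily what MegAlign computes, and the three short aligned sequences are made-up toy strings, not real SET-domain sequences.

```python
def p_distance(a, b):
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(aligned):
    """Return {(name1, name2): distance} for every unordered pair of sequences."""
    names = sorted(aligned)
    return {
        (n1, n2): p_distance(aligned[n1], aligned[n2])
        for i, n1 in enumerate(names)
        for n2 in names[i + 1:]
    }


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "seqA": "GCEVYNCR",
    "seqB": "GCEVFNCR",
    "seqC": "GCDV-NSK",
}

if __name__ == "__main__":
    for pair, dist in distance_matrix(toy_alignment).items():
        print(pair, round(dist, 3))
```

Sequences with smaller pairwise distances would be placed closer together in a distance-based tree. Real analyses usually apply a substitution model rather than raw mismatch counts.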

Labs working on this gene

  • PKU-Yale Joint Research Center of Agricultural and Plant Molecular Biology, National Key Laboratory of Protein Engineering and Plant Gene Engineering, College of Life Sciences, Peking University, 5 Yiheyuan Road, Beijing 100871, PR China
  • Department of Plant Molecular Biology, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India
  • Biotechnology Section, Division of Crop Improvement, IGFRI, Jhansi 284003, India
  • International Center for Genetic Engineering and Biotechnology, New Delhi 110067, India
  • CSIRO Plant Industry, GPO BOX 1600, ACT 2601, Australia
  • Present address: Vitagrain, Uttara Model Town, Dhaka, Bangladesh
  • Present address: IRRI, Los Banos, Laguna 4031, Philippines
  • Bio-crop Development Division, Department of Agricultural Bio-resources, National Academy of Agricultural Science, RDA, Suwon 441-701, Republic of Korea


  1. OsSET1, a novel SET‐domain‐containing gene from rice. Journal of Experimental Botany, 2003, 54(389): 1995-1996
  2. A POLYCOMB group gene of rice (Oryza sativa L. subspecies indica), OsiEZ1, codes for a nuclear-localized protein expressed preferentially in young seedlings and during reproductive development. Gene, 2003, 314: 1-13

Structured Information