Recently, I was working on a UniProt parser for the next BioGroovy release, and while looking through the UniProt schema I started to wonder why the elements in schemas weren’t annotated with references to ontologies. Let’s take a look at an example.
A typical UniProt record contains many “reference” elements like this:
<reference key="2">
  <citation type="journal article" date="1998" name="Biochem. Biophys. Res. Commun." volume="244" first="285" last="292">
    <title>cDNA cloning, expression, subcellular localization, and chromosomal assignment of mammalian aurora homologues, aurora-related kinase (ARK) 1 and 2.</title>
    <authorList>
      <person name="Shindo M."/>
      <person name="Nakano H."/>
      <person name="Kuroyanagi H."/>
      <person name="Shirasawa T."/>
      <person name="Mihara M."/>
      <person name="Gilbert D.J."/>
      <person name="Jenkins N.A."/>
      <person name="Copeland N.G."/>
      <person name="Yagita H."/>
      <person name="Okumura K."/>
    </authorList>
    <dbReference type="PubMed" id="9514916"/>
    <dbReference type="DOI" id="10.1006/bbrc.1998.8250"/>
  </citation>
  <scope>NUCLEOTIDE SEQUENCE [MRNA]</scope>
  <scope>VARIANT ILE-57</scope>
</reference>
And my question was: “What do the scope elements signify?” Are they evidence of an assertion about the function of a protein, similar to the GeneRIFs in EntrezGene? In the example above, does VARIANT ILE-57 mean that there exists a variant of this protein called “ILE-57”, and that this document is evidence of that assertion?
In order to answer that question, I started digging around in the XML Schema file for the UniProt file format. The schema’s rather cryptic answer was this:
<xs:element name="scope" type="xs:string" maxOccurs="unbounded">
  <xs:annotation>
    <xs:documentation>Describes the scope of a citation. Equivalent to the flat file RP-line.</xs:documentation>
  </xs:annotation>
</xs:element>
While musing over this answer (“What’s an RP-line?”), it occurred to me that some light might be cast on the situation if the elements in the schema pointed to a well-documented ontology. Is such a thing even possible? One quick Google search later, I arrived at Semantic Annotations for WSDL and XML Schema (SAWSDL). Here’s a snippet of XML that shows what embedding an ontology model reference in an XML schema looks like:
<xs:simpleType name="Confirmation" sawsdl:modelReference="http://www.w3.org/2002/ws/sawsdl/spec/ontology/purchaseorder#OrderConfirmation">
The one drawback of this approach is that if you’re trying to read the documentation found in the ontology, you’re going to be doing a lot of clicking. Otherwise, you’ll need to transform the schema into a more comprehensive, human-readable document that consolidates the information from the schema and the ontology in one place.
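As a rough sketch of the first step of that consolidation, the Python below walks a schema and collects each annotated declaration's name, its ontology URIs, and its inline documentation. The schema fragment is hypothetical: it is a trimmed copy of the UniProt scope element with a sawsdl:modelReference attribute added for illustration, and the ontology URI on it is an assumption, not something the real schema declares. The function name model_references is likewise made up for this example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
SAWSDL = "http://www.w3.org/ns/sawsdl"

# Hypothetical fragment: the real UniProt schema has no modelReference
# attribute, and the ontology URI used here is an assumption.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:sawsdl="http://www.w3.org/ns/sawsdl">
  <xs:element name="scope" type="xs:string"
              sawsdl:modelReference="http://purl.uniprot.org/core/scope">
    <xs:annotation>
      <xs:documentation>Describes the scope of a citation. Equivalent to the flat file RP-line.</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="title" type="xs:string"/>
</xs:schema>
"""


def model_references(schema_text):
    """Map each annotated declaration name to (ontology URIs, documentation)."""
    root = ET.fromstring(schema_text)
    result = {}
    for node in root.iter():
        refs = node.get(f"{{{SAWSDL}}}modelReference")
        name = node.get("name")
        if refs is None or name is None:
            continue  # skip declarations without a semantic annotation
        doc = node.find(f"{{{XS}}}annotation/{{{XS}}}documentation")
        text = doc.text.strip() if doc is not None and doc.text else ""
        # A modelReference value is a whitespace-separated list of URIs.
        result[name] = (refs.split(), text)
    return result


if __name__ == "__main__":
    for name, (uris, doc) in model_references(SCHEMA).items():
        print(name, uris, doc)
```

A fuller version would then fetch each URI and merge the ontology's own description into the output, which is the part that saves the clicking.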

It looks like ‘scope’ is http://www.uniprot.org/core/scope. UniProt publishes a lot (all?) of its data in RDF, so I got there by taking an example record, http://www.uniprot.org/uniprot/P62577, and comparing its XML and RDF/XML outputs. In the RDF/XML, there is:
PROTEIN SEQUENCE
In the XML namespace http://purl.uniprot.org/core/.
It sounds like you’re proving my point: if the schema had been annotated in the first place, the detective work needed to understand the XML would be much easier, or perhaps unnecessary.