PrimeKG Enrichment and Embedding¶
In this tutorial, we will explain how to perform multimodal enrichment and embedding of PrimeKG nodes.
We will consider the following node types:
- Drugs (PubChem/DrugBank/CTD) - TEXT and SMILES
- Proteins (NCBI/Gene) - TEXT and amino-acid sequence
- Pathways (Reactome) - TEXT
- Phenotypes (HPO) - TEXT
- Protein function (GO) - TEXT
- Disease (MONDO) - TEXT
- Anatomy (UBERON) - TEXT
Prior information about the PrimeKG can be found in the following repositories:
Note that we are leveraging the PrimeKG provided in Harvard Dataverse, which is publicly available in the following link:
https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/IXA7BM
At the time of writing this tutorial, the latest version of PrimeKG (kg.csv) is 2.1.
First of all, we need to import necessary libraries as follows:
# Import necessary libraries
import sys
import torch
import networkx as nx
from tqdm import tqdm
sys.path.append('../../..')
from aiagents4pharma.talk2knowledgegraphs.datasets.primekg import PrimeKG
from aiagents4pharma.talk2knowledgegraphs.utils.enrichments.uniprot_proteins import EnrichmentWithUniProt
from aiagents4pharma.talk2knowledgegraphs.utils.enrichments.ols_terms import EnrichmentWithOLS
from aiagents4pharma.talk2knowledgegraphs.utils.enrichments.reactome_pathways import EnrichmentWithReactome
from aiagents4pharma.talk2knowledgegraphs.utils.enrichments.pubchem_strings import EnrichmentWithPubChem
from aiagents4pharma.talk2knowledgegraphs.utils.embeddings.huggingface import EmbeddingWithHuggingFace
from aiagents4pharma.talk2knowledgegraphs.utils.pubchem_utils import external_id2pubchem_cid
Check device availability¶
device = "cuda:0" if torch.cuda.is_available() else "cpu"
device
'cpu'
Load the BioBERT model¶
# Using Microsoft's BiomedBERT (referred to as BioBERT in this tutorial)
biobert_model = EmbeddingWithHuggingFace(
    model_name='microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract',
    model_cache_dir="../../../../data/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract/",
    truncation=False,
    device=device,
)
Load PrimeKG¶
The PrimeKG dataset class downloads the data from the Harvard Dataverse server if it is not available locally.
Otherwise, the data is loaded from the local directory specified by local_dir.
# Define primekg data by providing a local directory where the data is stored
primekg_data = PrimeKG(local_dir="../../../../data/primekg/")
To load the dataframes of nodes and edges from PrimeKG, we just need to invoke a method as follows.
# Invoke a method to load the data
primekg_data.load_data()
# Get primekg_nodes and primekg_edges
primekg_nodes = primekg_data.get_nodes()
primekg_edges = primekg_data.get_edges()
Loading nodes of PrimeKG dataset ...
../../../../data/primekg/primekg_nodes.tsv.gz already exists. Loading the data from the local directory.
Loading edges of PrimeKG dataset ...
../../../../data/primekg/primekg_edges.tsv.gz already exists. Loading the data from the local directory.
Check PrimeKG Dataframes¶
As mentioned before, the primekg_nodes and primekg_edges are the dataframes of nodes and edges, respectively.
We can further analyze the dataframes to extract the information we need.
For instance, we can construct a graph from the nodes and edges dataframes using the networkx library.
PrimeKG Nodes¶
primekg_nodes is a dataframe of nodes, which has the following columns:
- node_index: the index of the node
- node_name: the name of the node
- node_source: the source database of the node
- node_id: the identifier of the node in its source database
- node_type: the type of the node
We can check a sample of the primekg nodes to see the list of nodes in the PrimeKG dataset as follows.
# Check a sample of the primekg nodes
primekg_nodes.head()
| | node_index | node_name | node_source | node_id | node_type |
---|---|---|---|---|---|
0 | 0 | PHYHIP | NCBI | 9796 | gene/protein |
1 | 1 | GPANK1 | NCBI | 7918 | gene/protein |
2 | 2 | ZRSR2 | NCBI | 8233 | gene/protein |
3 | 3 | NRF1 | NCBI | 4899 | gene/protein |
4 | 4 | PI4KA | NCBI | 5297 | gene/protein |
The current version of PrimeKG has about 130K nodes in total, as we can observe in the following cell.
# Check dimensions of the primekg nodes
primekg_nodes.shape
(129375, 5)
We can break down the statistics of the primekg nodes by their types as follows.
# Show node types and their counts
primekg_nodes['node_type'].value_counts()
node_type
biological_process    28642
gene/protein          27671
disease               17080
effect/phenotype      15311
anatomy               14035
molecular_function    11169
drug                   7957
cellular_component     4176
pathway                2516
exposure                818
Name: count, dtype: int64
PrimeKG was built using various sources, as we can observe from their unique node sources as follows.
# Show source of the primekg nodes
primekg_nodes['node_source'].value_counts()
node_source
GO               43987
NCBI             27671
MONDO            15813
HPO              15311
UBERON           14035
DrugBank          7957
REACTOME          2516
MONDO_grouped     1267
CTD                818
Name: count, dtype: int64
primekg_nodes[primekg_nodes['node_source'] == 'CTD']
# primekg_edges.head()
| | node_index | node_name | node_source | node_id | node_type |
---|---|---|---|---|---|
61677 | 61677 | 1-hydroxyphenanthrene | CTD | C092102 | exposure |
61678 | 61678 | 1-hydroxypyrene | CTD | C033146 | exposure |
61679 | 61679 | 1-naphthol | CTD | C029350 | exposure |
61680 | 61680 | 2,2',3',4,4',5-hexachlorobiphenyl | CTD | C029790 | exposure |
61681 | 61681 | 2,2',3,5,5',6-hexachlorobiphenyl | CTD | C066675 | exposure |
... | ... | ... | ... | ... | ... |
127593 | 127593 | Heptanes | CTD | D006536 | exposure |
127594 | 127594 | octane | CTD | C026728 | exposure |
127595 | 127595 | pseudocumene | CTD | C010313 | exposure |
127596 | 127596 | pentane | CTD | C033353 | exposure |
127597 | 127597 | Butylated Hydroxyanisole | CTD | D002083 | exposure |
818 rows × 5 columns
test = EnrichmentWithPubChem()
test.enrich_documents(['24667'])
INFO:aiagents4pharma.talk2knowledgegraphs.utils.pubchem_utils:Load Hydra configuration for PubChem CID description.
(["Butylated Hydroxyanisole can cause cancer according to The World Health Organization's International Agency for Research on Cancer (IARC)."], ['CC(C)(C)C1=C(C=CC(=C1)O)OC.CC(C)(C)C1=C(C=CC(=C1)OC)O'])
Create a directed graph using the edges¶
kg = nx.DiGraph()
## Make a KG using the edgelist
G = nx.from_pandas_edgelist(
primekg_edges,
source="head_name",
target="tail_name",
edge_key="relation",
# edge_attr=["edge_id", "edge_type", "feature_value", "feature_id"],
create_using=nx.DiGraph(),
)
kg = nx.compose(G, kg)
Add additional node attributes (e.g. source, id and type)¶
# Start by slicing the dataframe to include only the head nodes
df_head_nodes = primekg_edges[['head_name', 'head_source', 'head_id', 'head_type']]
# Rename the columns
df_head_nodes = df_head_nodes.rename(columns={
'head_name': 'node_name',
'head_source': 'node_source',
'head_id': 'node_id',
'head_type': 'node_type'
})
# Set the node_name as index
df_head_nodes = df_head_nodes.set_index('node_name')
# Add the additional attributes to graph
G.add_nodes_from((n, dict(d)) for n, d in df_head_nodes.iterrows())
# Recompose the graph
kg = nx.compose(G, kg)
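The recompose step relies on how `nx.compose` merges node attributes: attributes of both graphs are combined per node, and when the same key exists in both, the second graph wins. The following is a minimal sketch on a toy graph (the node names `a` and `b` and all attribute values are made up for illustration and do not come from PrimeKG):

```python
import networkx as nx

# Graph carrying newly added attributes (plays the role of G)
G = nx.DiGraph()
G.add_node("a", description="toy description", node_type="old value")

# Existing graph with an edge and its own attributes (plays the role of kg)
kg = nx.DiGraph()
kg.add_node("a", node_source="CTD", node_type="exposure")
kg.add_edge("a", "b")

# Attributes are merged; on conflicting keys the second argument takes precedence
merged = nx.compose(G, kg)
print(merged.nodes["a"])
print(list(merged.edges))
```

In this tutorial the new attributes (such as description) are absent from kg before each recompose, so they are carried over from G unchanged.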
CTD enrichment¶
We will map CTD IDs to their corresponding PubChem IDs, and extract their descriptions and SMILES representation using EnrichmentWithPubChem.
from dataclasses import dataclass
# Create a dataclass to hold the node attributes
@dataclass
class PubChemAttr:
    """Dataclass to hold the attributes of a node."""
    pubchem_cid: str
    name: str
    # Make description optional
    # If not provided, it will be set to None
    description: str = None
    smiles: str = None
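Because the optional fields default to None, a record can be created first and enriched later. A small sketch of that two-step pattern, using a hypothetical `NodeAttr` class that mirrors `PubChemAttr` with explicit `Optional` type hints (the class name and the placeholder description are assumptions for illustration):

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class NodeAttr:
    """Hypothetical record mirroring PubChemAttr, with explicit Optional types."""
    identifier: str
    name: str
    description: Optional[str] = None
    smiles: Optional[str] = None

# Step 1: create the record with only the required fields
record = NodeAttr(identifier="24667", name="Butylated Hydroxyanisole")

# Fields that still need to be enriched
missing = [key for key, value in asdict(record).items() if value is None]
print(missing)

# Step 2: fill in a field once the enrichment result is available
record.description = "placeholder description"
still_missing = [key for key, value in asdict(record).items() if value is None]
print(still_missing)
```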
Iterate over every CTD ID and fetch its description and SMILES representation¶
list_pubchem_attrs = []
# For the sake of space and time, we will enrich only the first 2 CTD nodes
pubchem_obj = EnrichmentWithPubChem()
pubchem_cids = []
count = 0
for node in tqdm(kg.nodes):
    if kg.nodes[node].get('node_source') != 'CTD':
        continue
    count += 1
    # Get the node attributes
    node_attr = kg.nodes[node]
    # Convert CTD ID into PubChem ID
    pubchem_cid = external_id2pubchem_cid('Comparative Toxicogenomics Database', node_attr.get('node_id'))
    # Save all PubChem CIDs
    pubchem_cids.append(pubchem_cid)
    # Create a PubChemAttr object
    pubchem_attr = PubChemAttr(
        pubchem_cid=pubchem_cid,
        name=node
    )
    list_pubchem_attrs.append(pubchem_attr)
    if count == 2:
        break
# Enrich PubChem attr
for pubchem_attr in list_pubchem_attrs:
    # Fetch descriptions and SMILES representation
    description, smiles = pubchem_obj.enrich_documents([pubchem_attr.pubchem_cid])
    # Add descriptions and SMILES to the corresponding PubChem attributes
    pubchem_attr.description = description[0]
    pubchem_attr.smiles = smiles[0]
0%| | 0/129262 [00:00<?, ?it/s]INFO:aiagents4pharma.talk2knowledgegraphs.utils.pubchem_utils:Load Hydra configuration for PubChem ID conversion. 9%|▊ | 11282/129262 [00:00<00:04, 24243.44it/s]INFO:aiagents4pharma.talk2knowledgegraphs.utils.pubchem_utils:Load Hydra configuration for PubChem ID conversion. 14%|█▍ | 18354/129262 [00:01<00:06, 18164.34it/s] INFO:aiagents4pharma.talk2knowledgegraphs.utils.pubchem_utils:Load Hydra configuration for PubChem CID description. INFO:aiagents4pharma.talk2knowledgegraphs.utils.pubchem_utils:Load Hydra configuration for PubChem CID description.
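The loop above stops after two matching nodes by keeping a manual counter. The same selection can be sketched with `itertools.islice`, assuming the node attributes are available as a plain mapping from node name to attribute dict (the `nodes` dictionary and the `first_n_from_source` helper are hypothetical stand-ins for `kg.nodes`, not part of aiagents4pharma):

```python
from itertools import islice

def first_n_from_source(nodes, source, n):
    """Return the names of the first n nodes whose 'node_source' equals source."""
    matching = (name for name, attrs in nodes.items()
                if attrs.get("node_source") == source)
    return list(islice(matching, n))

# Toy data mimicking the attributes stored on kg.nodes
nodes = {
    "PHYHIP": {"node_source": "NCBI", "node_id": "9796"},
    "DDT": {"node_source": "CTD", "node_id": "D003634"},
    "Copper": {"node_source": "CTD", "node_id": "D003300"},
    "octane": {"node_source": "CTD", "node_id": "C026728"},
}
selected = first_n_from_source(nodes, "CTD", 2)
print(selected)
```

Because the generator is lazy, iteration ends as soon as n matches are found, just like the break in the loop above.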
Add descriptions to the CTD nodes and recompose the graph¶
for pubchem_attr in list_pubchem_attrs:
    node = pubchem_attr.name
    description = pubchem_attr.description
    # print (f"node: {node}, description: {description}")
    G.add_nodes_from([(node, {'description': description})])
# Recompose the graph
kg = nx.compose(G, kg)
Please follow the notebook link below to know how to generate embeddings of SMILES representation¶
Generate embedding of CTD descriptions¶
for i, node in tqdm(enumerate(kg.nodes)):
    node_id = kg.nodes[node].get('node_id')
    if kg.nodes[node].get('description') is None:
        continue
    print (node)
    desc = kg.nodes[node].get('description')
    # print (desc)
    outputs = biobert_model.embed_documents([desc])
    # print (outputs)
    G.add_nodes_from([(node, {'description_embedding': outputs})])
    # torch.cuda.synchronize()
    # torch.cuda.empty_cache()
# Recompose the graph
kg = nx.compose(G, kg)
11282it [00:00, 93365.28it/s]
DDT Copper
129262it [00:00, 429209.15it/s]
Display a DF with results¶
import pandas as pd
dic = {'node': [],
       'node_source': [],
       'node_id': [],
       'description': [],
       'description_embedding': []}
for node in tqdm(kg.nodes):
    node_id = kg.nodes[node].get('node_id')
    if kg.nodes[node].get('description') is None:
        continue
    dic['node'].append(node)
    dic['node_source'].append(kg.nodes[node].get('node_source'))
    dic['node_id'].append(kg.nodes[node].get('node_id'))
    dic['description'].append(kg.nodes[node].get('description'))
    dic['description_embedding'].append(kg.nodes[node].get('description_embedding'))
    # print (node, kg.nodes[node].get('description'), kg.nodes[node].get('sequence'), kg.nodes[node].get('description_embedding'))
df = pd.DataFrame(dic)
df
100%|██████████| 129262/129262 [00:00<00:00, 859282.23it/s]
| | node | node_source | node_id | description | description_embedding |
---|---|---|---|---|---|
0 | DDT | CTD | D003634 | DDT (Dichlorodiphenyltrichloroethane) can caus... | [[tensor(-0.1798), tensor(0.1512), tensor(0.26... |
1 | Copper | CTD | D003300 | Copper atom is a copper group element atom and... | [[tensor(-0.2902), tensor(0.0179), tensor(0.33... |
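The filtering used to build this dataframe (keep only nodes that have a description) recurs in the later sections. As a sketch, it could be factored into a small helper that returns one row per described node; the helper name `collect_described_nodes` and the toy `nodes` mapping are assumptions for illustration:

```python
def collect_described_nodes(nodes):
    """Build one row (dict) per node that has a 'description' attribute."""
    rows = []
    for name, attrs in nodes.items():
        if attrs.get("description") is None:
            continue
        rows.append({
            "node": name,
            "node_source": attrs.get("node_source"),
            "node_id": attrs.get("node_id"),
            "description": attrs.get("description"),
            "description_embedding": attrs.get("description_embedding"),
        })
    return rows

# Toy stand-in for the node attributes of the graph
nodes = {
    "PHYHIP": {"node_source": "NCBI", "node_id": "9796"},
    "DDT": {"node_source": "CTD", "node_id": "D003634",
            "description": "toy description", "description_embedding": [0.1, 0.2]},
}
rows = collect_described_nodes(nodes)
print(rows)
```

With the real graph, the helper could be applied to `dict(kg.nodes(data=True))`, and the resulting list of rows passed to `pd.DataFrame`.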
Reactome pathway enrichment¶
We will use Reactome API services to extract textual descriptions of pathways using the EnrichmentWithReactome class.
from dataclasses import dataclass
# Create a dataclass to hold the node attributes
@dataclass
class ReactomeAttr:
    """Dataclass to hold the attributes of a node."""
    pathway_id: str
    name: str
    # Make description optional
    # If not provided, it will be set to None
    description: str = None
Iterate over every pathway and fetch its description¶
list_reactome_attrs = []
# For the sake of space and time, we will enrich only the first 2 Reactome nodes
reactome_obj = EnrichmentWithReactome()
count = 0
for node in tqdm(kg.nodes):
    if kg.nodes[node].get('node_source') != 'REACTOME':
        continue
    count += 1
    # Get the node attributes
    node_attr = kg.nodes[node]
    # Create a ReactomeAttr object
    reactome_attr = ReactomeAttr(
        pathway_id=node_attr.get('node_id'),
        name=node,
        description=node_attr.get('description')
    )
    list_reactome_attrs.append(reactome_attr)
    if count == 2:
        break
for reactome_attr in list_reactome_attrs:
    # Fetch descriptions
    description = reactome_obj.enrich_documents([reactome_attr.pathway_id])
    # Add descriptions to the corresponding Reactome attributes
    reactome_attr.description = description[0]
print (list_reactome_attrs)
47%|████▋ | 60717/129262 [00:00<00:00, 1510221.05it/s] INFO:aiagents4pharma.talk2knowledgegraphs.utils.enrichments.reactome_pathways:Load Hydra configuration for reactome enrichment INFO:aiagents4pharma.talk2knowledgegraphs.utils.enrichments.reactome_pathways:Load Hydra configuration for reactome enrichment
[ReactomeAttr(pathway_id='R-HSA-8877627', name='Vitamin E', description='Vitamins A, D, E and K are lipophilic compounds, the so-called fat-soluble vitamins. Because of their lipophilicity, fat-soluble vitamins are solubilised and transported by intracellular carrier proteins to exert their actions. Alpha-tocopherol, the main form of vitamin E found in the body, is transported by alpha-tocopherol transfer protein (TTPA) in hepatic cells (Kono & Arai 2015, Schmolz et al. 2016).'), ReactomeAttr(pathway_id='R-HSA-5334118', name='DNA methylation', description='Methylation of cytosine is catalyzed by a family of DNA methyltransferases (DNMTs): DNMT1, DNMT3A, and DNMT3B transfer methyl groups from S-adenosylmethionine to cytosine, producing 5-methylcytosine and homocysteine (reviewed in Klose and Bird 2006, Ooi et al. 2009, Jurkowska et al. 2011, Moore et al. 2013). (DNMT2 appears to methylate RNA rather than DNA.) DNMT1, the first enzyme discovered, preferentially methylates hemimethylated CG motifs that are produced by replication (template strand methylated, synthesized strand unmethylated). Thus it maintains existing methylation through cell division. DNMT3A and DNMT3B catalyze de novo methylation at unmethylated sites that include both CG dinucleotides and non-CG motifs. DNA from adult humans contains about 0.76 to 1.00 mole percent 5-methylcytosine (Ehrlich et al. 1982, reviewed in Klose and Bird 2006, Ooi et al. 2009, Moore et al. 2013). Methylation of DNA occurs at cytosines that are mainly located in CG dinucleotides. CG dinucleotides are unevenly distributed in the genome. Promoter regions tend to have a high CG-content, forming so-called CG-islands (CGIs), while the CG-content in the remaining part of the genome is much lower. CGIs tend to be unmethylated, while the majority of CGs outside CGIs are methylated. 
Methylation in promoters and first exons tends to repress transcription while methylation in gene bodies (regions of genes downstream of the promoter and first exon) correlates with transcription (reviewed in Ehrlich and Lacey 2013, Kulis et al. 2013). Proteins such as MeCP2 and MBDs specifically bind 5-methylcytosine and may recruit other factors. Mammalian development has two major episodes of genome-wide demethylation and remethylation (reviewed in Zhou 2012, Guibert and Weber 2013, Hackett and Surani 2013, Dean 2014). In mice about 1 day after fertilization the paternal genome is actively demethylated by TET proteins together with thymine DNA glycosylase and the maternal genome is demethylated by passive dilution during replication, however methylation at imprinted sites is maintained. The genome has its lowest methylation level about 3.5 days post-fertilization. Remethylation occurs by 6.5 days post-fertilization. The second demethylation-remethylation event occurs in primordial germ cells of the developing embryo about 12.5 days post-fertilization. DNMT3A and DNMT3B, together with the non-catalytic DNMT3L, play major roles in the remethylation events (reviewed in Chen and Chan 2014). How the methyltransferases are directed to particular regions of the genome remains an area of active research. The mechanisms at each locus may differ in detail but a connection between histone modifications and DNA methylation has been observed (reviewed in Rose and Klose 2014).')]
Add descriptions to the Reactome nodes and recompose the graph¶
for reactome_attr in list_reactome_attrs:
    node = reactome_attr.name
    description = reactome_attr.description
    # print (f"node: {node}, description: {description}")
    G.add_nodes_from([(node, {'description': description})])
# Recompose the graph
kg = nx.compose(G, kg)
Generate embeddings of descriptions of reactome pathways¶
for i, node in tqdm(enumerate(kg.nodes)):
    node_id = kg.nodes[node].get('node_id')
    if kg.nodes[node].get('description') is None:
        continue
    print (node)
    desc = kg.nodes[node].get('description')
    # print (desc)
    outputs = biobert_model.embed_documents([desc])
    # print (outputs)
    G.add_nodes_from([(node, {'description_embedding': outputs})])
    # torch.cuda.synchronize()
    # torch.cuda.empty_cache()
# Recompose the graph
kg = nx.compose(G, kg)
93148it [00:00, 456878.51it/s]
Vitamin E DNA methylation
129262it [00:00, 531214.21it/s]
Display the results in a DF¶
import pandas as pd
dic = {'node': [],
       'node_source': [],
       'node_id': [],
       'description': [],
       'description_embedding': []}
for node in tqdm(kg.nodes):
    node_id = kg.nodes[node].get('node_id')
    if kg.nodes[node].get('description') is None:
        continue
    dic['node'].append(node)
    dic['node_source'].append(kg.nodes[node].get('node_source'))
    dic['node_id'].append(kg.nodes[node].get('node_id'))
    dic['description'].append(kg.nodes[node].get('description'))
    dic['description_embedding'].append(kg.nodes[node].get('description_embedding'))
    # print (node, kg.nodes[node].get('description'), kg.nodes[node].get('sequence'), kg.nodes[node].get('description_embedding'))
df = pd.DataFrame(dic)
df
100%|██████████| 129262/129262 [00:00<00:00, 995538.19it/s]
| | node | node_source | node_id | description | description_embedding |
---|---|---|---|---|---|
0 | Vitamin E | REACTOME | R-HSA-8877627 | Vitamins A, D, E and K are lipophilic compound... | [[tensor(-0.4649), tensor(0.2769), tensor(0.73... |
1 | DNA methylation | REACTOME | R-HSA-5334118 | Methylation of cytosine is catalyzed by a fami... | [[tensor(-0.5609), tensor(0.4334), tensor(0.49... |
OLS terms enrichments¶
OLS is the Ontology Lookup Service by EMBL/EBI. We will use their API services to extract textual descriptions of the following terms using the EnrichmentWithOLS class.
- GO
- HPO
- UBERON
- MONDO
- MONDO_grouped
from dataclasses import dataclass
# Create a dataclass to hold the node attributes
@dataclass
class OLSAttr:
    """Dataclass to hold the attributes of a node."""
    term_id: str
    name: str
    label: str = None
    # Make description optional
    # If not provided, it will be set to None
    description: str = None
# Define a dictionary to store DB name and its OLS code
dic_ols = {
    'GO': 'GO',
    'HPO': 'HP',
    'UBERON': 'UBERON',
    'MONDO': 'MONDO',
}
Iterate over every DB in OLS and store the results¶
list_ols_attrs = []
term_ids = []
# For the sake of space and time, we will enrich only the first 2 nodes of each DB
ols_obj = EnrichmentWithOLS()
for source in ['GO', 'MONDO', 'HPO', 'UBERON']:
    count = 0
    for node in tqdm(kg.nodes):
        if kg.nodes[node].get('node_source') != source:
            continue
        count += 1
        # Get the node attributes
        node_attr = kg.nodes[node]
        # Term ID
        # OLS term must contain 7-digit integer code
        # Hence, prefix with 0s such that total number
        # of characters is 7
        term_id = dic_ols[source] + '_' + str("{:07}".format(int(node_attr.get('node_id'))))
        term_ids.append(term_id)
        # Create an OLSAttr object
        ols_attr = OLSAttr(
            term_id=term_id,
            name=node,
            label=node,
            description=node_attr.get('description')
        )
        list_ols_attrs.append(ols_attr)
        if count == 2:
            break
# Fetch descriptions
descriptions = ols_obj.enrich_documents(term_ids)
# Add descriptions to the corresponding OLS attributes
for ols_attr, description in zip(list_ols_attrs, descriptions):
    ols_attr.description = description
46%|████▌ | 59435/129262 [00:00<00:00, 1410968.24it/s] 19%|█▉ | 24751/129262 [00:00<00:00, 1464943.46it/s] 20%|█▉ | 25525/129262 [00:00<00:00, 1565290.51it/s] 79%|███████▉ | 101813/129262 [00:00<00:00, 1408816.01it/s] INFO:aiagents4pharma.talk2knowledgegraphs.utils.enrichments.ols_terms:Load Hydra configuration for OLS enrichments.
True ['Any process that stops, prevents, or reduces the frequency, rate or extent of the directed movement of a neurotransmitter into a neuron or glial cell.'] negative regulation of neurotransmitter uptake True ['Any process that stops, prevents, or reduces the frequency, rate or extent of the directed movement of serotonin into a cell.'] negative regulation of serotonin uptake True ['Persistently high systemic arterial blood pressure. Based on multiple readings (blood pressure determination), hypertension is currently defined as when systolic pressure is consistently greater than 140 mm Hg or when diastolic pressure is consistently 90 mm Hg or more.'] hypertensive disorder True ['A condition that occurs while resting or lying in bed; it is characterized by an irresistible urgency to move the legs to obtain relief from a strange and uncomfortable sensation in the legs.'] restless legs syndrome True [] Graves disease True [] Horner syndrome True ['Nonsynovial joint in which the articulating bones are connected by bone. Examples: epiphyseal junction, manubriosternal synostosis.'] synostosis True ['A cover or envelope partly or wholly surrounding a structure. Examples: egg shell, articular capsules, renal capsules[WP].', "see also: protective structure surrounding some species of bacteria and fungi. Note that FMA has classes such as Articular capsule that are not subtypes of Capsule (which is undefined in FMA). GO has 'capsule' which is a fungal structure, and 'external encapsulating structure' which is restricted to cells"] capsule
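The term IDs sent to OLS are built by zero-padding the numeric PrimeKG node ID to seven digits and prepending the ontology prefix. A minimal sketch of that formatting step in isolation (the helper name `to_ols_term_id` is an assumption; the prefix mapping follows dic_ols above):

```python
def to_ols_term_id(prefix, node_id, width=7):
    """Zero-pad a numeric node ID to `width` digits and prepend the ontology prefix."""
    return f"{prefix}_{int(node_id):0{width}d}"

# Same mapping as dic_ols: PrimeKG source name -> OLS prefix
ols_prefix = {"GO": "GO", "HPO": "HP", "UBERON": "UBERON", "MONDO": "MONDO"}

mondo_term = to_ols_term_id(ols_prefix["MONDO"], "5044")
hpo_term = to_ols_term_id(ols_prefix["HPO"], 2277)
print(mondo_term)  # MONDO_0005044
print(hpo_term)    # HP_0002277
```

Note that HPO nodes use the prefix HP rather than the source name, which is why the mapping is needed.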
Repeat the same for MONDO_grouped, except concatenate the descriptions of all IDs in a group¶
# For the sake of space and time, we will enrich only the first 2 MONDO_grouped nodes
ols_obj = EnrichmentWithOLS()
count = 0
for node in tqdm(kg.nodes):
    if kg.nodes[node].get('node_source') != 'MONDO_grouped':
        continue
    count += 1
    # Get the node attributes
    node_attr = kg.nodes[node]
    # MONDO_grouped contains multiple codes
    # separated by a '_'
    # OLS term must contain 7-digit integer code
    # Hence, prefix with 0s such that total number
    # of characters is 7
    codes = node_attr.get('node_id')
    codes = codes.split('_')
    # print (codes)
    term_ids = []
    for code in codes:
        term_id = 'MONDO_' + str("{:07}".format(int(code)))
        term_ids.append(term_id)
    # Create an OLSAttr object
    ols_attr = OLSAttr(
        term_id=node_attr.get('node_id'),
        name=node,
        label=node,
        description=node_attr.get('description')
    )
    # Fetch descriptions
    descriptions = ols_obj.enrich_documents(term_ids)
    # Add descriptions to the corresponding OLS attributes
    ols_attr.description = '\n'.join(descriptions)
    list_ols_attrs.append(ols_attr)
    if count == 2:
        break
print (list_ols_attrs)
0%| | 0/129262 [00:00<?, ?it/s]INFO:aiagents4pharma.talk2knowledgegraphs.utils.enrichments.ols_terms:Load Hydra configuration for OLS enrichments.
True ['High blood pressure caused by an underlying medical condition.'] secondary hypertension True ['Hypertension that presents without an identifiable cause.'] essential hypertension True ["OBSOLETE. An instance of hypertension that is caused by a modification of the individual's genome."] obsolete genetic hypertension
19%|█▉ | 24716/129262 [00:00<00:02, 42644.98it/s]INFO:aiagents4pharma.talk2knowledgegraphs.utils.enrichments.ols_terms:Load Hydra configuration for OLS enrichments.
True ['Increased blood pressure in the portal venous system. It is most commonly caused by cirrhosis. Other causes include portal vein thrombosis, Budd-Chiari syndrome, and right heart failure. Complications include ascites, esophageal varices, encephalopathy, and splenomegaly.'] portal hypertension True ['A severe medical condition which is estimated to appear in 9-18% of hypertensive patients, in which treatment with 3 or more antihypertensive drugs including diuretics are ineffective.'] resistant hypertension True ['Any Parkinson disease in which the cause of the disease is a mutation in the LRRK2 gene.'] autosomal dominant Parkinson disease 8 True ['Any Parkinson disease in which the cause of the disease is a mutation in the PARK7 gene.'] autosomal recessive early-onset Parkinson disease 7 True ['Any Parkinson disease in which the cause of the disease is a mutation in the VPS35 gene.'] Parkinson disease 17 True ['A Parkinson disease that begins after around the age of 50.'] late-onset Parkinson disease True ['Any hereditary late onset Parkinson disease in which the cause of the disease is a mutation in the DNAJC13 gene.'] Parkinson disease 21 True ['Any Parkinson disease in which the cause of the disease is a mutation in the PINK1 gene.'] autosomal recessive early-onset Parkinson disease 6 True ['Any Parkinson disease in which the cause of the disease is a mutation in the SYNJ1 gene.'] early-onset Parkinson disease 20 True ['A late onset Parkinson disease that has material basis in heterozygous triplication of the alpha-synuclein gene (SNCA) on chromosome 4q22.'] autosomal dominant Parkinson disease 4 True ['Editor note: DO def states any mutation in SNCA, but this would include PARK4; def needs to state that this is het mutation'] autosomal dominant Parkinson disease 1 True ['Any young-onset Parkinson disease in which the cause of the disease is a mutation in the VPS13C gene.'] autosomal recessive early-onset Parkinson disease 23 True ['A form of Parkinson 
disease (PD) characterized by an age of onset between 21-45 years, rigidity, painful cramps followed by tremor, bradykinesia, dystonia, gait complaints and falls, and other non-motor symptoms. A slow disease progression and a more pronounced response to dopaminergic therapy are also observed in most YOPD forms.'] young-onset Parkinson disease True ['A progressive degenerative disorder of the central nervous system characterized by loss of dopamine producing neurons in the substantia nigra and the presence of Lewy bodies in the substantia nigra and locus coeruleus. Signs and symptoms include tremor which is most pronounced during rest, muscle rigidity, slowing of the voluntary movements, a tendency to fall back, and a mask-like facial expression.'] Parkinson disease True ['A condition with a clinical picture similar to that of Parkinson disease, but which is caused by external factors, including medication.'] secondary Parkinson disease True ['Editor notes: check onset axioms'] juvenile-onset Parkinson disease True [] Parkinson disease, mitochondrial
19%|█▉ | 24752/129262 [00:02<00:09, 11270.94it/s]
True [] parkinson disease 12 True [] parkinson disease 10 True [] parkinson disease 16 [OLSAttr(term_id='GO_0051581', name='negative regulation of neurotransmitter uptake', label='negative regulation of neurotransmitter uptake', description='Any process that stops, prevents, or reduces the frequency, rate or extent of the directed movement of a neurotransmitter into a neuron or glial cell.'), OLSAttr(term_id='GO_0051612', name='negative regulation of serotonin uptake', label='negative regulation of serotonin uptake', description='Any process that stops, prevents, or reduces the frequency, rate or extent of the directed movement of serotonin into a cell.'), OLSAttr(term_id='MONDO_0005044', name='hypertensive disorder', label='hypertensive disorder', description='Persistently high systemic arterial blood pressure. Based on multiple readings (blood pressure determination), hypertension is currently defined as when systolic pressure is consistently greater than 140 mm Hg or when diastolic pressure is consistently 90 mm Hg or more.'), OLSAttr(term_id='MONDO_0005391', name='restless legs syndrome', label='restless legs syndrome', description='A condition that occurs while resting or lying in bed; it is characterized by an irresistible urgency to move the legs to obtain relief from a strange and uncomfortable sensation in the legs.'), OLSAttr(term_id='HP_0100647', name='Graves disease', label='Graves disease', description='Graves disease'), OLSAttr(term_id='HP_0002277', name='Horner syndrome', label='Horner syndrome', description='Horner syndrome'), OLSAttr(term_id='UBERON_0010361', name='synostosis', label='synostosis', description='Nonsynovial joint in which the articulating bones are connected by bone. Examples: epiphyseal junction, manubriosternal synostosis.'), OLSAttr(term_id='UBERON_0003893', name='capsule', label='capsule', description="A cover or envelope partly or wholly surrounding a structure. 
Examples: egg shell, articular capsules, renal capsules[WP].\nsee also: protective structure surrounding some species of bacteria and fungi. Note that FMA has classes such as Articular capsule that are not subtypes of Capsule (which is undefined in FMA). GO has 'capsule' which is a fungal structure, and 'external encapsulating structure' which is restricted to cells"), OLSAttr(term_id='1200_1134_15512_5080_100078', name='hypertension', label='hypertension', description="High blood pressure caused by an underlying medical condition.\nHypertension that presents without an identifiable cause.\nOBSOLETE. An instance of hypertension that is caused by a modification of the individual's genome.\nIncreased blood pressure in the portal venous system. It is most commonly caused by cirrhosis. Other causes include portal vein thrombosis, Budd-Chiari syndrome, and right heart failure. Complications include ascites, esophageal varices, encephalopathy, and splenomegaly.\nA severe medical condition which is estimated to appear in 9-18% of hypertensive patients, in which treatment with 3 or more antihypertensive drugs including diuretics are ineffective."), OLSAttr(term_id='11764_11658_13625_8199_14604_11613_14233_11562_8200_14796_17279_5180_6966_828_10796_10360_11737_13167', name='Parkinson disease', label='Parkinson disease', description='Any Parkinson disease in which the cause of the disease is a mutation in the LRRK2 gene.\nAny Parkinson disease in which the cause of the disease is a mutation in the PARK7 gene.\nAny Parkinson disease in which the cause of the disease is a mutation in the VPS35 gene.\nA Parkinson disease that begins after around the age of 50.\nAny hereditary late onset Parkinson disease in which the cause of the disease is a mutation in the DNAJC13 gene.\nAny Parkinson disease in which the cause of the disease is a mutation in the PINK1 gene.\nAny Parkinson disease in which the cause of the disease is a mutation in the SYNJ1 gene.\nA late onset Parkinson 
disease that has material basis in heterozygous triplication of the alpha-synuclein gene (SNCA) on chromosome 4q22.\nEditor note: DO def states any mutation in SNCA, but this would include PARK4; def needs to state that this is het mutation\nAny young-onset Parkinson disease in which the cause of the disease is a mutation in the VPS13C gene.\nA form of Parkinson disease (PD) characterized by an age of onset between 21-45 years, rigidity, painful cramps followed by tremor, bradykinesia, dystonia, gait complaints and falls, and other non-motor symptoms. A slow disease progression and a more pronounced response to dopaminergic therapy are also observed in most YOPD forms.\nA progressive degenerative disorder of the central nervous system characterized by loss of dopamine producing neurons in the substantia nigra and the presence of Lewy bodies in the substantia nigra and locus coeruleus. Signs and symptoms include tremor which is most pronounced during rest, muscle rigidity, slowing of the voluntary movements, a tendency to fall back, and a mask-like facial expression.\nA condition with a clinical picture similar to that of Parkinson disease, but which is caused by external factors, including medication.\nEditor notes: check onset axioms\nParkinson disease, mitochondrial\nparkinson disease 12\nparkinson disease 10\nparkinson disease 16')]
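For grouped nodes, the node ID is a '_'-separated list of MONDO codes, and the group description is the newline-joined descriptions of its members. The following sketch reproduces that logic without calling OLS; the helper names and the `fake_lookup` dictionary with placeholder texts are assumptions for illustration only:

```python
def grouped_term_ids(node_id, prefix="MONDO", width=7):
    """Split a grouped node ID on '_' and zero-pad each code to `width` digits."""
    return [f"{prefix}_{int(code):0{width}d}" for code in node_id.split("_")]

def merge_descriptions(term_ids, lookup):
    """Join the descriptions found for term_ids, one per line."""
    return "\n".join(lookup[t] for t in term_ids if t in lookup)

# node_id of the grouped 'hypertension' node shown in the output above
term_ids = grouped_term_ids("1200_1134_15512_5080_100078")
print(term_ids)

# Placeholder descriptions; the real texts would come from the OLS service
fake_lookup = {term_ids[0]: "first description", term_ids[1]: "second description"}
merged = merge_descriptions(term_ids, fake_lookup)
print(merged)
```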
Add descriptions to the OLS nodes and recompose the graph¶
for ols_attr in list_ols_attrs:
    node = ols_attr.name
    description = ols_attr.description
    # print (f"node: {node}, description: {description}")
    G.add_nodes_from([(node, {'description': description})])
# Recompose the graph
kg = nx.compose(G, kg)
Generate embedding for all the nodes with textual descriptions¶
for i, node in tqdm(enumerate(kg.nodes)):
    node_id = kg.nodes[node].get('node_id')
    if kg.nodes[node].get('description') is None:
        continue
    print(node)
    desc = kg.nodes[node].get('description')
    outputs = biobert_model.embed_documents([desc])
    G.add_nodes_from([(node, {'description_embedding': outputs})])
    # torch.cuda.synchronize()
    # torch.cuda.empty_cache()
# Recompose the graph
kg = nx.compose(G, kg)
24716it [00:00, 202653.96it/s]
hypertensive disorder hypertension restless legs syndrome Parkinson disease
59436it [00:00, 125677.84it/s]
Graves disease Horner syndrome synostosis negative regulation of neurotransmitter uptake negative regulation of serotonin uptake
129262it [00:00, 218477.31it/s]
capsule
Display the results in a DataFrame¶
import pandas as pd
dic = {'node': [],
       'node_source': [],
       'node_id': [],
       'description': [],
       'description_embedding': []}
for node in tqdm(kg.nodes):
    node_id = kg.nodes[node].get('node_id')
    if kg.nodes[node].get('description') is None:
        continue
    dic['node'].append(node)
    dic['node_source'].append(kg.nodes[node].get('node_source'))
    dic['node_id'].append(kg.nodes[node].get('node_id'))
    dic['description'].append(kg.nodes[node].get('description'))
    dic['description_embedding'].append(kg.nodes[node].get('description_embedding'))
    # print (node, kg.nodes[node].get('description'), kg.nodes[node].get('sequence'), kg.nodes[node].get('description_embedding'))
df = pd.DataFrame(dic)
df
100%|██████████| 129262/129262 [00:00<00:00, 953488.63it/s]
| | node | node_source | node_id | description | description_embedding |
---|---|---|---|---|---|
0 | hypertensive disorder | MONDO | 5044 | Persistently high systemic arterial blood pres... | [[tensor(-0.3774), tensor(0.0026), tensor(0.33... |
1 | hypertension | MONDO_grouped | 1200_1134_15512_5080_100078 | High blood pressure caused by an underlying me... | [[tensor(-0.2972), tensor(0.1923), tensor(0.38... |
2 | restless legs syndrome | MONDO | 5391 | A condition that occurs while resting or lying... | [[tensor(-0.0092), tensor(0.1624), tensor(0.47... |
3 | Parkinson disease | MONDO_grouped | 11764_11658_13625_8199_14604_11613_14233_11562... | Any Parkinson disease in which the cause of th... | [[tensor(-0.0157), tensor(0.0820), tensor(0.17... |
4 | Graves disease | HPO | 100647 | Graves disease | [[tensor(-0.6299), tensor(0.1351), tensor(0.77... |
5 | Horner syndrome | HPO | 2277 | Horner syndrome | [[tensor(-0.5213), tensor(-0.1533), tensor(0.7... |
6 | synostosis | UBERON | 10361 | Nonsynovial joint in which the articulating bo... | [[tensor(0.0200), tensor(-0.2079), tensor(0.38... |
7 | negative regulation of neurotransmitter uptake | GO | 51581 | Any process that stops, prevents, or reduces t... | [[tensor(0.0589), tensor(0.3241), tensor(0.057... |
8 | negative regulation of serotonin uptake | GO | 51612 | Any process that stops, prevents, or reduces t... | [[tensor(-0.0227), tensor(0.3225), tensor(-0.0... |
9 | capsule | UBERON | 3893 | A cover or envelope partly or wholly surroundi... | [[tensor(-0.0116), tensor(0.2860), tensor(0.49... |
Protein enrichments¶
Now, we will enrich the protein nodes with their description and sequence. We will query UniProt via its API to get the description and sequence. For this, we first need to get all the node IDs.
from dataclasses import dataclass
from typing import Optional
# Create a dataclass to hold the node attributes
@dataclass
class GeneAttr:
    """Dataclass to hold the attributes of a gene node."""
    id: str
    name: str
    # Description and sequence are optional
    # If not provided, they will be set to None
    description: Optional[str] = None
    sequence: Optional[str] = None
Get node IDs¶
# Extract all gene IDs from the graph
dic_gene_ids = {}
for n in tqdm(kg.nodes):
    if kg.nodes[n].get('node_type') != 'gene/protein' and kg.nodes[n].get('node_source') != 'NCBI':
        continue
    # Get the node attributes
    node_attr = kg.nodes[n]
    # Create a GeneAttr object
    gene_attr = GeneAttr(
        id=node_attr.get('node_id'),
        name=n,
        description=node_attr.get('description'),
        sequence=node_attr.get('sequence')
    )
    # Add the gene_attr object to the dictionary
    dic_gene_ids[node_attr.get('node_id')] = gene_attr
# Check the number of gene IDs
len(dic_gene_ids)
0%| | 0/129262 [00:00<?, ?it/s]
100%|██████████| 129262/129262 [00:00<00:00, 1131144.04it/s]
27609
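Because the dictionary is keyed by gene ID, results returned by an ID-mapping service can later be written onto the matching entry. The following is a minimal sketch of that step, not the notebook's own code: `apply_record` is a hypothetical helper, the record is hand-written rather than real API output, and the field names (`from`, `to`, `comments`, `sequence`) are assumed to follow the UniProtKB JSON layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneAttr:
    """Same fields as the GeneAttr dataclass defined above."""
    id: str
    name: str
    description: Optional[str] = None
    sequence: Optional[str] = None


def apply_record(genes, record):
    """Copy the FUNCTION text and sequence of one mapping record onto its gene.

    Hypothetical helper: returns True if the record matched a known gene ID.
    """
    gene = genes.get(record["from"])
    if gene is None:
        return False
    entry = record["to"]
    # Collect the free-text FUNCTION comments, if any are present
    texts = [
        text["value"]
        for comment in entry.get("comments", [])
        if comment.get("commentType") == "FUNCTION"
        for text in comment.get("texts", [])
    ]
    gene.description = " ".join(texts) or None
    gene.sequence = entry.get("sequence", {}).get("value")
    return True


# Hand-written stand-in for one mapping result (assumed layout, not real output)
record = {
    "from": "9796",
    "to": {
        "comments": [{"commentType": "FUNCTION",
                      "texts": [{"value": "Example function text"}]}],
        "sequence": {"value": "MELLSTPHSIE"},
    },
}
genes = {"9796": GeneAttr(id="9796", name="PHYHIP")}
matched = apply_record(genes, record)
print(matched, genes["9796"])
```

Records whose source ID is not in the dictionary are skipped rather than raising, which may be convenient when a single gene ID maps to several entries or to none.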
Submit a job to UniProt to map the Gene ID to its description and sequence¶
Here we show two ways to get the description and sequence of a gene:
- Most biomedical graphs provide gene names, which can be used to retrieve the sequence and description with the EnrichmentWithUniProt class in the T2KG utils.
- Some graphs, like PrimeKG, also provide gene IDs, which can be used to retrieve the sequence and description with the snippet below (adapted from UniProt).
import json
import time
import zlib
import requests
from requests.adapters import HTTPAdapter, Retry
from urllib.parse import urlparse, parse_qs, urlencode
# Define variables to perform UniProt ID mapping
# Adapted from https://www.uniprot.org/help/id_mapping
API_URL = "https://rest.uniprot.org"
POLLING_INTERVAL = 5
retries = Retry(total=5, backoff_factor=0.25, status_forcelist=[500, 502, 503, 504])
session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))
def submit_id_mapping(from_db, to_db, ids) -> str:
    """
    Function to submit a job to perform ID mapping.
    Args:
        from_db (str): The source database.
        to_db (str): The target database.
        ids (list): The list of IDs to map.
    Returns:
        str: The job ID.
    """
    request = requests.post(f"{API_URL}/idmapping/run",
                            data={"from": from_db,
                                  "to": to_db,
                                  "ids": ",".join(ids)},)
    try:
        request.raise_for_status()
    except requests.HTTPError:
        print(request.json())
        raise
    return request.json()["jobId"]
def check_id_mapping_results_ready(job_id):
    """
    Function to check if the ID mapping results are ready.
    Args:
        job_id (str): The job ID.
    Returns:
        bool: True if the results are ready, False otherwise.
    """
    while True:
        request = session.get(f"{API_URL}/idmapping/status/{job_id}")
        try:
            request.raise_for_status()
        except requests.HTTPError:
            print(request.json())
            raise
        j = request.json()
        if "jobStatus" in j:
            if j["jobStatus"] in ("NEW", "RUNNING"):
                print(f"Retrying in {POLLING_INTERVAL}s")
                time.sleep(POLLING_INTERVAL)
            else:
                raise Exception(j["jobStatus"])
        else:
            return bool(j["results"] or j["failedIds"])
def get_id_mapping_results_link(job_id):
    """
    Function to get the link to the ID mapping results.
    Args:
        job_id (str): The job ID.
    Returns:
        str: The link to the ID mapping results.
    """
    url = f"{API_URL}/idmapping/details/{job_id}"
    request = requests.Session().get(url)
    try:
        request.raise_for_status()
    except requests.HTTPError:
        print(request.json())
        raise
    return request.json()["redirectURL"]
def decode_results(response, file_format, compressed):
    """
    Function to decode the ID mapping results.
    Args:
        response (requests.Response): The response object.
        file_format (str): The file format of the results.
        compressed (bool): Whether the results are compressed.
    Returns:
        str: The ID mapping results
    """
    if compressed:
        decompressed = zlib.decompress(response.content, 16 + zlib.MAX_WBITS)
        if file_format == "json":
            j = json.loads(decompressed.decode("utf-8"))
            return j
        elif file_format == "tsv":
            return [line for line in decompressed.decode("utf-8").split("\n") if line]
        elif file_format == "xlsx":
            return [decompressed]
        elif file_format == "xml":
            return [decompressed.decode("utf-8")]
        else:
            return decompressed.decode("utf-8")
    elif file_format == "json":
        return response.json()
    elif file_format == "tsv":
        return [line for line in response.text.split("\n") if line]
    elif file_format == "xlsx":
        return [response.content]
    elif file_format == "xml":
        return [response.text]
    return response.text
def get_id_mapping_results_stream(url):
    """
    Function to get the ID mapping results from a stream.
    Args:
        url (str): The URL to the ID mapping results.
    Returns:
        str: The ID mapping results.
    """
    if "/stream/" not in url:
        url = url.replace("/results/", "/results/stream/")
    request = session.get(url)
    try:
        request.raise_for_status()
    except requests.HTTPError:
        print(request.json())
        raise
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    file_format = query["format"][0] if "format" in query else "json"
    compressed = (
        query["compressed"][0].lower() == "true" if "compressed" in query else False
    )
    return decode_results(request, file_format, compressed)
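The stream helper decides how to decode a response from the `format` and `compressed` query parameters of the results URL, and the decoder relies on `zlib` to unpack gzip payloads. The sketch below reproduces both steps offline. The URL is made up for illustration, and the payload is a locally compressed JSON document standing in for `response.content`.

```python
import gzip
import json
import zlib
from urllib.parse import parse_qs, urlparse

# Hypothetical results URL; host, job ID and query string are made up
url = "https://example.org/idmapping/results/stream/JOB_ID?format=json&compressed=true"

# Same query parsing as in the stream helper above
query = parse_qs(urlparse(url).query)
file_format = query["format"][0] if "format" in query else "json"
compressed = (
    query["compressed"][0].lower() == "true" if "compressed" in query else False
)

# Stand-in for response.content: a gzip-compressed JSON document
payload = gzip.compress(
    json.dumps({"results": [], "failedIds": ["0"]}).encode("utf-8")
)

# 16 + zlib.MAX_WBITS tells zlib to expect a gzip header and trailer
raw = zlib.decompress(payload, 16 + zlib.MAX_WBITS) if compressed else payload
decoded = json.loads(raw.decode("utf-8"))
print(file_format, compressed, decoded)
```

When the URL carries no query string, the same logic falls back to uncompressed JSON, which matches the defaults used in the helper.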
For the sake of time, we will use only the first 5 gene IDs.
# Add the first 5 gene IDs to a list
inputs = list(dic_gene_ids.keys())[:5]
# Submit the job to perform ID mapping
job_id = submit_id_mapping(
    from_db="GeneID", to_db="UniProtKB", ids=inputs
)
# Print the job ID
print(f"Job ID: {job_id}")
# Check the status of the job
status = check_id_mapping_results_ready(job_id)
# Print the status of the job
print(f"Job status: {status}")
Job ID: 29s22juOq2
Job status: True
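To go beyond the first five IDs, the full list could be split into batches and each batch submitted as its own job. This is only a sketch under assumptions: `chunked` is a hypothetical helper, the toy IDs stand in for the keys of `dic_gene_ids`, and the batch size is an arbitrary illustrative value rather than a documented service limit.

```python
def chunked(ids, size):
    """Yield consecutive slices of `ids` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


# Toy IDs standing in for list(dic_gene_ids.keys())
all_ids = [str(i) for i in range(1, 12)]
BATCH_SIZE = 5  # assumed value for illustration, not a documented limit

batches = list(chunked(all_ids, BATCH_SIZE))
for batch in batches:
    # In the notebook, each batch could be passed to submit_id_mapping(...)
    print(len(batch), batch[0], batch[-1])
```

Keeping the batches as plain lists means each one can be joined and submitted exactly like the five-ID example, with the resulting job IDs collected for later polling.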
Check and get the ID mapping results:
if check_id_mapping_results_ready(job_id):
    link = get_id_mapping_results_link(job_id)
    mapping_results = get_id_mapping_results_stream(link)
    print(mapping_results)
{'results': [{'from': '9796', 'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)', 'primaryAccession': 'Q92561', 'secondaryAccessions': ['D3DSR1', 'Q8N4I9'], 'uniProtkbId': 'PHYIP_HUMAN', 'entryAudit': {'firstPublicDate': '1997-11-01', 'lastAnnotationUpdateDate': '2025-04-09', 'lastSequenceUpdateDate': '1997-02-01', 'entryVersion': 176, 'sequenceVersion': 1}, 'annotationScore': 4.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '1: Evidence at protein level', 'proteinDescription': {'recommendedName': {'fullName': {'value': 'Phytanoyl-CoA hydroxylase-interacting protein'}}, 'alternativeNames': [{'fullName': {'value': 'Phytanoyl-CoA hydroxylase-associated protein 1'}, 'shortNames': [{'value': 'PAHX-AP1'}, {'value': 'PAHXAP1'}]}]}, 'genes': [{'geneName': {'value': 'PHYHIP'}, 'synonyms': [{'value': 'DYRK1AP3'}, {'value': 'KIAA0273'}]}], 'comments': [{'texts': [{'value': 'Its interaction with PHYH suggests a role in the development of the central system'}], 'commentType': 'FUNCTION'}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '10686344'}], 'value': 'Interacts with PHYH and ADGRB1'}], 'commentType': 'SUBUNIT'}, {'commentType': 'INTERACTION', 'interactions': [{'interactantOne': {'uniProtKBAccession': 'Q92561', 'intActId': 'EBI-716478'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NXK6', 'geneName': 'PAQR5', 'intActId': 'EBI-10316423'}, 'numberOfExperiments': 2, 'organismDiffer': False}]}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '10686344'}], 'value': 'Highly expressed in the brain'}], 'commentType': 'TISSUE SPECIFICITY'}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000305'}], 'value': 'Belongs to the PHYHIP family'}], 
'commentType': 'SIMILARITY'}, {'commentType': 'SEQUENCE CAUTION', 'sequenceCautionType': 'Erroneous initiation', 'sequence': 'BAA13402.2', 'evidences': [{'evidenceCode': 'ECO:0000305'}]}], 'features': [{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 330, 'modifier': 'EXACT'}}, 'description': 'Phytanoyl-CoA hydroxylase-interacting protein', 'featureId': 'PRO_0000058416'}, {'type': 'Domain', 'location': {'start': {'value': 6, 'modifier': 'EXACT'}, 'end': {'value': 115, 'modifier': 'EXACT'}}, 'description': 'Fibronectin type-III', 'evidences': [{'evidenceCode': 'ECO:0000255', 'source': 'PROSITE-ProRule', 'id': 'PRU00316'}]}, {'type': 'Glycosylation', 'location': {'start': {'value': 14, 'modifier': 'EXACT'}, 'end': {'value': 14, 'modifier': 'EXACT'}}, 'description': 'N-linked (GlcNAc...) asparagine', 'evidences': [{'evidenceCode': 'ECO:0000255'}], 'featureId': ''}, {'type': 'Glycosylation', 'location': {'start': {'value': 325, 'modifier': 'EXACT'}, 'end': {'value': 325, 'modifier': 'EXACT'}}, 'description': 'N-linked (GlcNAc...) asparagine', 'evidences': [{'evidenceCode': 'ECO:0000255'}], 'featureId': ''}, {'type': 'Natural variant', 'location': {'start': {'value': 21, 'modifier': 'EXACT'}, 'end': {'value': 21, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs11547660', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs11547660'}], 'featureId': 'VAR_018475', 'alternativeSequence': {'originalSequence': 'R', 'alternativeSequences': ['S']}}, {'type': 'Sequence conflict', 'location': {'start': {'value': 150, 'modifier': 'EXACT'}, 'end': {'value': 150, 'modifier': 'EXACT'}}, 'description': 'in Ref. 3; AAH34034/AAH35940', 'evidences': [{'evidenceCode': 'ECO:0000305'}], 'alternativeSequence': {'originalSequence': 'Q', 'alternativeSequences': ['E']}}, {'type': 'Sequence conflict', 'location': {'start': {'value': 245, 'modifier': 'EXACT'}, 'end': {'value': 245, 'modifier': 'EXACT'}}, 'description': 'in Ref. 
3; AAH34034/AAH35940', 'evidences': [{'evidenceCode': 'ECO:0000305'}], 'alternativeSequence': {'originalSequence': 'L', 'alternativeSequences': ['M']}}, {'type': 'Sequence conflict', 'location': {'start': {'value': 254, 'modifier': 'EXACT'}, 'end': {'value': 254, 'modifier': 'EXACT'}}, 'description': 'in Ref. 3; AAH34034/AAH35940', 'evidences': [{'evidenceCode': 'ECO:0000305'}], 'alternativeSequence': {'originalSequence': 'L', 'alternativeSequences': ['R']}}], 'keywords': [{'id': 'KW-0325', 'category': 'PTM', 'name': 'Glycoprotein'}, {'id': 'KW-1267', 'category': 'Technical term', 'name': 'Proteomics identification'}, {'id': 'KW-1185', 'category': 'Technical term', 'name': 'Reference proteome'}], 'references': [{'referenceNumber': 1, 'citation': {'id': '9039502', 'citationType': 'journal article', 'authors': ['Nagase T.', 'Seki N.', 'Ishikawa K.', 'Ohira M.', 'Kawarabayasi Y.', 'Ohara O.', 'Tanaka A.', 'Kotani H.', 'Miyajima N.', 'Nomura N.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '9039502'}, {'database': 'DOI', 'id': '10.1093/dnares/3.5.321'}], 'title': 'Prediction of the coding sequences of unidentified human genes. VI. 
The coding sequences of 80 new genes (KIAA0201-KIAA0280) deduced by analysis of cDNA clones from cell line KG-1 and brain.', 'publicationDate': '1996', 'journal': 'DNA Res.', 'firstPage': '321', 'lastPage': '329', 'volume': '3'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA]'], 'referenceComments': [{'value': 'Bone marrow', 'type': 'TISSUE'}]}, {'referenceNumber': 2, 'citation': {'id': 'CI-5GBD0VIIJ7C63', 'citationType': 'submission', 'authors': ['Mural R.J.', 'Istrail S.', 'Sutton G.G.', 'Florea L.', 'Halpern A.L.', 'Mobarry C.M.', 'Lippert R.', 'Walenz B.', 'Shatkay H.', 'Dew I.', 'Miller J.R.', 'Flanigan M.J.', 'Edwards N.J.', 'Bolanos R.', 'Fasulo D.', 'Halldorsson B.V.', 'Hannenhalli S.', 'Turner R.', 'Yooseph S.', 'Lu F.', 'Nusskern D.R.', 'Shue B.C.', 'Zheng X.H.', 'Zhong F.', 'Delcher A.L.', 'Huson D.H.', 'Kravitz S.A.', 'Mouchard L.', 'Reinert K.', 'Remington K.A.', 'Clark A.G.', 'Waterman M.S.', 'Eichler E.E.', 'Adams M.D.', 'Hunkapiller M.W.', 'Myers E.W.', 'Venter J.C.'], 'publicationDate': 'SEP-2005', 'submissionDatabase': 'EMBL/GenBank/DDBJ databases'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]']}, {'referenceNumber': 3, 'citation': {'id': '15489334', 'citationType': 'journal article', 'authoringGroup': ['The MGC Project Team'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15489334'}, {'database': 'DOI', 'id': '10.1101/gr.2596504'}], 'title': 'The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).', 'publicationDate': '2004', 'journal': 'Genome Res.', 'firstPage': '2121', 'lastPage': '2127', 'volume': '14'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA]'], 'referenceComments': [{'value': 'Brain', 'type': 'TISSUE'}]}, {'referenceNumber': 4, 'citation': {'id': '10686344', 'citationType': 'journal article', 'authors': ['Lee Z.H.', 'Kim H.-H.', 'Ahn K.Y.', 'Seo K.H.', 'Kim J.K.', 'Bae C.S.', 'Kim K.K.'], 
'citationCrossReferences': [{'database': 'PubMed', 'id': '10686344'}, {'database': 'DOI', 'id': '10.1016/s0169-328x(99)00304-6'}], 'title': 'Identification of a brain specific protein that associates with a Refsum disease gene product, phytanoyl-CoA alpha-hydroxylase.', 'publicationDate': '2000', 'journal': 'Brain Res. Mol. Brain Res.', 'firstPage': '237', 'lastPage': '247', 'volume': '75'}, 'referencePositions': ['TISSUE SPECIFICITY', 'INTERACTION WITH PHYH']}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'D87463', 'properties': [{'key': 'ProteinId', 'value': 'BAA13402.2'}, {'key': 'Status', 'value': 'ALT_INIT'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'CH471080', 'properties': [{'key': 'ProteinId', 'value': 'EAW63693.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'CH471080', 'properties': [{'key': 'ProteinId', 'value': 'EAW63694.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'BC034034', 'properties': [{'key': 'ProteinId', 'value': 'AAH34034.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'BC035940', 'properties': [{'key': 'ProteinId', 'value': 'AAH35940.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'CCDS', 'id': 'CCDS43723.1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'RefSeq', 'id': 'NP_001092805.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001099335.2'}]}, {'database': 'RefSeq', 'id': 'NP_001350240.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001363311.2'}]}, {'database': 'RefSeq', 'id': 'NP_001350241.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001363312.2'}]}, {'database': 'RefSeq', 'id': 'NP_055574.3', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_014759.3'}]}, {'database': 
'RefSeq', 'id': 'XP_006716479.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_006716416.1'}]}, {'database': 'AlphaFoldDB', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SMR', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID', 'id': '115139', 'properties': [{'key': 'Interactions', 'value': '43'}]}, {'database': 'IntAct', 'id': 'Q92561', 'properties': [{'key': 'Interactions', 'value': '39'}]}, {'database': 'STRING', 'id': '9606.ENSP00000320017', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GlyCosmos', 'id': 'Q92561', 'properties': [{'key': 'glycosylation', 'value': '2 sites, No reported glycans'}]}, {'database': 'GlyGen', 'id': 'Q92561', 'properties': [{'key': 'glycosylation', 'value': '3 sites, 1 O-linked glycan (1 site)'}]}, {'database': 'iPTMnet', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhosphoSitePlus', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioMuta', 'id': 'PHYHIP', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DMDM', 'id': '2495731', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'jPOST', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MassIVE', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PaxDb', 'id': '9606-ENSP00000415491', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PeptideAtlas', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'ProteomicsDB', 'id': '75317', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Antibodypedia', 'id': '5280', 'properties': [{'key': 'antibodies', 'value': '142 antibodies from 23 providers'}]}, {'database': 'DNASU', 'id': '9796', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Ensembl', 'id': 
'ENST00000321613.7', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000320017.3'}, {'key': 'GeneId', 'value': 'ENSG00000168490.14'}]}, {'database': 'Ensembl', 'id': 'ENST00000454243.7', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000415491.2'}, {'key': 'GeneId', 'value': 'ENSG00000168490.14'}]}, {'database': 'GeneID', 'id': '9796', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'KEGG', 'id': 'hsa:9796', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MANE-Select', 'id': 'ENST00000454243.7', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000415491.2'}, {'key': 'RefSeqNucleotideId', 'value': 'NM_014759.5'}, {'key': 'RefSeqProteinId', 'value': 'NP_055574.3'}]}, {'database': 'UCSC', 'id': 'uc003xbj.5', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'AGR', 'id': 'HGNC:16865', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CTD', 'id': '9796', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '9796', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneCards', 'id': 'PHYHIP', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HGNC', 'id': 'HGNC:16865', 'properties': [{'key': 'GeneName', 'value': 'PHYHIP'}]}, {'database': 'HPA', 'id': 'ENSG00000168490', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Tissue enriched (brain)'}]}, {'database': 'MIM', 'id': '608511', 'properties': [{'key': 'Type', 'value': 'gene'}]}, {'database': 'neXtProt', 'id': 'NX_Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OpenTargets', 'id': 'ENSG00000168490', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PharmGKB', 'id': 'PA33281', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'VEuPathDB', 'id': 'HostDB:ENSG00000168490', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'eggNOG', 'id': 'ENOG502QQIT', 'properties': 
[{'key': 'ToxonomicScope', 'value': 'Eukaryota'}]}, {'database': 'GeneTree', 'id': 'ENSGT00390000014563', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HOGENOM', 'id': 'CLU_054218_1_0_1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'InParanoid', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OMA', 'id': 'SPGDHFC', 'properties': [{'key': 'Fingerprint', 'value': '-'}]}, {'database': 'OrthoDB', 'id': '6101761at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhylomeDB', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'TreeFam', 'id': 'TF314485', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PathwayCommons', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SignaLink', 'id': 'Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID-ORCS', 'id': '9796', 'properties': [{'key': 'hits', 'value': '15 hits in 1155 CRISPR screens'}]}, {'database': 'CD-CODE', 'id': 'FB4E32DD', 'properties': [{'key': 'EntryName', 'value': 'Presynaptic clusters and postsynaptic densities'}]}, {'database': 'ChiTaRS', 'id': 'PHYHIP', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'GenomeRNAi', 'id': '9796', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Pharos', 'id': 'Q92561', 'properties': [{'key': 'DevelopmentLevel', 'value': 'Tdark'}]}, {'database': 'PRO', 'id': 'PR:Q92561', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Proteomes', 'id': 'UP000005640', 'properties': [{'key': 'Component', 'value': 'Chromosome 8'}]}, {'database': 'RNAct', 'id': 'Q92561', 'properties': [{'key': 'moleculeType', 'value': 'protein'}]}, {'database': 'Bgee', 'id': 'ENSG00000168490', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Expressed in right hemisphere of cerebellum and 132 other cell types or 
tissues'}]}, {'database': 'ExpressionAtlas', 'id': 'Q92561', 'properties': [{'key': 'ExpressionPatterns', 'value': 'baseline and differential'}]}, {'database': 'GO', 'id': 'GO:0005737', 'properties': [{'key': 'GoTerm', 'value': 'C:cytoplasm'}, {'key': 'GoEvidenceType', 'value': 'IDA:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000314', 'source': 'PubMed', 'id': '15694837'}]}, {'database': 'GO', 'id': 'GO:1990782', 'properties': [{'key': 'GoTerm', 'value': 'F:protein tyrosine kinase binding'}, {'key': 'GoEvidenceType', 'value': 'IPI:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000353', 'source': 'PubMed', 'id': '15694837'}]}, {'database': 'GO', 'id': 'GO:0008104', 'properties': [{'key': 'GoTerm', 'value': 'P:protein localization'}, {'key': 'GoEvidenceType', 'value': 'IDA:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000314', 'source': 'PubMed', 'id': '15694837'}]}, {'database': 'CDD', 'id': 'cd00063', 'properties': [{'key': 'EntryName', 'value': 'FN3'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'FunFam', 'id': '2.60.40.10:FF:000277', 'properties': [{'key': 'EntryName', 'value': 'Phytanoyl-CoA hydroxylase-interacting protein-like protein'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Gene3D', 'id': '2.60.40.10', 'properties': [{'key': 'EntryName', 'value': 'Immunoglobulins'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR003961', 'properties': [{'key': 'EntryName', 'value': 'FN3_dom'}]}, {'database': 'InterPro', 'id': 'IPR036116', 'properties': [{'key': 'EntryName', 'value': 'FN3_sf'}]}, {'database': 'InterPro', 'id': 'IPR013783', 'properties': [{'key': 'EntryName', 'value': 'Ig-like_fold'}]}, {'database': 'InterPro', 'id': 'IPR042868', 'properties': [{'key': 'EntryName', 'value': 'PHYHIP/PHYHIPL'}]}, {'database': 'InterPro', 'id': 'IPR045545', 'properties': [{'key': 'EntryName', 'value': 'PHYIP/PHIPL_C'}]}, {'database': 'PANTHER', 'id': 'PTHR15698:SF9', 'properties': [{'key': 'EntryName', 'value': 
'PHYTANOYL-COA HYDROXYLASE-INTERACTING PROTEIN'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PANTHER', 'id': 'PTHR15698', 'properties': [{'key': 'EntryName', 'value': 'PROTEIN CBG15099'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF19281', 'properties': [{'key': 'EntryName', 'value': 'PHYHIP_C'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SUPFAM', 'id': 'SSF49265', 'properties': [{'key': 'EntryName', 'value': 'Fibronectin type III'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50853', 'properties': [{'key': 'EntryName', 'value': 'FN3'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': {'value': 'MELLSTPHSIEINNITCDSFRISWAMEDSDLERVTHYFIDLNKKENKNSNKFKHRDVPTKLVAKAVPLPMTVRGHWFLSPRTEYSVAVQTAVKQSDGEYLVSGWSETVEFCTGDYAKEHLAQLQEKAEQIAGRMLRFSVFYRNHHKEYFQHARTHCGNMLQPYLKDNSGSHGSPTSGMLHGVFFSCNTEFNTGQPPQDSPYGRWRFQIPAQRLFNPSTNLYFADFYCMYTAYHYAILVLAPKGSLGDRFCRDRLPLLDIACNKFLTCSVEDGELVFRHAQDLILEIIYTEPVDLSLGTLGEISGHQLMSLSTADAKKDPSCKTCNISVGR', 'length': 330, 'molWeight': 37573, 'crc64': '777199E6BB071D1C', 'md5': 'DE19CBC5CB9CD947C17A4920C019DFC8'}, 'extraAttributes': {'countByCommentType': {'FUNCTION': 1, 'SUBUNIT': 1, 'INTERACTION': 1, 'TISSUE SPECIFICITY': 1, 'SIMILARITY': 1, 'SEQUENCE CAUTION': 1}, 'countByFeatureType': {'Chain': 1, 'Domain': 1, 'Glycosylation': 2, 'Natural variant': 1, 'Sequence conflict': 3}, 'uniParcId': 'UPI0000139557'}}}, {'from': '56992', 'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)', 'primaryAccession': 'Q9NS87', 'secondaryAccessions': ['Q17RV9', 'Q69YL6', 'Q96JX7', 'Q9H280'], 'uniProtkbId': 'KIF15_HUMAN', 'entryAudit': {'firstPublicDate': '2008-04-08', 'lastAnnotationUpdateDate': '2025-04-09', 'lastSequenceUpdateDate': '2000-10-01', 'entryVersion': 177, 'sequenceVersion': 1}, 'annotationScore': 5.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 
'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '1: Evidence at protein level', 'proteinDescription': {'recommendedName': {'fullName': {'value': 'Kinesin-like protein KIF15'}}, 'alternativeNames': [{'fullName': {'value': 'Kinesin-like protein 2'}, 'shortNames': [{'value': 'hKLP2'}]}, {'fullName': {'value': 'Kinesin-like protein 7'}}, {'fullName': {'value': 'Serologically defined breast cancer antigen NY-BR-62'}}]}, 'genes': [{'geneName': {'value': 'KIF15'}, 'synonyms': [{'value': 'KLP2'}, {'value': 'KNSL7'}]}], 'comments': [{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000250'}], 'value': 'Plus-end directed kinesin-like motor enzyme involved in mitotic spindle assembly'}], 'commentType': 'FUNCTION'}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '10878014'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '12612055'}], 'value': 'Interacts with MKI67 and TPX2'}], 'commentType': 'SUBUNIT'}, {'commentType': 'INTERACTION', 'interactions': [{'interactantOne': {'uniProtKBAccession': 'Q9NS87', 'intActId': 'EBI-712159'}, 'interactantTwo': {'uniProtKBAccession': 'P46013', 'geneName': 'MKI67', 'intActId': 'EBI-876367'}, 'numberOfExperiments': 3, 'organismDiffer': False}]}, {'commentType': 'SUBCELLULAR LOCATION', 'note': {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000250'}], 'value': 'Detected during the interphase in the cytoplasm as finely punctuate pattern and irregularly shaped dots. Detected during mitosis on the mitotic spindle. Colocalizes with TPX2 in mitosis. Localizes at the central spindle at anaphase (By similarity). Localizes at the sites of invaginating cell membranes, a position that corresponds to the location of the contractile actomyosin ring of dividing cells (By similarity). Colocalizes with actin in interphase (By similarity). 
Colocalizes in dendrites and in growth cone of axons with microtubules (By similarity)'}]}, 'subcellularLocations': [{'location': {'value': 'Cytoplasm', 'id': 'SL-0086'}}, {'location': {'value': 'Cytoplasm, cytoskeleton, spindle', 'id': 'SL-0251'}}]}, {'commentType': 'ALTERNATIVE PRODUCTS', 'events': ['Alternative splicing'], 'isoforms': [{'name': {'value': '1'}, 'isoformIds': ['Q9NS87-1'], 'isoformSequenceStatus': 'Displayed'}, {'name': {'value': '2'}, 'isoformIds': ['Q9NS87-2'], 'sequenceIds': ['VSP_032753'], 'isoformSequenceStatus': 'Described'}, {'name': {'value': '3'}, 'isoformIds': ['Q9NS87-3'], 'sequenceIds': ['VSP_032752'], 'isoformSequenceStatus': 'Described'}, {'name': {'value': '4'}, 'isoformIds': ['Q9NS87-4'], 'sequenceIds': ['VSP_032754', 'VSP_032755'], 'isoformSequenceStatus': 'Described'}]}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '12747765'}], 'value': 'Expressed in testis, colon, thymus and in breast cancer'}], 'commentType': 'TISSUE SPECIFICITY'}, {'commentType': 'DISEASE', 'disease': {'diseaseId': 'Braddock-Carey syndrome 2', 'diseaseAccession': 'DI-06453', 'acronym': 'BRDCS2', 'description': 'An autosomal recessive disease characterized by microcephaly, congenital thrombocytopenia, and facial dysmorphisms including Pierre-Robin sequence.', 'diseaseCrossReference': {'database': 'MIM', 'id': '619981'}, 'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '28150392'}]}, 'note': {'texts': [{'value': 'The disease may be caused by variants affecting the gene represented in this entry'}]}}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000255', 'source': 'PROSITE-ProRule', 'id': 'PRU00283'}], 'value': 'Belongs to the TRAFAC class myosin-kinesin ATPase superfamily. Kinesin family. 
KLP2 subfamily'}], 'commentType': 'SIMILARITY'}, {'commentType': 'SEQUENCE CAUTION', 'sequenceCautionType': 'Frameshift', 'sequence': 'AAG48261.1', 'evidences': [{'evidenceCode': 'ECO:0000305'}]}], 'features': [{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 1388, 'modifier': 'EXACT'}}, 'description': 'Kinesin-like protein KIF15', 'featureId': 'PRO_0000328684'}, {'type': 'Domain', 'location': {'start': {'value': 26, 'modifier': 'EXACT'}, 'end': {'value': 363, 'modifier': 'EXACT'}}, 'description': 'Kinesin motor', 'evidences': [{'evidenceCode': 'ECO:0000255', 'source': 'PROSITE-ProRule', 'id': 'PRU00283'}]}, {'type': 'Region', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 25, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Region', 'location': {'start': {'value': 1228, 'modifier': 'EXACT'}, 'end': {'value': 1250, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Coiled coil', 'location': {'start': {'value': 368, 'modifier': 'EXACT'}, 'end': {'value': 1388, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000255'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 10, 'modifier': 'EXACT'}, 'end': {'value': 22, 'modifier': 'EXACT'}}, 'description': 'Polar residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Binding site', 'location': {'start': {'value': 109, 'modifier': 'EXACT'}, 'end': {'value': 116, 'modifier': 'EXACT'}}, 'description': '', 'featureCrossReferences': [{'database': 'ChEBI', 'id': 'CHEBI:30616'}], 'evidences': [{'evidenceCode': 'ECO:0000255', 'source': 'PROSITE-ProRule', 'id': 'PRU00283'}], 'ligand': {'name': 'ATP', 'id': 'ChEBI:CHEBI:30616'}}, {'type': 'Modified residue', 'location': 
{'start': {'value': 399, 'modifier': 'EXACT'}, 'end': {'value': 399, 'modifier': 'EXACT'}}, 'description': 'Phosphothreonine', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '23186163'}]}, {'type': 'Modified residue', 'location': {'start': {'value': 568, 'modifier': 'EXACT'}, 'end': {'value': 568, 'modifier': 'EXACT'}}, 'description': 'Phosphoserine', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '18669648'}]}, {'type': 'Modified residue', 'location': {'start': {'value': 1009, 'modifier': 'EXACT'}, 'end': {'value': 1009, 'modifier': 'EXACT'}}, 'description': 'N6-acetyllysine', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '19608861'}]}, {'type': 'Modified residue', 'location': {'start': {'value': 1141, 'modifier': 'EXACT'}, 'end': {'value': 1141, 'modifier': 'EXACT'}}, 'description': 'Phosphoserine', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '23186163'}]}, {'type': 'Modified residue', 'location': {'start': {'value': 1169, 'modifier': 'EXACT'}, 'end': {'value': 1169, 'modifier': 'EXACT'}}, 'description': 'Phosphoserine', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '20068231'}]}, {'type': 'Alternative sequence', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 952, 'modifier': 'EXACT'}}, 'description': 'in isoform 3', 'evidences': [{'evidenceCode': 'ECO:0000303', 'source': 'PubMed', 'id': '14702039'}], 'featureId': 'VSP_032752', 'alternativeSequence': {}}, {'type': 'Alternative sequence', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 97, 'modifier': 'EXACT'}}, 'description': 'in isoform 2', 'evidences': [{'evidenceCode': 'ECO:0000303', 'source': 'PubMed', 'id': '15489334'}], 'featureId': 'VSP_032753', 'alternativeSequence': {}}, {'type': 'Alternative sequence', 'location': {'start': {'value': 1092, 'modifier': 'EXACT'}, 'end': {'value': 1119, 'modifier': 'EXACT'}}, 'description': 
'in isoform 4', 'evidences': [{'evidenceCode': 'ECO:0000303', 'source': 'PubMed', 'id': '12747765'}], 'featureId': 'VSP_032754', 'alternativeSequence': {'originalSequence': 'ELTKKEALIQELQHKLNQKKEEVEQKKN', 'alternativeSequences': ['RTDQERSPDSGTSAQAKPKERGSRTEEE']}}, {'type': 'Alternative sequence', 'location': {'start': {'value': 1200, 'modifier': 'EXACT'}, 'end': {'value': 1388, 'modifier': 'EXACT'}}, 'description': 'in isoform 4', 'evidences': [{'evidenceCode': 'ECO:0000303', 'source': 'PubMed', 'id': '12747765'}], 'featureId': 'VSP_032755', 'alternativeSequence': {}}, {'type': 'Natural variant', 'location': {'start': {'value': 211, 'modifier': 'EXACT'}, 'end': {'value': 211, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs34862960', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs34862960'}], 'featureId': 'VAR_042464', 'alternativeSequence': {'originalSequence': 'A', 'alternativeSequences': ['V']}}, {'type': 'Natural variant', 'location': {'start': {'value': 501, 'modifier': 'EXACT'}, 'end': {'value': 1388, 'modifier': 'EXACT'}}, 'description': 'in BRDCS2; uncertain significance; dbSNP:rs1002572191', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs1002572191'}], 'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '28150392'}], 'featureId': 'VAR_087453', 'alternativeSequence': {}}, {'type': 'Natural variant', 'location': {'start': {'value': 996, 'modifier': 'EXACT'}, 'end': {'value': 996, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs11710339', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs11710339'}], 'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '12747765'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '17974005'}], 'featureId': 'VAR_042465', 'alternativeSequence': {'originalSequence': 'T', 'alternativeSequences': ['S']}}, {'type': 'Natural variant', 'location': {'start': {'value': 1206, 'modifier': 'EXACT'}, 'end': {'value': 1206, 'modifier': 'EXACT'}}, 
'description': 'in dbSNP:rs3804583', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs3804583'}], 'featureId': 'VAR_042466', 'alternativeSequence': {'originalSequence': 'L', 'alternativeSequences': ['M']}}, {'type': 'Natural variant', 'location': {'start': {'value': 1272, 'modifier': 'EXACT'}, 'end': {'value': 1272, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs17076986', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs17076986'}], 'featureId': 'VAR_042467', 'alternativeSequence': {'originalSequence': 'E', 'alternativeSequences': ['D']}}, {'type': 'Sequence conflict', 'location': {'start': {'value': 1057, 'modifier': 'EXACT'}, 'end': {'value': 1057, 'modifier': 'EXACT'}}, 'description': 'in Ref. 5; AAG48261', 'evidences': [{'evidenceCode': 'ECO:0000305'}], 'alternativeSequence': {'originalSequence': 'E', 'alternativeSequences': ['K']}}, {'type': 'Beta strand', 'location': {'start': {'value': 28, 'modifier': 'EXACT'}, 'end': {'value': 33, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 50, 'modifier': 'EXACT'}, 'end': {'value': 55, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 58, 'modifier': 'EXACT'}, 'end': {'value': 61, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 64, 'modifier': 'EXACT'}, 'end': {'value': 66, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 68, 'modifier': 'EXACT'}, 'end': {'value': 71, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, 
{'type': 'Beta strand', 'location': {'start': {'value': 73, 'modifier': 'EXACT'}, 'end': {'value': 76, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 82, 'modifier': 'EXACT'}, 'end': {'value': 98, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 103, 'modifier': 'EXACT'}, 'end': {'value': 108, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 115, 'modifier': 'EXACT'}, 'end': {'value': 119, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 135, 'modifier': 'EXACT'}, 'end': {'value': 150, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 158, 'modifier': 'EXACT'}, 'end': {'value': 172, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 174, 'modifier': 'EXACT'}, 'end': {'value': 176, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 186, 'modifier': 'EXACT'}, 'end': {'value': 189, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 195, 'modifier': 'EXACT'}, 'end': {'value': 198, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 
'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 208, 'modifier': 'EXACT'}, 'end': {'value': 226, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 228, 'modifier': 'EXACT'}, 'end': {'value': 231, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 236, 'modifier': 'EXACT'}, 'end': {'value': 250, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 253, 'modifier': 'EXACT'}, 'end': {'value': 265, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 290, 'modifier': 'EXACT'}, 'end': {'value': 306, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 307, 'modifier': 'EXACT'}, 'end': {'value': 309, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 315, 'modifier': 'EXACT'}, 'end': {'value': 317, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 319, 'modifier': 'EXACT'}, 'end': {'value': 323, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 325, 'modifier': 'EXACT'}, 'end': {'value': 327, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 
'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Beta strand', 'location': {'start': {'value': 330, 'modifier': 'EXACT'}, 'end': {'value': 340, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 344, 'modifier': 'EXACT'}, 'end': {'value': 346, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}, {'type': 'Helix', 'location': {'start': {'value': 347, 'modifier': 'EXACT'}, 'end': {'value': 360, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PDB', 'id': '4BN2'}]}], 'keywords': [{'id': 'KW-0002', 'category': 'Technical term', 'name': '3D-structure'}, {'id': 'KW-0007', 'category': 'PTM', 'name': 'Acetylation'}, {'id': 'KW-0025', 'category': 'Coding sequence diversity', 'name': 'Alternative splicing'}, {'id': 'KW-0067', 'category': 'Ligand', 'name': 'ATP-binding'}, {'id': 'KW-0175', 'category': 'Domain', 'name': 'Coiled coil'}, {'id': 'KW-0963', 'category': 'Cellular component', 'name': 'Cytoplasm'}, {'id': 'KW-0206', 'category': 'Cellular component', 'name': 'Cytoskeleton'}, {'id': 'KW-0493', 'category': 'Cellular component', 'name': 'Microtubule'}, {'id': 'KW-0505', 'category': 'Molecular function', 'name': 'Motor protein'}, {'id': 'KW-0547', 'category': 'Ligand', 'name': 'Nucleotide-binding'}, {'id': 'KW-0597', 'category': 'PTM', 'name': 'Phosphoprotein'}, {'id': 'KW-1267', 'category': 'Technical term', 'name': 'Proteomics identification'}, {'id': 'KW-1185', 'category': 'Technical term', 'name': 'Reference proteome'}], 'references': [{'referenceNumber': 1, 'citation': {'id': '10878014', 'citationType': 'journal article', 'authors': ['Sueishi M.', 'Takagi M.', 'Yoneda Y.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '10878014'}, {'database': 'DOI', 'id': '10.1074/jbc.m003879200'}], 'title': 'The 
forkhead-associated domain of Ki-67 antigen interacts with the novel kinesin-like protein Hklp2.', 'publicationDate': '2000', 'journal': 'J. Biol. Chem.', 'firstPage': '28888', 'lastPage': '28892', 'volume': '275'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [MRNA] (ISOFORM 1)', 'INTERACTION WITH MKI67', 'SUBCELLULAR LOCATION']}, {'referenceNumber': 2, 'citation': {'id': '14702039', 'citationType': 'journal article', 'authors': ['Ota T.', 'Suzuki Y.', 'Nishikawa T.', 'Otsuki T.', 'Sugiyama T.', 'Irie R.', 'Wakamatsu A.', 'Hayashi K.', 'Sato H.', 'Nagai K.', 'Kimura K.', 'Makita H.', 'Sekine M.', 'Obayashi M.', 'Nishi T.', 'Shibahara T.', 'Tanaka T.', 'Ishii S.', 'Yamamoto J.', 'Saito K.', 'Kawai Y.', 'Isono Y.', 'Nakamura Y.', 'Nagahari K.', 'Murakami K.', 'Yasuda T.', 'Iwayanagi T.', 'Wagatsuma M.', 'Shiratori A.', 'Sudo H.', 'Hosoiri T.', 'Kaku Y.', 'Kodaira H.', 'Kondo H.', 'Sugawara M.', 'Takahashi M.', 'Kanda K.', 'Yokoi T.', 'Furuya T.', 'Kikkawa E.', 'Omura Y.', 'Abe K.', 'Kamihara K.', 'Katsuta N.', 'Sato K.', 'Tanikawa M.', 'Yamazaki M.', 'Ninomiya K.', 'Ishibashi T.', 'Yamashita H.', 'Murakawa K.', 'Fujimori K.', 'Tanai H.', 'Kimata M.', 'Watanabe M.', 'Hiraoka S.', 'Chiba Y.', 'Ishida S.', 'Ono Y.', 'Takiguchi S.', 'Watanabe S.', 'Yosida M.', 'Hotuta T.', 'Kusano J.', 'Kanehori K.', 'Takahashi-Fujii A.', 'Hara H.', 'Tanase T.-O.', 'Nomura Y.', 'Togiya S.', 'Komai F.', 'Hara R.', 'Takeuchi K.', 'Arita M.', 'Imose N.', 'Musashino K.', 'Yuuki H.', 'Oshima A.', 'Sasaki N.', 'Aotsuka S.', 'Yoshikawa Y.', 'Matsunawa H.', 'Ichihara T.', 'Shiohata N.', 'Sano S.', 'Moriya S.', 'Momiyama H.', 'Satoh N.', 'Takami S.', 'Terashima Y.', 'Suzuki O.', 'Nakagawa S.', 'Senoh A.', 'Mizoguchi H.', 'Goto Y.', 'Shimizu F.', 'Wakebe H.', 'Hishigaki H.', 'Watanabe T.', 'Sugiyama A.', 'Takemoto M.', 'Kawakami B.', 'Yamazaki M.', 'Watanabe K.', 'Kumagai A.', 'Itakura S.', 'Fukuzumi Y.', 'Fujimori Y.', 'Komiyama M.', 'Tashiro H.', 'Tanigami A.', 'Fujiwara T.', 'Ono T.', 'Yamada 
K.', 'Fujii Y.', 'Ozaki K.', 'Hirao M.', 'Ohmori Y.', 'Kawabata A.', 'Hikiji T.', 'Kobatake N.', 'Inagaki H.', 'Ikema Y.', 'Okamoto S.', 'Okitani R.', 'Kawakami T.', 'Noguchi S.', 'Itoh T.', 'Shigeta K.', 'Senba T.', 'Matsumura K.', 'Nakajima Y.', 'Mizuno T.', 'Morinaga M.', 'Sasaki M.', 'Togashi T.', 'Oyama M.', 'Hata H.', 'Watanabe M.', 'Komatsu T.', 'Mizushima-Sugano J.', 'Satoh T.', 'Shirai Y.', 'Takahashi Y.', 'Nakagawa K.', 'Okumura K.', 'Nagase T.', 'Nomura N.', 'Kikuchi H.', 'Masuho Y.', 'Yamashita R.', 'Nakai K.', 'Yada T.', 'Nakamura Y.', 'Ohara O.', 'Isogai T.', 'Sugano S.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '14702039'}, {'database': 'DOI', 'id': '10.1038/ng1285'}], 'title': 'Complete sequencing and characterization of 21,243 full-length human cDNAs.', 'publicationDate': '2004', 'journal': 'Nat. Genet.', 'firstPage': '40', 'lastPage': '45', 'volume': '36'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 3)'], 'referenceComments': [{'value': 'Placenta', 'type': 'TISSUE'}]}, {'referenceNumber': 3, 'citation': {'id': '15489334', 'citationType': 'journal article', 'authoringGroup': ['The MGC Project Team'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15489334'}, {'database': 'DOI', 'id': '10.1101/gr.2596504'}], 'title': 'The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).', 'publicationDate': '2004', 'journal': 'Genome Res.', 'firstPage': '2121', 'lastPage': '2127', 'volume': '14'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] (ISOFORM 2)'], 'referenceComments': [{'value': 'Cerebellum', 'type': 'TISSUE'}]}, {'referenceNumber': 4, 'citation': {'id': '17974005', 'citationType': 'journal article', 'authors': ['Bechtel S.', 'Rosenfelder H.', 'Duda A.', 'Schmidt C.P.', 'Ernst U.', 'Wellenreuther R.', 'Mehrle A.', 'Schuster C.', 'Bahr A.', 'Bloecker H.', 'Heubner D.', 'Hoerlein A.', 'Michel G.', 'Wedler H.', 'Koehrer K.', 
'Ottenwaelder B.', 'Poustka A.', 'Wiemann S.', 'Schupp I.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '17974005'}, {'database': 'DOI', 'id': '10.1186/1471-2164-8-399'}], 'title': 'The full-ORF clone resource of the German cDNA consortium.', 'publicationDate': '2007', 'journal': 'BMC Genomics', 'firstPage': '399', 'lastPage': '399', 'volume': '8'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 339-1388 (ISOFORMS 1/2)', 'VARIANT SER-996'], 'referenceComments': [{'value': 'Melanoma', 'type': 'TISSUE'}]}, {'referenceNumber': 5, 'citation': {'id': '12747765', 'citationType': 'journal article', 'authors': ['Scanlan M.J.', 'Gout I.', 'Gordon C.M.', 'Williamson B.', 'Stockert E.', 'Gure A.O.', 'Jaeger D.', 'Chen Y.-T.', 'Mackay A.', "O'Hare M.J.", 'Old L.J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '12747765'}], 'title': 'Humoral immunity to human breast cancer: antigen definition and quantitative analysis of mRNA expression.', 'publicationDate': '2001', 'journal': 'Cancer Immun.', 'firstPage': '4', 'lastPage': '4', 'volume': '1'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [MRNA] OF 786-1388 (ISOFORM 4)', 'TISSUE SPECIFICITY', 'VARIANT SER-996'], 'referenceComments': [{'value': 'Mammary gland', 'type': 'TISSUE'}]}, {'referenceNumber': 6, 'citation': {'id': '12612055', 'citationType': 'journal article', 'authors': ['Heidebrecht H.-J.', 'Adam-Klages S.', 'Szczepanowski M.', 'Pollmann M.', 'Buck F.', 'Endl E.', 'Kruse M.-L.', 'Rudolph P.', 'Parwaresch R.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '12612055'}], 'title': 'repp86: a human protein associated in the progression of mitosis.', 'publicationDate': '2003', 'journal': 'Mol. 
Cancer Res.', 'firstPage': '271', 'lastPage': '279', 'volume': '1'}, 'referencePositions': ['INTERACTION WITH TPX2', 'SUBCELLULAR LOCATION']}, {'referenceNumber': 7, 'citation': {'id': '18669648', 'citationType': 'journal article', 'authors': ['Dephoure N.', 'Zhou C.', 'Villen J.', 'Beausoleil S.A.', 'Bakalarski C.E.', 'Elledge S.J.', 'Gygi S.P.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '18669648'}, {'database': 'DOI', 'id': '10.1073/pnas.0805139105'}], 'title': 'A quantitative atlas of mitotic phosphorylation.', 'publicationDate': '2008', 'journal': 'Proc. Natl. Acad. Sci. U.S.A.', 'firstPage': '10762', 'lastPage': '10767', 'volume': '105'}, 'referencePositions': ['PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-568', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'referenceComments': [{'value': 'Cervix carcinoma', 'type': 'TISSUE'}]}, {'referenceNumber': 8, 'citation': {'id': '19608861', 'citationType': 'journal article', 'authors': ['Choudhary C.', 'Kumar C.', 'Gnad F.', 'Nielsen M.L.', 'Rehman M.', 'Walther T.C.', 'Olsen J.V.', 'Mann M.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '19608861'}, {'database': 'DOI', 'id': '10.1126/science.1175371'}], 'title': 'Lysine acetylation targets protein complexes and co-regulates major cellular functions.', 'publicationDate': '2009', 'journal': 'Science', 'firstPage': '834', 'lastPage': '840', 'volume': '325'}, 'referencePositions': ['ACETYLATION [LARGE SCALE ANALYSIS] AT LYS-1009', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]']}, {'referenceNumber': 9, 'citation': {'id': '20068231', 'citationType': 'journal article', 'authors': ['Olsen J.V.', 'Vermeulen M.', 'Santamaria A.', 'Kumar C.', 'Miller M.L.', 'Jensen L.J.', 'Gnad F.', 'Cox J.', 'Jensen T.S.', 'Nigg E.A.', 'Brunak S.', 'Mann M.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '20068231'}, {'database': 'DOI', 'id': '10.1126/scisignal.2000475'}], 'title': 'Quantitative phosphoproteomics reveals 
widespread full phosphorylation site occupancy during mitosis.', 'publicationDate': '2010', 'journal': 'Sci. Signal.', 'firstPage': 'RA3', 'lastPage': 'RA3', 'volume': '3'}, 'referencePositions': ['PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-1169', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'referenceComments': [{'value': 'Cervix carcinoma', 'type': 'TISSUE'}]}, {'referenceNumber': 10, 'citation': {'id': '21269460', 'citationType': 'journal article', 'authors': ['Burkard T.R.', 'Planyavsky M.', 'Kaupe I.', 'Breitwieser F.P.', 'Buerckstuemmer T.', 'Bennett K.L.', 'Superti-Furga G.', 'Colinge J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '21269460'}, {'database': 'DOI', 'id': '10.1186/1752-0509-5-17'}], 'title': 'Initial characterization of the human central proteome.', 'publicationDate': '2011', 'journal': 'BMC Syst. Biol.', 'firstPage': '17', 'lastPage': '17', 'volume': '5'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]']}, {'referenceNumber': 11, 'citation': {'id': '23186163', 'citationType': 'journal article', 'authors': ['Zhou H.', 'Di Palma S.', 'Preisinger C.', 'Peng M.', 'Polat A.N.', 'Heck A.J.', 'Mohammed S.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '23186163'}, {'database': 'DOI', 'id': '10.1021/pr300630k'}], 'title': 'Toward a comprehensive characterization of a human cancer cell phosphoproteome.', 'publicationDate': '2013', 'journal': 'J. 
Proteome Res.', 'firstPage': '260', 'lastPage': '271', 'volume': '12'}, 'referencePositions': ['PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT THR-399 AND SER-1141', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'referenceComments': [{'value': 'Cervix carcinoma', 'type': 'TISSUE'}, {'value': 'Erythroleukemia', 'type': 'TISSUE'}]}, {'referenceNumber': 12, 'citation': {'id': '28150392', 'citationType': 'journal article', 'authors': ['Sleiman P.M.A.', 'March M.', 'Nguyen K.', 'Tian L.', 'Pellegrino R.', 'Hou C.', 'Dridi W.', 'Sager M.', 'Housawi Y.H.', 'Hakonarson H.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '28150392'}, {'database': 'DOI', 'id': '10.1002/humu.23188'}], 'title': 'Loss-of-Function Mutations in KIF15 Underlying a Braddock-Carey Genocopy.', 'publicationDate': '2017', 'journal': 'Hum. Mutat.', 'firstPage': '507', 'lastPage': '510', 'volume': '38'}, 'referencePositions': ['INVOLVEMENT IN BRDCS2', 'VARIANT BRDCS2 501-ARG--SER-1388 DEL']}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'AB035898', 'properties': [{'key': 'ProteinId', 'value': 'BAB03309.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'AK027816', 'properties': [{'key': 'ProteinId', 'value': 'BAB55389.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'BC117174', 'properties': [{'key': 'ProteinId', 'value': 'AAI17175.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'AL832908', 'properties': [{'key': 'ProteinId', 'value': 'CAH10635.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'AF308294', 'properties': [{'key': 'ProteinId', 'value': 'AAG48261.1'}, {'key': 'Status', 'value': 'ALT_FRAME'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'CCDS', 'id': 'CCDS33744.1', 'properties': [{'key': 'Description', 'value': 
'-'}], 'isoformId': 'Q9NS87-1'}, {'database': 'RefSeq', 'id': 'NP_064627.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_020242.3'}], 'isoformId': 'Q9NS87-1'}, {'database': 'PDB', 'id': '4BN2', 'properties': [{'key': 'Method', 'value': 'X-ray'}, {'key': 'Resolution', 'value': '2.70 A'}, {'key': 'Chains', 'value': 'A/B/C=19-375'}]}, {'database': 'PDB', 'id': '6ZPH', 'properties': [{'key': 'Method', 'value': 'EM'}, {'key': 'Resolution', 'value': '6.90 A'}, {'key': 'Chains', 'value': 'B=1-374'}]}, {'database': 'PDB', 'id': '6ZPI', 'properties': [{'key': 'Method', 'value': 'EM'}, {'key': 'Resolution', 'value': '4.50 A'}, {'key': 'Chains', 'value': 'C=1-374'}]}, {'database': 'PDB', 'id': '7RYP', 'properties': [{'key': 'Method', 'value': 'EM'}, {'key': 'Resolution', 'value': '4.80 A'}, {'key': 'Chains', 'value': 'A=1-375'}]}, {'database': 'PDBsum', 'id': '4BN2', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PDBsum', 'id': '6ZPH', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PDBsum', 'id': '6ZPI', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PDBsum', 'id': '7RYP', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'AlphaFoldDB', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'EMDB', 'id': 'EMD-11339', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'EMDB', 'id': 'EMD-11340', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'EMDB', 'id': 'EMD-24744', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SMR', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID', 'id': '121307', 'properties': [{'key': 'Interactions', 'value': '92'}]}, {'database': 'DIP', 'id': 'DIP-28133N', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'IntAct', 'id': 'Q9NS87', 'properties': [{'key': 'Interactions', 'value': '34'}]}, {'database': 'MINT', 
'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'STRING', 'id': '9606.ENSP00000324020', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BindingDB', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'ChEMBL', 'id': 'CHEMBL3632454', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CarbonylDB', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GlyGen', 'id': 'Q9NS87', 'properties': [{'key': 'glycosylation', 'value': '3 sites, 2 N-linked glycans (2 sites), 1 O-linked glycan (1 site)'}]}, {'database': 'iPTMnet', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MetOSite', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhosphoSitePlus', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioMuta', 'id': 'KIF15', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DMDM', 'id': '74752937', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'jPOST', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MassIVE', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PaxDb', 'id': '9606-ENSP00000324020', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PeptideAtlas', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'ProteomicsDB', 'id': '82511', 'properties': [{'key': 'Description', 'value': '-'}], 'isoformId': 'Q9NS87-1'}, {'database': 'ProteomicsDB', 'id': '82512', 'properties': [{'key': 'Description', 'value': '-'}], 'isoformId': 'Q9NS87-2'}, {'database': 'ProteomicsDB', 'id': '82513', 'properties': [{'key': 'Description', 'value': '-'}], 'isoformId': 'Q9NS87-3'}, {'database': 'ProteomicsDB', 'id': '82514', 'properties': [{'key': 'Description', 'value': '-'}], 'isoformId': 
'Q9NS87-4'}, {'database': 'Pumba', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Antibodypedia', 'id': '29484', 'properties': [{'key': 'antibodies', 'value': '234 antibodies from 26 providers'}]}, {'database': 'DNASU', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Ensembl', 'id': 'ENST00000326047.9', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000324020.4'}, {'key': 'GeneId', 'value': 'ENSG00000163808.17'}], 'isoformId': 'Q9NS87-1'}, {'database': 'Ensembl', 'id': 'ENST00000627272.3', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000486004.1'}, {'key': 'GeneId', 'value': 'ENSG00000280610.3'}], 'isoformId': 'Q9NS87-1'}, {'database': 'GeneID', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'KEGG', 'id': 'hsa:56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MANE-Select', 'id': 'ENST00000326047.9', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000324020.4'}, {'key': 'RefSeqNucleotideId', 'value': 'NM_020242.3'}, {'key': 'RefSeqProteinId', 'value': 'NP_064627.1'}]}, {'database': 'UCSC', 'id': 'uc003cnx.5', 'properties': [{'key': 'OrganismName', 'value': 'human'}], 'isoformId': 'Q9NS87-1'}, {'database': 'AGR', 'id': 'HGNC:17273', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CTD', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneCards', 'id': 'KIF15', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HGNC', 'id': 'HGNC:17273', 'properties': [{'key': 'GeneName', 'value': 'KIF15'}]}, {'database': 'HPA', 'id': 'ENSG00000163808', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Group enriched (bone marrow, lymphoid tissue, testis)'}]}, {'database': 'MalaCards', 'id': 'KIF15', 'properties': [{'key': 'Description', 'value': '-'}]}, 
{'database': 'MIM', 'id': '617569', 'properties': [{'key': 'Type', 'value': 'gene'}]}, {'database': 'MIM', 'id': '619981', 'properties': [{'key': 'Type', 'value': 'phenotype'}]}, {'database': 'neXtProt', 'id': 'NX_Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OpenTargets', 'id': 'ENSG00000163808', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Orphanet', 'id': '261323', 'properties': [{'key': 'Disease', 'value': '21q22.11q22.12 microdeletion syndrome'}]}, {'database': 'PharmGKB', 'id': 'PA30183', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'VEuPathDB', 'id': 'HostDB:ENSG00000163808', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'eggNOG', 'id': 'KOG4280', 'properties': [{'key': 'ToxonomicScope', 'value': 'Eukaryota'}]}, {'database': 'GeneTree', 'id': 'ENSGT00940000156463', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HOGENOM', 'id': 'CLU_005295_0_0_1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'InParanoid', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OMA', 'id': 'CVKKEKF', 'properties': [{'key': 'Fingerprint', 'value': '-'}]}, {'database': 'OrthoDB', 'id': '3176171at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhylomeDB', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'TreeFam', 'id': 'TF320478', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PathwayCommons', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Reactome', 'id': 'R-HSA-2132295', 'properties': [{'key': 'PathwayName', 'value': 'MHC class II antigen presentation'}]}, {'database': 'Reactome', 'id': 'R-HSA-6811434', 'properties': [{'key': 'PathwayName', 'value': 'COPI-dependent Golgi-to-ER retrograde traffic'}]}, {'database': 'Reactome', 'id': 'R-HSA-983189', 'properties': [{'key': 'PathwayName', 
'value': 'Kinesins'}]}, {'database': 'SignaLink', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID-ORCS', 'id': '56992', 'properties': [{'key': 'hits', 'value': '61 hits in 1159 CRISPR screens'}]}, {'database': 'CD-CODE', 'id': '8C2F96ED', 'properties': [{'key': 'EntryName', 'value': 'Centrosome'}]}, {'database': 'ChiTaRS', 'id': 'KIF15', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'EvolutionaryTrace', 'id': 'Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneWiki', 'id': 'KIF15', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GenomeRNAi', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Pharos', 'id': 'Q9NS87', 'properties': [{'key': 'DevelopmentLevel', 'value': 'Tchem'}]}, {'database': 'PRO', 'id': 'PR:Q9NS87', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Proteomes', 'id': 'UP000005640', 'properties': [{'key': 'Component', 'value': 'Chromosome 3'}]}, {'database': 'RNAct', 'id': 'Q9NS87', 'properties': [{'key': 'moleculeType', 'value': 'protein'}]}, {'database': 'Bgee', 'id': 'ENSG00000163808', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Expressed in ventricular zone and 103 other cell types or tissues'}]}, {'database': 'ExpressionAtlas', 'id': 'Q9NS87', 'properties': [{'key': 'ExpressionPatterns', 'value': 'baseline and differential'}]}, {'database': 'GO', 'id': 'GO:0005813', 'properties': [{'key': 'GoTerm', 'value': 'C:centrosome'}, {'key': 'GoEvidenceType', 'value': 'TAS:ProtInc'}], 'evidences': [{'evidenceCode': 'ECO:0000304', 'source': 'PubMed', 'id': '10878014'}]}, {'database': 'GO', 'id': 'GO:0005737', 'properties': [{'key': 'GoTerm', 'value': 'C:cytoplasm'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0005829', 'properties': [{'key': 'GoTerm', 'value': 'C:cytosol'}, {'key': 'GoEvidenceType', 'value': 'TAS:Reactome'}]}, 
{'database': 'GO', 'id': 'GO:0005871', 'properties': [{'key': 'GoTerm', 'value': 'C:kinesin complex'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0016020', 'properties': [{'key': 'GoTerm', 'value': 'C:membrane'}, {'key': 'GoEvidenceType', 'value': 'HDA:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0007005', 'source': 'PubMed', 'id': '19946888'}]}, {'database': 'GO', 'id': 'GO:0005874', 'properties': [{'key': 'GoTerm', 'value': 'C:microtubule'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0005873', 'properties': [{'key': 'GoTerm', 'value': 'C:plus-end kinesin complex'}, {'key': 'GoEvidenceType', 'value': 'TAS:ProtInc'}], 'evidences': [{'evidenceCode': 'ECO:0000304', 'source': 'PubMed', 'id': '10878014'}]}, {'database': 'GO', 'id': 'GO:0000922', 'properties': [{'key': 'GoTerm', 'value': 'C:spindle pole'}, {'key': 'GoEvidenceType', 'value': 'ISS:UniProtKB'}]}, {'database': 'GO', 'id': 'GO:0005524', 'properties': [{'key': 'GoTerm', 'value': 'F:ATP binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:UniProtKB-KW'}]}, {'database': 'GO', 'id': 'GO:0016887', 'properties': [{'key': 'GoTerm', 'value': 'F:ATP hydrolysis activity'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0003774', 'properties': [{'key': 'GoTerm', 'value': 'F:cytoskeletal motor activity'}, {'key': 'GoEvidenceType', 'value': 'TAS:ProtInc'}], 'evidences': [{'evidenceCode': 'ECO:0000304', 'source': 'PubMed', 'id': '10878014'}]}, {'database': 'GO', 'id': 'GO:0008017', 'properties': [{'key': 'GoTerm', 'value': 'F:microtubule binding'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0003777', 'properties': [{'key': 'GoTerm', 'value': 'F:microtubule motor activity'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0008574', 'properties': [{'key': 'GoTerm', 'value': 'F:plus-end-directed microtubule motor activity'}, 
{'key': 'GoEvidenceType', 'value': 'ISS:UniProtKB'}]}, {'database': 'GO', 'id': 'GO:0051299', 'properties': [{'key': 'GoTerm', 'value': 'P:centrosome separation'}, {'key': 'GoEvidenceType', 'value': 'ISS:UniProtKB'}]}, {'database': 'GO', 'id': 'GO:0007018', 'properties': [{'key': 'GoTerm', 'value': 'P:microtubule-based movement'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0000278', 'properties': [{'key': 'GoTerm', 'value': 'P:mitotic cell cycle'}, {'key': 'GoEvidenceType', 'value': 'TAS:ProtInc'}], 'evidences': [{'evidenceCode': 'ECO:0000304', 'source': 'PubMed', 'id': '10878014'}]}, {'database': 'GO', 'id': 'GO:0090307', 'properties': [{'key': 'GoTerm', 'value': 'P:mitotic spindle assembly'}, {'key': 'GoEvidenceType', 'value': 'ISS:UniProtKB'}]}, {'database': 'CDD', 'id': 'cd01373', 'properties': [{'key': 'EntryName', 'value': 'KISc_KLP2_like'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'FunFam', 'id': '3.40.850.10:FF:000034', 'properties': [{'key': 'EntryName', 'value': 'Kinesin family member 15'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Gene3D', 'id': '3.40.850.10', 'properties': [{'key': 'EntryName', 'value': 'Kinesin motor domain'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR031794', 'properties': [{'key': 'EntryName', 'value': 'HMMR_C'}]}, {'database': 'InterPro', 'id': 'IPR044986', 'properties': [{'key': 'EntryName', 'value': 'KIF15/KIN-12'}]}, {'database': 'InterPro', 'id': 'IPR001752', 'properties': [{'key': 'EntryName', 'value': 'Kinesin_motor_dom'}]}, {'database': 'InterPro', 'id': 'IPR036961', 'properties': [{'key': 'EntryName', 'value': 'Kinesin_motor_dom_sf'}]}, {'database': 'InterPro', 'id': 'IPR027417', 'properties': [{'key': 'EntryName', 'value': 'P-loop_NTPase'}]}, {'database': 'PANTHER', 'id': 'PTHR37739', 'properties': [{'key': 'EntryName', 'value': 'KINESIN-LIKE PROTEIN KIN-12D'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PANTHER', 
'id': 'PTHR37739:SF8', 'properties': [{'key': 'EntryName', 'value': 'KINESIN-LIKE PROTEIN KIN-12D'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF15908', 'properties': [{'key': 'EntryName', 'value': 'HMMR_C'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF00225', 'properties': [{'key': 'EntryName', 'value': 'Kinesin'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PRINTS', 'id': 'PR00380', 'properties': [{'key': 'EntryName', 'value': 'KINESINHEAVY'}]}, {'database': 'SMART', 'id': 'SM00129', 'properties': [{'key': 'EntryName', 'value': 'KISc'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SUPFAM', 'id': 'SSF52540', 'properties': [{'key': 'EntryName', 'value': 'P-loop containing nucleoside triphosphate hydrolases'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50067', 'properties': [{'key': 'EntryName', 'value': 'KINESIN_MOTOR_2'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': {'value': 'MAPGCKTELRSVTNGQSNQPSNEGDAIKVFVRIRPPAERSGSADGEQNLCLSVLSSTSLRLHSNPEPKTFTFDHVADVDTTQESVFATVAKSIVESCMSGYNGTIFAYGQTGSGKTFTMMGPSESDNFSHNLRGVIPRSFEYLFSLIDREKEKAGAGKSFLCKCSFIEIYNEQIYDLLDSASAGLYLREHIKKGVFVVGAVEQVVTSAAEAYQVLSGGWRNRRVASTSMNRESSRSHAVFTITIESMEKSNEIVNIRTSLLNLVDLAGSERQKDTHAEGMRLKEAGNINRSLSCLGQVITALVDVGNGKQRHVCYRDSKLTFLLRDSLGGNAKTAIIANVHPGSRCFGETLSTLNFAQRAKLIKNKAVVNEDTQGNVSQLQAEVKRLKEQLAELASGQTPPESFLTRDKKKTNYMEYFQEAMLFFKKSEQEKKSLIEKVTQLEDLTLKKEKFIQSNKMIVKFREDQIIRLEKLHKESRGGFLPEEQDRLLSELRNEIQTLREQIEHHPRVAKYAMENHSLREENRRLRLLEPVKRAQEMDAQTIAKLEKAFSEISGMEKSDKNQQGFSPKAQKEPCLFANTEKLKAQLLQIQTELNNSKQEYEEFKELTRKRQLELESELQSLQKANLNLENLLEATKACKRQEVSQLNKIHAETLKIITTPTKAYQLHSRPVPKLSPEMGSFGSLYTQNSSILDNDILNEPVPPEMNEQAFEAISEELRTVQEQMSALQAKLDEEEHKNLKLQQHVDKLEHHSTQMQELFSSERIDWTKQQEELLSQLNVLEKQLQETQTKNDFLKSEVHDLRVVLHSADKELSSVKLEYSSFKTNQEKEFNKLSERHMHVQLQLDNLRLENEKLLESKACLQDSYDNLQEIMKFEIDQLSRNLQNFKKENETLKSDLNNLMELLEAEKERNNKLSLQFEEDKENSSKEILKVLEAVRQEKQKETAKCEQQMAKVQKLEESLLATEKVISSLEKSRDSDKKVVADLMNQIQELRTSVCE
KTETIDTLKQELKDINCKYNSALVDREESRVLIKKQEVDILDLKETLRLRILSEDIERDMLCEDLAHATEQLNMLTEASKKHSGLLQSAQEELTKKEALIQELQHKLNQKKEEVEQKKNEYNFKMRQLEHVMDSAAEDPQSPKTPPHFQTHLAKLLETQEQEIEDGRASKTSLEHLVTKLNEDREVKNAEILRMKEQLREMENLRLESQQLIEKNWLLQGQLDDIKRQKENSDQNHPDNQQLKNEQEESIKERLAKSKIVEEMLKMKADLEEVQSALYNKEMECLRMTDEVERTQTLESKAFQEKEQLRSKLEEMYEERERTSQEMEMLRKQVECLAEENGKLVGHQNLHQKIQYVVRLKKENVRLAEETEKLRAENVFLKEKKRSES', 'length': 1388, 'molWeight': 160160, 'crc64': 'E127EB4B991CA83A', 'md5': '18B5E449B1DA7BB17C0011DC56489ABE'}, 'extraAttributes': {'countByCommentType': {'FUNCTION': 1, 'SUBUNIT': 1, 'INTERACTION': 1, 'SUBCELLULAR LOCATION': 1, 'ALTERNATIVE PRODUCTS': 4, 'TISSUE SPECIFICITY': 1, 'DISEASE': 1, 'SIMILARITY': 1, 'SEQUENCE CAUTION': 1}, 'countByFeatureType': {'Chain': 1, 'Domain': 1, 'Region': 2, 'Coiled coil': 1, 'Compositional bias': 1, 'Binding site': 1, 'Modified residue': 5, 'Alternative sequence': 4, 'Natural variant': 5, 'Sequence conflict': 1, 'Beta strand': 16, 'Helix': 10}, 'uniParcId': 'UPI000006DB0E'}}}, {'from': '56992', 'to': {'entryType': 'UniProtKB unreviewed (TrEMBL)', 'primaryAccession': 'C9JKA9', 'uniProtkbId': 'C9JKA9_HUMAN', 'entryAudit': {'firstPublicDate': '2009-11-03', 'lastAnnotationUpdateDate': '2025-04-02', 'lastSequenceUpdateDate': '2009-11-03', 'entryVersion': 111, 'sequenceVersion': 1}, 'annotationScore': 1.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000389982.1'}, {'evidenceCode': 'ECO:0000313', 'source': 'Proteomes', 'id': 'UP000005640'}], 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '1: Evidence at protein level', 'proteinDescription': {'submissionNames': [{'fullName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 
'ENSP00000389982.1'}], 'value': 'Kinesin family member 15'}}]}, 'genes': [{'geneName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000389982.1'}], 'value': 'KIF15'}}], 'comments': [{'commentType': 'SUBCELLULAR LOCATION', 'subcellularLocations': [{'location': {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00004186'}], 'value': 'Cytoplasm, cytoskeleton, spindle', 'id': 'SL-0251'}}]}], 'features': [{'type': 'Domain', 'location': {'start': {'value': 914, 'modifier': 'EXACT'}, 'end': {'value': 1013, 'modifier': 'EXACT'}}, 'description': 'Hyaluronan-mediated motility receptor C-terminal', 'evidences': [{'evidenceCode': 'ECO:0000259', 'source': 'Pfam', 'id': 'PF15908'}]}, {'type': 'Region', 'location': {'start': {'value': 863, 'modifier': 'EXACT'}, 'end': {'value': 885, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Coiled coil', 'location': {'start': {'value': 5, 'modifier': 'EXACT'}, 'end': {'value': 32, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'Coils'}]}, {'type': 'Coiled coil', 'location': {'start': {'value': 216, 'modifier': 'EXACT'}, 'end': {'value': 278, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'Coils'}]}, {'type': 'Coiled coil', 'location': {'start': {'value': 347, 'modifier': 'EXACT'}, 'end': {'value': 434, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'Coils'}]}, {'type': 'Coiled coil', 'location': {'start': {'value': 481, 'modifier': 'EXACT'}, 'end': {'value': 611, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'Coils'}]}, {'type': 'Coiled coil', 'location': {'start': {'value': 721, 'modifier': 'EXACT'}, 'end': {'value': 766, 
'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'Coils'}]}, {'type': 'Coiled coil', 'location': {'start': {'value': 812, 'modifier': 'EXACT'}, 'end': {'value': 849, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'Coils'}]}, {'type': 'Coiled coil', 'location': {'start': {'value': 933, 'modifier': 'EXACT'}, 'end': {'value': 1018, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'Coils'}]}], 'keywords': [{'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00023054'}, {'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'Coils'}], 'id': 'KW-0175', 'category': 'Domain', 'name': 'Coiled coil'}, {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00022490'}], 'id': 'KW-0963', 'category': 'Cellular component', 'name': 'Cytoplasm'}, {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00023212'}], 'id': 'KW-0206', 'category': 'Cellular component', 'name': 'Cytoskeleton'}, {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00022701'}], 'id': 'KW-0493', 'category': 'Cellular component', 'name': 'Microtubule'}, {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00023175'}], 'id': 'KW-0505', 'category': 'Molecular function', 'name': 'Motor protein'}, {'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PeptideAtlas', 'id': 'C9JKA9'}, {'evidenceCode': 'ECO:0007829', 'source': 'ProteomicsDB', 'id': 'C9JKA9'}], 'id': 'KW-1267', 'category': 'Technical term', 'name': 'Proteomics identification'}, {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Proteomes', 'id': 'UP000005640'}], 'id': 'KW-1185', 'category': 'Technical term', 'name': 'Reference proteome'}], 'references': [{'referenceNumber': 1, 'citation': {'id': '11237011', 'citationType': 'journal article', 
'authoringGroup': ['International Human Genome Sequencing Consortium'], 'authors': ['Lander E.S.', 'Linton L.M.', 'Birren B.', 'Nusbaum C.', 'Zody M.C.', 'Baldwin J.', 'Devon K.', 'Dewar K.', 'Doyle M.', 'FitzHugh W.', 'Funke R.', 'Gage D.', 'Harris K.', 'Heaford A.', 'Howland J.', 'Kann L.', 'Lehoczky J.', 'LeVine R.', 'McEwan P.', 'McKernan K.', 'Meldrim J.', 'Mesirov J.P.', 'Miranda C.', 'Morris W.', 'Naylor J.', 'Raymond C.', 'Rosetti M.', 'Santos R.', 'Sheridan A.', 'Sougnez C.', 'Stange-Thomann N.', 'Stojanovic N.', 'Subramanian A.', 'Wyman D.', 'Rogers J.', 'Sulston J.', 'Ainscough R.', 'Beck S.', 'Bentley D.', 'Burton J.', 'Clee C.', 'Carter N.', 'Coulson A.', 'Deadman R.', 'Deloukas P.', 'Dunham A.', 'Dunham I.', 'Durbin R.', 'French L.', 'Grafham D.', 'Gregory S.', 'Hubbard T.', 'Humphray S.', 'Hunt A.', 'Jones M.', 'Lloyd C.', 'McMurray A.', 'Matthews L.', 'Mercer S.', 'Milne S.', 'Mullikin J.C.', 'Mungall A.', 'Plumb R.', 'Ross M.', 'Shownkeen R.', 'Sims S.', 'Waterston R.H.', 'Wilson R.K.', 'Hillier L.W.', 'McPherson J.D.', 'Marra M.A.', 'Mardis E.R.', 'Fulton L.A.', 'Chinwalla A.T.', 'Pepin K.H.', 'Gish W.R.', 'Chissoe S.L.', 'Wendl M.C.', 'Delehaunty K.D.', 'Miner T.L.', 'Delehaunty A.', 'Kramer J.B.', 'Cook L.L.', 'Fulton R.S.', 'Johnson D.L.', 'Minx P.J.', 'Clifton S.W.', 'Hawkins T.', 'Branscomb E.', 'Predki P.', 'Richardson P.', 'Wenning S.', 'Slezak T.', 'Doggett N.', 'Cheng J.F.', 'Olsen A.', 'Lucas S.', 'Elkin C.', 'Uberbacher E.', 'Frazier M.', 'Gibbs R.A.', 'Muzny D.M.', 'Scherer S.E.', 'Bouck J.B.', 'Sodergren E.J.', 'Worley K.C.', 'Rives C.M.', 'Gorrell J.H.', 'Metzker M.L.', 'Naylor S.L.', 'Kucherlapati R.S.', 'Nelson D.L.', 'Weinstock G.M.', 'Sakaki Y.', 'Fujiyama A.', 'Hattori M.', 'Yada T.', 'Toyoda A.', 'Itoh T.', 'Kawagoe C.', 'Watanabe H.', 'Totoki Y.', 'Taylor T.', 'Weissenbach J.', 'Heilig R.', 'Saurin W.', 'Artiguenave F.', 'Brottier P.', 'Bruls T.', 'Pelletier E.', 'Robert C.', 'Wincker P.', 'Smith D.R.', 'Doucette-Stamm L.', 
'Rubenfield M.', 'Weinstock K.', 'Lee H.M.', 'Dubois J.', 'Rosenthal A.', 'Platzer M.', 'Nyakatura G.', 'Taudien S.', 'Rump A.', 'Yang H.', 'Yu J.', 'Wang J.', 'Huang G.', 'Gu J.', 'Hood L.', 'Rowen L.', 'Madan A.', 'Qin S.', 'Davis R.W.', 'Federspiel N.A.', 'Abola A.P.', 'Proctor M.J.', 'Myers R.M.', 'Schmutz J.', 'Dickson M.', 'Grimwood J.', 'Cox D.R.', 'Olson M.V.', 'Kaul R.', 'Raymond C.', 'Shimizu N.', 'Kawasaki K.', 'Minoshima S.', 'Evans G.A.', 'Athanasiou M.', 'Schultz R.', 'Roe B.A.', 'Chen F.', 'Pan H.', 'Ramser J.', 'Lehrach H.', 'Reinhardt R.', 'McCombie W.R.', 'de la Bastide M.', 'Dedhia N.', 'Blocker H.', 'Hornischer K.', 'Nordsiek G.', 'Agarwala R.', 'Aravind L.', 'Bailey J.A.', 'Bateman A.', 'Batzoglou S.', 'Birney E.', 'Bork P.', 'Brown D.G.', 'Burge C.B.', 'Cerutti L.', 'Chen H.C.', 'Church D.', 'Clamp M.', 'Copley R.R.', 'Doerks T.', 'Eddy S.R.', 'Eichler E.E.', 'Furey T.S.', 'Galagan J.', 'Gilbert J.G.', 'Harmon C.', 'Hayashizaki Y.', 'Haussler D.', 'Hermjakob H.', 'Hokamp K.', 'Jang W.', 'Johnson L.S.', 'Jones T.A.', 'Kasif S.', 'Kaspryzk A.', 'Kennedy S.', 'Kent W.J.', 'Kitts P.', 'Koonin E.V.', 'Korf I.', 'Kulp D.', 'Lancet D.', 'Lowe T.M.', 'McLysaght A.', 'Mikkelsen T.', 'Moran J.V.', 'Mulder N.', 'Pollara V.J.', 'Ponting C.P.', 'Schuler G.', 'Schultz J.', 'Slater G.', 'Smit A.F.', 'Stupka E.', 'Szustakowski J.', 'Thierry-Mieg D.', 'Thierry-Mieg J.', 'Wagner L.', 'Wallis J.', 'Wheeler R.', 'Williams A.', 'Wolf Y.I.', 'Wolfe K.H.', 'Yang S.P.', 'Yeh R.F.', 'Collins F.', 'Guyer M.S.', 'Peterson J.', 'Felsenfeld A.', 'Wetterstrand K.A.', 'Patrinos A.', 'Morgan M.J.', 'de Jong P.', 'Catanese J.J.', 'Osoegawa K.', 'Shizuya H.', 'Choi S.', 'Chen Y.J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '11237011'}, {'database': 'DOI', 'id': '10.1038/35057062'}], 'title': 'Initial sequencing and analysis of the human genome.', 'publicationDate': '2001', 'journal': 'Nature', 'firstPage': '860', 'lastPage': '921', 'volume': '409'}, 
'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000389982.1'}]}, {'referenceNumber': 2, 'citation': {'id': '15496913', 'citationType': 'journal article', 'authoringGroup': ['International Human Genome Sequencing Consortium'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15496913'}, {'database': 'DOI', 'id': '10.1038/nature03001'}], 'title': 'Finishing the euchromatic sequence of the human genome.', 'publicationDate': '2004', 'journal': 'Nature', 'firstPage': '931', 'lastPage': '945', 'volume': '431'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000389982.1'}]}, {'referenceNumber': 3, 'citation': {'id': '16641997', 'citationType': 'journal article', 'authors': ['Muzny D.M.', 'Scherer S.E.', 'Kaul R.', 'Wang J.', 'Yu J.', 'Sudbrak R.', 'Buhay C.J.', 'Chen R.', 'Cree A.', 'Ding Y.', 'Dugan-Rocha S.', 'Gill R.', 'Gunaratne P.', 'Harris R.A.', 'Hawes A.C.', 'Hernandez J.', 'Hodgson A.V.', 'Hume J.', 'Jackson A.', 'Khan Z.M.', 'Kovar-Smith C.', 'Lewis L.R.', 'Lozado R.J.', 'Metzker M.L.', 'Milosavljevic A.', 'Miner G.R.', 'Morgan M.B.', 'Nazareth L.V.', 'Scott G.', 'Sodergren E.', 'Song X.Z.', 'Steffen D.', 'Wei S.', 'Wheeler D.A.', 'Wright M.W.', 'Worley K.C.', 'Yuan Y.', 'Zhang Z.', 'Adams C.Q.', 'Ansari-Lari M.A.', 'Ayele M.', 'Brown M.J.', 'Chen G.', 'Chen Z.', 'Clendenning J.', 'Clerc-Blankenburg K.P.', 'Chen R.', 'Chen Z.', 'Davis C.', 'Delgado O.', 'Dinh H.H.', 'Dong W.', 'Draper H.', 'Ernst S.', 'Fu G.', 'Gonzalez-Garay M.L.', 'Garcia D.K.', 'Gillett W.', 'Gu J.', 'Hao B.', 'Haugen E.', 'Havlak P.', 'He X.', 'Hennig S.', 'Hu S.', 'Huang W.', 'Jackson L.R.', 'Jacob L.S.', 'Kelly S.H.', 'Kube M.', 'Levy R.', 'Li Z.', 'Liu B.', 'Liu J.', 'Liu W.', 'Lu J.', 'Maheshwari M.', 'Nguyen B.V.', 'Okwuonu G.O.', 'Palmeiri A.', 'Pasternak S.', 'Perez 
L.M.', 'Phelps K.A.', 'Plopper F.J.', 'Qiang B.', 'Raymond C.', 'Rodriguez R.', 'Saenphimmachak C.', 'Santibanez J.', 'Shen H.', 'Shen Y.', 'Subramanian S.', 'Tabor P.E.', 'Verduzco D.', 'Waldron L.', 'Wang J.', 'Wang J.', 'Wang Q.', 'Williams G.A.', 'Wong G.K.', 'Yao Z.', 'Zhang J.', 'Zhang X.', 'Zhao G.', 'Zhou J.', 'Zhou Y.', 'Nelson D.', 'Lehrach H.', 'Reinhardt R.', 'Naylor S.L.', 'Yang H.', 'Olson M.', 'Weinstock G.', 'Gibbs R.A.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '16641997'}, {'database': 'DOI', 'id': '10.1038/nature04728'}], 'title': 'The DNA sequence, annotation and analysis of human chromosome 3.', 'publicationDate': '2006', 'journal': 'Nature', 'firstPage': '1194', 'lastPage': '1198', 'volume': '440'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000389982.1'}, {'evidenceCode': 'ECO:0000313', 'source': 'Proteomes', 'id': 'UP000005640'}]}, {'referenceNumber': 4, 'citation': {'id': '18669648', 'citationType': 'journal article', 'authors': ['Dephoure N.', 'Zhou C.', 'Villen J.', 'Beausoleil S.A.', 'Bakalarski C.E.', 'Elledge S.J.', 'Gygi S.P.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '18669648'}, {'database': 'DOI', 'id': '10.1073/pnas.0805139105'}], 'title': 'A quantitative atlas of mitotic phosphorylation.', 'publicationDate': '2008', 'journal': 'Proc. Natl. Acad. Sci. 
U.S.A.', 'firstPage': '10762', 'lastPage': '10767', 'volume': '105'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PubMed', 'id': '18669648'}]}, {'referenceNumber': 5, 'citation': {'id': '19608861', 'citationType': 'journal article', 'authors': ['Choudhary C.', 'Kumar C.', 'Gnad F.', 'Nielsen M.L.', 'Rehman M.', 'Walther T.C.', 'Olsen J.V.', 'Mann M.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '19608861'}, {'database': 'DOI', 'id': '10.1126/science.1175371'}], 'title': 'Lysine acetylation targets protein complexes and co-regulates major cellular functions.', 'publicationDate': '2009', 'journal': 'Science', 'firstPage': '834', 'lastPage': '840', 'volume': '325'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PubMed', 'id': '19608861'}]}, {'referenceNumber': 6, 'citation': {'id': '20068231', 'citationType': 'journal article', 'authors': ['Olsen J.V.', 'Vermeulen M.', 'Santamaria A.', 'Kumar C.', 'Miller M.L.', 'Jensen L.J.', 'Gnad F.', 'Cox J.', 'Jensen T.S.', 'Nigg E.A.', 'Brunak S.', 'Mann M.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '20068231'}, {'database': 'DOI', 'id': '10.1126/scisignal.2000475'}], 'title': 'Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis.', 'publicationDate': '2010', 'journal': 'Sci. 
Signal.', 'firstPage': 'RA3', 'lastPage': 'RA3', 'volume': '3'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PubMed', 'id': '20068231'}]}, {'referenceNumber': 7, 'citation': {'id': '21269460', 'citationType': 'journal article', 'authors': ['Burkard T.R.', 'Planyavsky M.', 'Kaupe I.', 'Breitwieser F.P.', 'Burckstummer T.', 'Bennett K.L.', 'Superti-Furga G.', 'Colinge J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '21269460'}, {'database': 'DOI', 'id': '10.1186/1752-0509-5-17'}], 'title': 'Initial characterization of the human central proteome.', 'publicationDate': '2011', 'journal': 'BMC Syst. Biol.', 'firstPage': '17', 'lastPage': '17', 'volume': '5'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PubMed', 'id': '21269460'}]}, {'referenceNumber': 8, 'citation': {'id': '23186163', 'citationType': 'journal article', 'authors': ['Zhou H.', 'Di Palma S.', 'Preisinger C.', 'Peng M.', 'Polat A.N.', 'Heck A.J.', 'Mohammed S.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '23186163'}], 'title': 'Toward a comprehensive characterization of a human cancer cell phosphoproteome.', 'publicationDate': '2013', 'journal': 'J. 
Proteome Res.', 'firstPage': '260', 'lastPage': '271', 'volume': '12'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PubMed', 'id': '23186163'}]}, {'referenceNumber': 9, 'citation': {'id': 'CI-20IMBF1U3E5V5', 'citationType': 'submission', 'authoringGroup': ['Ensembl'], 'publicationDate': 'DEC-2024', 'submissionDatabase': 'UniProtKB'}, 'referencePositions': ['IDENTIFICATION'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000389982.1'}]}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'AC098649', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'RefSeq', 'id': 'XP_006713327.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_006713264.4'}]}, {'database': 'RefSeq', 'id': 'XP_054187530.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331555.1'}]}, {'database': 'SMR', 'id': 'C9JKA9', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'ProteomicsDB', 'id': '10563', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Antibodypedia', 'id': '29484', 'properties': [{'key': 'antibodies', 'value': '234 antibodies from 26 providers'}]}, {'database': 'DNASU', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Ensembl', 'id': 'ENST00000425755.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000389982.1'}, {'key': 'GeneId', 'value': 'ENSG00000163808.17'}]}, {'database': 'Ensembl', 'id': 'ENST00000627705.2', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000487255.1'}, {'key': 'GeneId', 'value': 'ENSG00000280610.3'}]}, {'database': 'GeneID', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'UCSC', 'id': 'uc062iwb.1', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 
'CTD', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '56992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HGNC', 'id': 'HGNC:17273', 'properties': [{'key': 'GeneName', 'value': 'KIF15'}]}, {'database': 'VEuPathDB', 'id': 'HostDB:ENSG00000163808', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneTree', 'id': 'ENSGT00940000156463', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OrthoDB', 'id': '3176171at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID-ORCS', 'id': '56992', 'properties': [{'key': 'hits', 'value': '61 hits in 1159 CRISPR screens'}]}, {'database': 'ChiTaRS', 'id': 'KIF15', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'Proteomes', 'id': 'UP000005640', 'properties': [{'key': 'Component', 'value': 'Chromosome 3'}]}, {'database': 'Bgee', 'id': 'ENSG00000163808', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Expressed in ventricular zone and 103 other cell types or tissues'}]}, {'database': 'ExpressionAtlas', 'id': 'C9JKA9', 'properties': [{'key': 'ExpressionPatterns', 'value': 'baseline and differential'}]}, {'database': 'GO', 'id': 'GO:0005737', 'properties': [{'key': 'GoTerm', 'value': 'C:cytoplasm'}, {'key': 'GoEvidenceType', 'value': 'IEA:UniProtKB-KW'}]}, {'database': 'GO', 'id': 'GO:0005874', 'properties': [{'key': 'GoTerm', 'value': 'C:microtubule'}, {'key': 'GoEvidenceType', 'value': 'IEA:UniProtKB-KW'}]}, {'database': 'GO', 'id': 'GO:0005819', 'properties': [{'key': 'GoTerm', 'value': 'C:spindle'}, {'key': 'GoEvidenceType', 'value': 'IEA:UniProtKB-SubCell'}]}, {'database': 'InterPro', 'id': 'IPR031794', 'properties': [{'key': 'EntryName', 'value': 'HMMR_C'}]}, {'database': 'InterPro', 'id': 'IPR044986', 'properties': [{'key': 'EntryName', 'value': 'KIF15/KIN-12'}]}, {'database': 'PANTHER', 'id': 'PTHR37739', 'properties': [{'key': 'EntryName', 'value': 
'KINESIN-LIKE PROTEIN KIN-12D'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PANTHER', 'id': 'PTHR37739:SF8', 'properties': [{'key': 'EntryName', 'value': 'KINESIN-LIKE PROTEIN KIN-12D'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF15908', 'properties': [{'key': 'EntryName', 'value': 'HMMR_C'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': {'value': 'MAVVNEDTQGNVSQLQAEVKRLKEQLAELASGQTPPESFLTRDKKKTNYMEYFQEAMLFFKKSEQEKKSLIEKVTQLEDLTLKKEKFIQSNKMIVKFREDQIIRLEKLHKESRGGFLPEEQDRLLSELRNEIQTLREQIEHHPRVAKYAMENHSLREENRRLRLLEPVKRAQEMDAQTIAKLEKAFSEISGMEKSDKNQQGFSPKAQKEPCLFANTEKLKAQLLQIQTELNNSKQEYEEFKELTRKRQLELESELQSLQKANLNLENLLEATKACKRQEVSQLNKIHAETLKIITTPTKAYQLHSRPVPKLSPEMGSFGSLYTQNSSILDNDILNEPVPPEMNEQAFEAISEELRTVQEQMSALQAKLDEEEHKNLKLQQHVDKLEHHSTQMQELFSSERIDWTKQQEELLSQLNVLEKQLQETQTKNDFLKSEVHDLRVVLHSADKELSSVKLEYSSFKTNQEKEFNKLSERHMHVQLQLDNLRLENEKLLESKACLQDSYDNLQEIMKFEIDQLSRNLQNFKKENETLKSDLNNLMELLEAEKERNNKLSLQFEEDKENSSKEILKVLEAVRQEKQKETAKCEQQMAKVQKLEESLLATEKVISSLEKSRDSDKKVVADLMNQIQELRTSVCEKTETIDTLKQELKDINCKYNSALVDREESRVLIKKQEVDILDLKETLRLRILSEDIERDMLCEDLAHATEQLNMLTEASKKHSGLLQSAQEELTKKEALIQELQHKLNQKKEEVEQKKNEYNFKMRQLEHVMDSAAEDPQSPKTPPHFQTHLAKLLETQEQEIEDGRASKTSLEHLVTKLNEDREVKNAEILRMKEQLREMENLRLESQQLIEKNWLLQGQLDDIKRQKENSDQNHPDNQQLKNEQEESIKERLAKSKIVEEMLKMKADLEEVQSALYNKEMECLRMTDEVERTQTLESKAFQEKEQLRSKLEEMYEERERTSQEMEMLRKQVECLAEENGKLVGHQNLHQKIQYVVRLKKENVRLAEETEKLRAENVFLKEKKRSES', 'length': 1023, 'molWeight': 120339, 'crc64': '5A81F027F9B7A8D1', 'md5': 'D0885F17A99796E04E17F9552306118A'}, 'extraAttributes': {'countByCommentType': {'SUBCELLULAR LOCATION': 1}, 'countByFeatureType': {'Domain': 1, 'Region': 1, 'Coiled coil': 7}, 'uniParcId': 'UPI000198C9AA'}}}, {'from': '7918', 'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)', 'primaryAccession': 'O95872', 'secondaryAccessions': ['A6NG25', 'B0UXA2', 'Q5SQ49'], 'uniProtkbId': 'GPAN1_HUMAN', 'entryAudit': {'firstPublicDate': '2002-10-25', 'lastAnnotationUpdateDate': '2025-04-09', 
'lastSequenceUpdateDate': '1999-05-01', 'entryVersion': 187, 'sequenceVersion': 1}, 'annotationScore': 5.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '1: Evidence at protein level', 'proteinDescription': {'recommendedName': {'fullName': {'value': 'G patch domain and ankyrin repeat-containing protein 1'}}, 'alternativeNames': [{'fullName': {'value': 'Ankyrin repeat domain-containing protein 59'}}, {'fullName': {'value': 'G patch domain-containing protein 10'}}, {'fullName': {'value': 'HLA-B-associated transcript 4'}}, {'fullName': {'value': 'Protein G5'}}]}, 'genes': [{'geneName': {'value': 'GPANK1'}, 'synonyms': [{'value': 'ANKRD59'}, {'value': 'BAT4'}, {'value': 'G5'}, {'value': 'GPATCH10'}]}], 'comments': [{'commentType': 'INTERACTION', 'interactions': [{'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NYB9-2', 'geneName': 'ABI2', 'intActId': 'EBI-11096309'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9ULX6', 'geneName': 'AKAP8L', 'intActId': 'EBI-357530'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NYG5-2', 'geneName': 'ANAPC11', 'intActId': 'EBI-12224467'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q10567-3', 'geneName': 'AP1B1', 'intActId': 'EBI-11978055'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': 
{'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P63010-2', 'geneName': 'AP2B1', 'intActId': 'EBI-11529439'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O95429', 'geneName': 'BAG4', 'intActId': 'EBI-2949658'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BXJ5', 'geneName': 'C1QTNF2', 'intActId': 'EBI-2817707'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'A2RRN7', 'geneName': 'CADPS', 'intActId': 'EBI-10179719'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q52MB2', 'geneName': 'CCDC184', 'intActId': 'EBI-10179526'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P40199', 'geneName': 'CEACAM6', 'intActId': 'EBI-4314501'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'A8MTA8-2', 'geneName': 'CIMIP2B', 'intActId': 'EBI-12160437'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'A8MQ03', 'geneName': 'CYSRT1', 'intActId': 'EBI-3867333'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q8WTU0', 'geneName': 'DDI1', 
'intActId': 'EBI-748248'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O43143', 'geneName': 'DHX15', 'intActId': 'EBI-1237044'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NRI5-2', 'geneName': 'DISC1', 'intActId': 'EBI-11988027'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q16610', 'geneName': 'ECM1', 'intActId': 'EBI-947964'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O43281-2', 'geneName': 'EFS', 'intActId': 'EBI-11525448'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9C0B1-2', 'geneName': 'FTO', 'intActId': 'EBI-18138793'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q08379', 'geneName': 'GOLGA2', 'intActId': 'EBI-618309'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'A6NEM1', 'geneName': 'GOLGA6L9', 'intActId': 'EBI-5916454'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q6PI77', 'geneName': 'GPRASP3', 'intActId': 'EBI-11519926'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 
'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P31943', 'geneName': 'HNRNPH1', 'intActId': 'EBI-351590'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O75031', 'geneName': 'HSF2BP', 'intActId': 'EBI-7116203'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P04792', 'geneName': 'HSPB1', 'intActId': 'EBI-352682'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q96EL1', 'geneName': 'INKA1', 'intActId': 'EBI-10285157'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q1MX18', 'geneName': 'INSC', 'intActId': 'EBI-12081118'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q63ZY3', 'geneName': 'KANK2', 'intActId': 'EBI-2556193'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H079', 'geneName': 'KATNBL1', 'intActId': 'EBI-715394'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q7L273', 'geneName': 'KCTD9', 'intActId': 'EBI-4397613'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O60333-2', 'geneName': 'KIF1B', 'intActId': 'EBI-10975473'}, 'numberOfExperiments': 3, 
'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O95678', 'geneName': 'KRT75', 'intActId': 'EBI-2949715'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P60410', 'geneName': 'KRTAP10-8', 'intActId': 'EBI-10171774'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P59991', 'geneName': 'KRTAP12-2', 'intActId': 'EBI-10176379'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q3SY46', 'geneName': 'KRTAP13-3', 'intActId': 'EBI-10241252'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q3LI72', 'geneName': 'KRTAP19-5', 'intActId': 'EBI-1048945'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q3LI70', 'geneName': 'KRTAP19-6', 'intActId': 'EBI-12805508'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O95751', 'geneName': 'LDOC1', 'intActId': 'EBI-740738'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BRK4', 'geneName': 'LZTS2', 'intActId': 'EBI-741037'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': 
{'uniProtKBAccession': 'Q9Y5V3', 'geneName': 'MAGED1', 'intActId': 'EBI-716006'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q96A72', 'geneName': 'MAGOHB', 'intActId': 'EBI-746778'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q99750', 'geneName': 'MDFI', 'intActId': 'EBI-724076'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P50221', 'geneName': 'MEOX1', 'intActId': 'EBI-2864512'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q6FHY5', 'geneName': 'MEOX2', 'intActId': 'EBI-16439278'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q8N6F8', 'geneName': 'METTL27', 'intActId': 'EBI-8487781'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q8TCB7', 'geneName': 'METTL6', 'intActId': 'EBI-17861723'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UJV3-2', 'geneName': 'MID2', 'intActId': 'EBI-10172526'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q6PF18', 'geneName': 'MORN3', 'intActId': 'EBI-9675802'}, 'numberOfExperiments': 3, 'organismDiffer': False}, 
{'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q6IA69', 'geneName': 'NADSYN1', 'intActId': 'EBI-748610'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q8NI38', 'geneName': 'NFKBID', 'intActId': 'EBI-10271199'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q13133-3', 'geneName': 'NR1H3', 'intActId': 'EBI-11952806'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O43482', 'geneName': 'OIP5', 'intActId': 'EBI-536879'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q96CV9', 'geneName': 'OPTN', 'intActId': 'EBI-748974'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9HBE1-4', 'geneName': 'PATZ1', 'intActId': 'EBI-11022007'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P26367', 'geneName': 'PAX6', 'intActId': 'EBI-747278'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BYU1', 'geneName': 'PBX4', 'intActId': 'EBI-10302990'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q99471', 'geneName': 
'PFDN5', 'intActId': 'EBI-357275'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9HDD0', 'geneName': 'PLAAT1', 'intActId': 'EBI-12387058'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q8ND90', 'geneName': 'PNMA1', 'intActId': 'EBI-302345'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P60510', 'geneName': 'PPP4C', 'intActId': 'EBI-1046072'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NQX0', 'geneName': 'PRDM6', 'intActId': 'EBI-11320284'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q03431', 'geneName': 'PTH1R', 'intActId': 'EBI-2860297'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UFD9', 'geneName': 'RIMBP3', 'intActId': 'EBI-10182375'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y3C5', 'geneName': 'RNF11', 'intActId': 'EBI-396669'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'A6ZKI3', 'geneName': 'RTL8C', 'intActId': 'EBI-10174072'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 
'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O60504', 'geneName': 'SORBS3', 'intActId': 'EBI-741237'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UM82', 'geneName': 'SPATA2', 'intActId': 'EBI-744066'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NZ72', 'geneName': 'STMN3', 'intActId': 'EBI-725557'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q13190', 'geneName': 'STX5', 'intActId': 'EBI-714206'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q96N21', 'geneName': 'TEPSIN', 'intActId': 'EBI-11139477'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q12800', 'geneName': 'TFCP2', 'intActId': 'EBI-717422'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q08117-2', 'geneName': 'TLE5', 'intActId': 'EBI-11741437'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q12933', 'geneName': 'TRAF2', 'intActId': 'EBI-355744'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UL33-2', 'geneName': 'TRAPPC2L', 'intActId': 'EBI-11119202'}, 
'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P36406', 'geneName': 'TRIM23', 'intActId': 'EBI-740098'}, 'numberOfExperiments': 7, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P14373', 'geneName': 'TRIM27', 'intActId': 'EBI-719493'}, 'numberOfExperiments': 7, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q86WV8', 'geneName': 'TSC1', 'intActId': 'EBI-12806590'}, 'numberOfExperiments': 5, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q6DKK2', 'geneName': 'TTC19', 'intActId': 'EBI-948354'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q5W5X9-3', 'geneName': 'TTC23', 'intActId': 'EBI-9090990'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P61758', 'geneName': 'VBP1', 'intActId': 'EBI-357430'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'O76024', 'geneName': 'WFS1', 'intActId': 'EBI-720609'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q05516', 'geneName': 'ZBTB16', 'intActId': 'EBI-711925'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': 
{'uniProtKBAccession': 'Q8IWT0-2', 'geneName': 'ZBTB8OS', 'intActId': 'EBI-12956041'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UDW3', 'geneName': 'ZMAT5', 'intActId': 'EBI-7850213'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'P0C7X2', 'geneName': 'ZNF688', 'intActId': 'EBI-4395732'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UGI0', 'geneName': 'ZRANB1', 'intActId': 'EBI-527853'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'O95872', 'intActId': 'EBI-751540'}, 'interactantTwo': {'uniProtKBAccession': 'G4XUV3', 'intActId': 'EBI-10177989'}, 'numberOfExperiments': 3, 'organismDiffer': False}]}], 'features': [{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 356, 'modifier': 'EXACT'}}, 'description': 'G patch domain and ankyrin repeat-containing protein 1', 'featureId': 'PRO_0000066972'}, {'type': 'Repeat', 'location': {'start': {'value': 111, 'modifier': 'EXACT'}, 'end': {'value': 141, 'modifier': 'EXACT'}}, 'description': 'ANK 1'}, {'type': 'Repeat', 'location': {'start': {'value': 142, 'modifier': 'EXACT'}, 'end': {'value': 172, 'modifier': 'EXACT'}}, 'description': 'ANK 2'}, {'type': 'Domain', 'location': {'start': {'value': 255, 'modifier': 'EXACT'}, 'end': {'value': 301, 'modifier': 'EXACT'}}, 'description': 'G-patch', 'evidences': [{'evidenceCode': 'ECO:0000255', 'source': 'PROSITE-ProRule', 'id': 'PRU00092'}]}, {'type': 'Region', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 98, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': 
[{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 54, 'modifier': 'EXACT'}, 'end': {'value': 63, 'modifier': 'EXACT'}}, 'description': 'Polar residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 67, 'modifier': 'EXACT'}, 'end': {'value': 76, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 77, 'modifier': 'EXACT'}, 'end': {'value': 88, 'modifier': 'EXACT'}}, 'description': 'Low complexity', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Cross-link', 'location': {'start': {'value': 290, 'modifier': 'EXACT'}, 'end': {'value': 290, 'modifier': 'EXACT'}}, 'description': 'Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '28112733'}]}, {'type': 'Natural variant', 'location': {'start': {'value': 41, 'modifier': 'EXACT'}, 'end': {'value': 41, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs3130618', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs3130618'}], 'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '14574404'}], 'featureId': 'VAR_048291', 'alternativeSequence': {'originalSequence': 'R', 'alternativeSequences': ['L']}}, {'type': 'Natural variant', 'location': {'start': {'value': 112, 'modifier': 'EXACT'}, 'end': {'value': 112, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs35265780', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs35265780'}], 'featureId': 'VAR_048292', 'alternativeSequence': {'originalSequence': 'A', 'alternativeSequences': ['V']}}, {'type': 'Natural variant', 'location': {'start': {'value': 210, 'modifier': 
'EXACT'}, 'end': {'value': 210, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs34082689', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs34082689'}], 'featureId': 'VAR_048293', 'alternativeSequence': {'originalSequence': 'S', 'alternativeSequences': ['A']}}, {'type': 'Natural variant', 'location': {'start': {'value': 235, 'modifier': 'EXACT'}, 'end': {'value': 235, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs2295666', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs2295666'}], 'featureId': 'VAR_020096', 'alternativeSequence': {'originalSequence': 'A', 'alternativeSequences': ['V']}}, {'type': 'Natural variant', 'location': {'start': {'value': 314, 'modifier': 'EXACT'}, 'end': {'value': 314, 'modifier': 'EXACT'}}, 'description': 'found in a clear cell renal carcinoma case; somatic mutation', 'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '21248752'}], 'featureId': 'VAR_064698', 'alternativeSequence': {'originalSequence': 'T', 'alternativeSequences': ['N']}}], 'keywords': [{'id': 'KW-0040', 'category': 'Domain', 'name': 'ANK repeat'}, {'id': 'KW-1017', 'category': 'PTM', 'name': 'Isopeptide bond'}, {'id': 'KW-1267', 'category': 'Technical term', 'name': 'Proteomics identification'}, {'id': 'KW-1185', 'category': 'Technical term', 'name': 'Reference proteome'}, {'id': 'KW-0677', 'category': 'Domain', 'name': 'Repeat'}, {'id': 'KW-0832', 'category': 'PTM', 'name': 'Ubl conjugation'}], 'references': [{'referenceNumber': 1, 'citation': {'id': '14656967', 'citationType': 'journal article', 'authors': ['Xie T.', 'Rowen L.', 'Aguado B.', 'Ahearn M.E.', 'Madan A.', 'Qin S.', 'Campbell R.D.', 'Hood L.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '14656967'}, {'database': 'DOI', 'id': '10.1101/gr.1736803'}], 'title': 'Analysis of the gene-dense major histocompatibility complex class III region and its comparison to mouse.', 'publicationDate': '2003', 'journal': 'Genome Res.', 'firstPage': '2621', 'lastPage': 
'2636', 'volume': '13'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]']}, {'referenceNumber': 2, 'citation': {'id': 'CI-5GBDQ6B103N1E', 'citationType': 'submission', 'authors': ['Mural R.J.', 'Istrail S.', 'Sutton G.G.', 'Florea L.', 'Halpern A.L.', 'Mobarry C.M.', 'Lippert R.', 'Walenz B.', 'Shatkay H.', 'Dew I.', 'Miller J.R.', 'Flanigan M.J.', 'Edwards N.J.', 'Bolanos R.', 'Fasulo D.', 'Halldorsson B.V.', 'Hannenhalli S.', 'Turner R.', 'Yooseph S.', 'Lu F.', 'Nusskern D.R.', 'Shue B.C.', 'Zheng X.H.', 'Zhong F.', 'Delcher A.L.', 'Huson D.H.', 'Kravitz S.A.', 'Mouchard L.', 'Reinert K.', 'Remington K.A.', 'Clark A.G.', 'Waterman M.S.', 'Eichler E.E.', 'Adams M.D.', 'Hunkapiller M.W.', 'Myers E.W.', 'Venter J.C.'], 'publicationDate': 'JUL-2005', 'submissionDatabase': 'EMBL/GenBank/DDBJ databases'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]']}, {'referenceNumber': 3, 'citation': {'id': 'CI-C28MMRF9FQI79', 'citationType': 'submission', 'authors': ['Shiina S.', 'Tamiya G.', 'Oka A.', 'Inoko H.'], 'title': 'Homo sapiens 2,229,817bp genomic DNA of 6p21.3 HLA class I region.', 'publicationDate': 'SEP-1999', 'submissionDatabase': 'EMBL/GenBank/DDBJ databases'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]']}, {'referenceNumber': 4, 'citation': {'id': '14574404', 'citationType': 'journal article', 'authors': ['Mungall A.J.', 'Palmer S.A.', 'Sims S.K.', 'Edwards C.A.', 'Ashurst J.L.', 'Wilming L.', 'Jones M.C.', 'Horton R.', 'Hunt S.E.', 'Scott C.E.', 'Gilbert J.G.R.', 'Clamp M.E.', 'Bethel G.', 'Milne S.', 'Ainscough R.', 'Almeida J.P.', 'Ambrose K.D.', 'Andrews T.D.', 'Ashwell R.I.S.', 'Babbage A.K.', 'Bagguley C.L.', 'Bailey J.', 'Banerjee R.', 'Barker D.J.', 'Barlow K.F.', 'Bates K.', 'Beare D.M.', 'Beasley H.', 'Beasley O.', 'Bird C.P.', 'Blakey S.E.', 'Bray-Allen S.', 'Brook J.', 'Brown A.J.', 'Brown J.Y.', 'Burford D.C.', 'Burrill W.', 'Burton J.', 'Carder C.', 'Carter N.P.', 'Chapman 
J.C.', 'Clark S.Y.', 'Clark G.', 'Clee C.M.', 'Clegg S.', 'Cobley V.', 'Collier R.E.', 'Collins J.E.', 'Colman L.K.', 'Corby N.R.', 'Coville G.J.', 'Culley K.M.', 'Dhami P.', 'Davies J.', 'Dunn M.', 'Earthrowl M.E.', 'Ellington A.E.', 'Evans K.A.', 'Faulkner L.', 'Francis M.D.', 'Frankish A.', 'Frankland J.', 'French L.', 'Garner P.', 'Garnett J.', 'Ghori M.J.', 'Gilby L.M.', 'Gillson C.J.', 'Glithero R.J.', 'Grafham D.V.', 'Grant M.', 'Gribble S.', 'Griffiths C.', 'Griffiths M.N.D.', 'Hall R.', 'Halls K.S.', 'Hammond S.', 'Harley J.L.', 'Hart E.A.', 'Heath P.D.', 'Heathcott R.', 'Holmes S.J.', 'Howden P.J.', 'Howe K.L.', 'Howell G.R.', 'Huckle E.', 'Humphray S.J.', 'Humphries M.D.', 'Hunt A.R.', 'Johnson C.M.', 'Joy A.A.', 'Kay M.', 'Keenan S.J.', 'Kimberley A.M.', 'King A.', 'Laird G.K.', 'Langford C.', 'Lawlor S.', 'Leongamornlert D.A.', 'Leversha M.', 'Lloyd C.R.', 'Lloyd D.M.', 'Loveland J.E.', 'Lovell J.', 'Martin S.', 'Mashreghi-Mohammadi M.', 'Maslen G.L.', 'Matthews L.', 'McCann O.T.', 'McLaren S.J.', 'McLay K.', 'McMurray A.', 'Moore M.J.F.', 'Mullikin J.C.', 'Niblett D.', 'Nickerson T.', 'Novik K.L.', 'Oliver K.', 'Overton-Larty E.K.', 'Parker A.', 'Patel R.', 'Pearce A.V.', 'Peck A.I.', 'Phillimore B.J.C.T.', 'Phillips S.', 'Plumb R.W.', 'Porter K.M.', 'Ramsey Y.', 'Ranby S.A.', 'Rice C.M.', 'Ross M.T.', 'Searle S.M.', 'Sehra H.K.', 'Sheridan E.', 'Skuce C.D.', 'Smith S.', 'Smith M.', 'Spraggon L.', 'Squares S.L.', 'Steward C.A.', 'Sycamore N.', 'Tamlyn-Hall G.', 'Tester J.', 'Theaker A.J.', 'Thomas D.W.', 'Thorpe A.', 'Tracey A.', 'Tromans A.', 'Tubby B.', 'Wall M.', 'Wallis J.M.', 'West A.P.', 'White S.S.', 'Whitehead S.L.', 'Whittaker H.', 'Wild A.', 'Willey D.J.', 'Wilmer T.E.', 'Wood J.M.', 'Wray P.W.', 'Wyatt J.C.', 'Young L.', 'Younger R.M.', 'Bentley D.R.', 'Coulson A.', 'Durbin R.M.', 'Hubbard T.', 'Sulston J.E.', 'Dunham I.', 'Rogers J.', 'Beck S.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '14574404'}, {'database': 'DOI', 
'id': '10.1038/nature02055'}], 'title': 'The DNA sequence and analysis of human chromosome 6.', 'publicationDate': '2003', 'journal': 'Nature', 'firstPage': '805', 'lastPage': '811', 'volume': '425'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]', 'VARIANT LEU-41']}, {'referenceNumber': 5, 'citation': {'id': '15489334', 'citationType': 'journal article', 'authoringGroup': ['The MGC Project Team'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15489334'}, {'database': 'DOI', 'id': '10.1101/gr.2596504'}], 'title': 'The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).', 'publicationDate': '2004', 'journal': 'Genome Res.', 'firstPage': '2121', 'lastPage': '2127', 'volume': '14'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA]'], 'referenceComments': [{'value': 'Placenta', 'type': 'TISSUE'}]}, {'referenceNumber': 6, 'citation': {'id': '18669648', 'citationType': 'journal article', 'authors': ['Dephoure N.', 'Zhou C.', 'Villen J.', 'Beausoleil S.A.', 'Bakalarski C.E.', 'Elledge S.J.', 'Gygi S.P.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '18669648'}, {'database': 'DOI', 'id': '10.1073/pnas.0805139105'}], 'title': 'A quantitative atlas of mitotic phosphorylation.', 'publicationDate': '2008', 'journal': 'Proc. Natl. Acad. Sci. 
U.S.A.', 'firstPage': '10762', 'lastPage': '10767', 'volume': '105'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'referenceComments': [{'value': 'Cervix carcinoma', 'type': 'TISSUE'}]}, {'referenceNumber': 7, 'citation': {'id': '28112733', 'citationType': 'journal article', 'authors': ['Hendriks I.A.', 'Lyon D.', 'Young C.', 'Jensen L.J.', 'Vertegaal A.C.', 'Nielsen M.L.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '28112733'}, {'database': 'DOI', 'id': '10.1038/nsmb.3366'}], 'title': 'Site-specific mapping of the human SUMO proteome reveals co-modification with phosphorylation.', 'publicationDate': '2017', 'journal': 'Nat. Struct. Mol. Biol.', 'firstPage': '325', 'lastPage': '336', 'volume': '24'}, 'referencePositions': ['SUMOYLATION [LARGE SCALE ANALYSIS] AT LYS-290', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]']}, {'referenceNumber': 8, 'citation': {'id': '21248752', 'citationType': 'journal article', 'authors': ['Varela I.', 'Tarpey P.', 'Raine K.', 'Huang D.', 'Ong C.K.', 'Stephens P.', 'Davies H.', 'Jones D.', 'Lin M.L.', 'Teague J.', 'Bignell G.', 'Butler A.', 'Cho J.', 'Dalgliesh G.L.', 'Galappaththige D.', 'Greenman C.', 'Hardy C.', 'Jia M.', 'Latimer C.', 'Lau K.W.', 'Marshall J.', 'McLaren S.', 'Menzies A.', 'Mudie L.', 'Stebbings L.', 'Largaespada D.A.', 'Wessels L.F.A.', 'Richard S.', 'Kahnoski R.J.', 'Anema J.', 'Tuveson D.A.', 'Perez-Mancera P.A.', 'Mustonen V.', 'Fischer A.', 'Adams D.J.', 'Rust A.', 'Chan-On W.', 'Subimerb C.', 'Dykema K.', 'Furge K.', 'Campbell P.J.', 'Teh B.T.', 'Stratton M.R.', 'Futreal P.A.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '21248752'}, {'database': 'DOI', 'id': '10.1038/nature09639'}], 'title': 'Exome sequencing identifies frequent mutation of the SWI/SNF complex gene PBRM1 in renal carcinoma.', 'publicationDate': '2011', 'journal': 'Nature', 'firstPage': '539', 'lastPage': '542', 'volume': '469'}, 'referencePositions': 
['VARIANT ASN-314']}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'AF129756', 'properties': [{'key': 'ProteinId', 'value': 'AAD18082.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'BA000025', 'properties': [{'key': 'ProteinId', 'value': 'BAB63387.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'AL670886', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'AL805934', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'AL662899', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'BX511262', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'CR354443', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'CR759761', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'CH471081', 'properties': [{'key': 'ProteinId', 'value': 'EAX03467.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'BC008783', 'properties': [{'key': 'ProteinId', 'value': 'AAH08783.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'CCDS', 'id': 'CCDS4711.1', 'properties': [{'key': 'Description', 
'value': '-'}]}, {'database': 'RefSeq', 'id': 'NP_001186166.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199237.1'}]}, {'database': 'RefSeq', 'id': 'NP_001186167.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199238.1'}]}, {'database': 'RefSeq', 'id': 'NP_001186168.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199239.1'}]}, {'database': 'RefSeq', 'id': 'NP_001186169.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199240.1'}]}, {'database': 'RefSeq', 'id': 'NP_149417.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_033177.4'}]}, {'database': 'RefSeq', 'id': 'XP_005249460.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_005249403.4'}]}, {'database': 'RefSeq', 'id': 'XP_006715267.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_006715204.2'}]}, {'database': 'RefSeq', 'id': 'XP_011513211.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_011514909.2'}]}, {'database': 'RefSeq', 'id': 'XP_011513212.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_011514910.2'}]}, {'database': 'RefSeq', 'id': 'XP_024302317.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_024446549.2'}]}, {'database': 'RefSeq', 'id': 'XP_047275306.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_047419350.1'}]}, {'database': 'RefSeq', 'id': 'XP_047275307.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_047419351.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186648.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330673.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186649.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330674.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186650.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330675.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186651.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330676.1'}]}, 
{'database': 'RefSeq', 'id': 'XP_054186652.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330677.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186653.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330678.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186654.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330679.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186857.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330882.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186858.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330883.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186859.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330884.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186860.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330885.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186861.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330886.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186862.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330887.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186863.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330888.1'}]}, {'database': 'RefSeq', 'id': 'XP_054212377.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054356402.1'}]}, {'database': 'RefSeq', 'id': 'XP_054212378.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054356403.1'}]}, {'database': 'RefSeq', 'id': 'XP_054212379.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054356404.1'}]}, {'database': 'RefSeq', 'id': 'XP_054212380.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054356405.1'}]}, {'database': 'RefSeq', 'id': 'XP_054212381.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054356406.1'}]}, {'database': 'RefSeq', 'id': 'XP_054212382.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054356407.1'}]}, {'database': 
'RefSeq', 'id': 'XP_054212383.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054356408.1'}]}, {'database': 'AlphaFoldDB', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SMR', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID', 'id': '113648', 'properties': [{'key': 'Interactions', 'value': '100'}]}, {'database': 'IntAct', 'id': 'O95872', 'properties': [{'key': 'Interactions', 'value': '97'}]}, {'database': 'MINT', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'STRING', 'id': '9606.ENSP00000365071', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'iPTMnet', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhosphoSitePlus', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioMuta', 'id': 'GPANK1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'jPOST', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MassIVE', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PaxDb', 'id': '9606-ENSP00000365071', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PeptideAtlas', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'ProteomicsDB', 'id': '51112', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Pumba', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Antibodypedia', 'id': '27457', 'properties': [{'key': 'antibodies', 'value': '71 antibodies from 22 providers'}]}, {'database': 'DNASU', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Ensembl', 'id': 'ENST00000211377.7', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000211377.3'}, {'key': 'GeneId', 'value': 'ENSG00000236011.6'}]}, {'database': 'Ensembl', 'id': 
'ENST00000375893.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000365057.2'}, {'key': 'GeneId', 'value': 'ENSG00000204438.11'}]}, {'database': 'Ensembl', 'id': 'ENST00000375895.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000365059.2'}, {'key': 'GeneId', 'value': 'ENSG00000204438.11'}]}, {'database': 'Ensembl', 'id': 'ENST00000375896.9', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000365060.4'}, {'key': 'GeneId', 'value': 'ENSG00000204438.11'}]}, {'database': 'Ensembl', 'id': 'ENST00000375900.8', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000365065.4'}, {'key': 'GeneId', 'value': 'ENSG00000204438.11'}]}, {'database': 'Ensembl', 'id': 'ENST00000375906.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000365071.1'}, {'key': 'GeneId', 'value': 'ENSG00000204438.11'}]}, {'database': 'Ensembl', 'id': 'ENST00000383434.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000372926.2'}, {'key': 'GeneId', 'value': 'ENSG00000206408.8'}]}, {'database': 'Ensembl', 'id': 'ENST00000383436.8', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000372928.4'}, {'key': 'GeneId', 'value': 'ENSG00000206408.8'}]}, {'database': 'Ensembl', 'id': 'ENST00000400129.7', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000382994.3'}, {'key': 'GeneId', 'value': 'ENSG00000206408.8'}]}, {'database': 'Ensembl', 'id': 'ENST00000400134.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000382999.1'}, {'key': 'GeneId', 'value': 'ENSG00000206408.8'}]}, {'database': 'Ensembl', 'id': 'ENST00000400139.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000383005.1'}, {'key': 'GeneId', 'value': 'ENSG00000206408.8'}]}, {'database': 'Ensembl', 'id': 'ENST00000414528.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000389339.2'}, {'key': 'GeneId', 'value': 'ENSG00000228605.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000416384.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000401376.1'}, {'key': 'GeneId', 'value': 
'ENSG00000223932.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000417890.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000415855.2'}, {'key': 'GeneId', 'value': 'ENSG00000232312.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000418502.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000412321.2'}, {'key': 'GeneId', 'value': 'ENSG00000223932.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000422131.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000410690.2'}, {'key': 'GeneId', 'value': 'ENSG00000223932.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000426326.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000398927.2'}, {'key': 'GeneId', 'value': 'ENSG00000228605.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000427527.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000391604.1'}, {'key': 'GeneId', 'value': 'ENSG00000236011.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000429266.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000412884.1'}, {'key': 'GeneId', 'value': 'ENSG00000223932.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000433857.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000406895.2'}, {'key': 'GeneId', 'value': 'ENSG00000236011.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000433880.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000401272.1'}, {'key': 'GeneId', 'value': 'ENSG00000228605.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000439710.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000409206.1'}, {'key': 'GeneId', 'value': 'ENSG00000232312.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000440777.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000399890.1'}, {'key': 'GeneId', 'value': 'ENSG00000223932.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000447705.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000403829.2'}, {'key': 'GeneId', 'value': 'ENSG00000236011.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000448091.5', 'properties': [{'key': 'ProteinId', 'value': 
'ENSP00000411691.1'}, {'key': 'GeneId', 'value': 'ENSG00000232312.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000449868.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000393567.2'}, {'key': 'GeneId', 'value': 'ENSG00000232312.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000453746.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000390746.1'}, {'key': 'GeneId', 'value': 'ENSG00000232312.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000453899.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000413662.1'}, {'key': 'GeneId', 'value': 'ENSG00000228605.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000456191.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000401276.1'}, {'key': 'GeneId', 'value': 'ENSG00000236011.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000457505.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000406507.1'}, {'key': 'GeneId', 'value': 'ENSG00000228605.6'}]}, {'database': 'GeneID', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'KEGG', 'id': 'hsa:7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MANE-Select', 'id': 'ENST00000375896.9', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000365060.4'}, {'key': 'RefSeqNucleotideId', 'value': 'NM_033177.4'}, {'key': 'RefSeqProteinId', 'value': 'NP_149417.1'}]}, {'database': 'UCSC', 'id': 'uc003nvn.4', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'AGR', 'id': 'HGNC:13920', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CTD', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneCards', 'id': 'GPANK1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HGNC', 'id': 'HGNC:13920', 'properties': [{'key': 'GeneName', 'value': 'GPANK1'}]}, {'database': 'HPA', 'id': 'ENSG00000204438', 'properties': [{'key': 'ExpressionPatterns', 
'value': 'Low tissue specificity'}]}, {'database': 'MIM', 'id': '142610', 'properties': [{'key': 'Type', 'value': 'gene'}]}, {'database': 'neXtProt', 'id': 'NX_O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OpenTargets', 'id': 'ENSG00000204438', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PharmGKB', 'id': 'PA25265', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'VEuPathDB', 'id': 'HostDB:ENSG00000204438', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'eggNOG', 'id': 'KOG2384', 'properties': [{'key': 'ToxonomicScope', 'value': 'Eukaryota'}]}, {'database': 'GeneTree', 'id': 'ENSGT00390000003292', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HOGENOM', 'id': 'CLU_048068_0_0_1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'InParanoid', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OMA', 'id': 'QGWDQEH', 'properties': [{'key': 'Fingerprint', 'value': '-'}]}, {'database': 'OrthoDB', 'id': '4735278at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhylomeDB', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'TreeFam', 'id': 'TF315384', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PathwayCommons', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SignaLink', 'id': 'O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID-ORCS', 'id': '7918', 'properties': [{'key': 'hits', 'value': '11 hits in 1141 CRISPR screens'}]}, {'database': 'ChiTaRS', 'id': 'GPANK1', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'GeneWiki', 'id': 'BAT4', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GenomeRNAi', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Pharos', 'id': 'O95872', 
'properties': [{'key': 'DevelopmentLevel', 'value': 'Tbio'}]}, {'database': 'PRO', 'id': 'PR:O95872', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Proteomes', 'id': 'UP000005640', 'properties': [{'key': 'Component', 'value': 'Chromosome 6'}]}, {'database': 'RNAct', 'id': 'O95872', 'properties': [{'key': 'moleculeType', 'value': 'protein'}]}, {'database': 'Bgee', 'id': 'ENSG00000204438', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Expressed in left testis and 100 other cell types or tissues'}]}, {'database': 'ExpressionAtlas', 'id': 'O95872', 'properties': [{'key': 'ExpressionPatterns', 'value': 'baseline and differential'}]}, {'database': 'GO', 'id': 'GO:0003676', 'properties': [{'key': 'GoTerm', 'value': 'F:nucleic acid binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:InterPro'}]}, {'database': 'Gene3D', 'id': '1.25.40.20', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat-containing domain'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR002110', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt'}]}, {'database': 'InterPro', 'id': 'IPR036770', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt-contain_sf'}]}, {'database': 'InterPro', 'id': 'IPR000467', 'properties': [{'key': 'EntryName', 'value': 'G_patch_dom'}]}, {'database': 'InterPro', 'id': 'IPR039146', 'properties': [{'key': 'EntryName', 'value': 'GPANK1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923', 'properties': [{'key': 'EntryName', 'value': 'BAT4 PROTEIN-RELATED'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923:SF1', 'properties': [{'key': 'EntryName', 'value': 'G PATCH DOMAIN AND ANKYRIN REPEAT-CONTAINING PROTEIN 1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF13637', 'properties': [{'key': 'EntryName', 'value': 'Ank_4'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF01585', 'properties': [{'key': 'EntryName', 'value': 
'G-patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SMART', 'id': 'SM00443', 'properties': [{'key': 'EntryName', 'value': 'G_patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SUPFAM', 'id': 'SSF48403', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50297', 'properties': [{'key': 'EntryName', 'value': 'ANK_REP_REGION'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50174', 'properties': [{'key': 'EntryName', 'value': 'G_PATCH'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': {'value': 'MSRPLLITFTPATDPSDLWKDGQQQPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRERPPRVATLSWREERRREEKDRAWERDLRTYMNLEF', 'length': 356, 'molWeight': 39314, 'crc64': 'AA57AA938A8C10E5', 'md5': 'B0A190D7F490ABC19AE961D96CE57026'}, 'extraAttributes': {'countByCommentType': {'INTERACTION': 86}, 'countByFeatureType': {'Chain': 1, 'Repeat': 2, 'Domain': 1, 'Region': 1, 'Compositional bias': 3, 'Cross-link': 1, 'Natural variant': 5}, 'uniParcId': 'UPI00001267A1'}}}, {'from': '7918', 'to': {'entryType': 'UniProtKB unreviewed (TrEMBL)', 'primaryAccession': 'A0A024RCU2', 'uniProtkbId': 'A0A024RCU2_HUMAN', 'entryAudit': {'firstPublicDate': '2014-07-09', 'lastAnnotationUpdateDate': '2025-04-02', 'lastSequenceUpdateDate': '2014-07-09', 'entryVersion': 75, 'sequenceVersion': 1}, 'annotationScore': 1.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'AQY76834.1'}], 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 
'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '4: Predicted', 'proteinDescription': {'submissionNames': [{'fullName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'AQY76834.1'}], 'value': 'GPANK1'}}]}, 'features': [{'type': 'Domain', 'location': {'start': {'value': 255, 'modifier': 'EXACT'}, 'end': {'value': 301, 'modifier': 'EXACT'}}, 'description': 'G-patch', 'evidences': [{'evidenceCode': 'ECO:0000259', 'source': 'PROSITE', 'id': 'PS50174'}]}, {'type': 'Region', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 98, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 54, 'modifier': 'EXACT'}, 'end': {'value': 63, 'modifier': 'EXACT'}}, 'description': 'Polar residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 67, 'modifier': 'EXACT'}, 'end': {'value': 76, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 77, 'modifier': 'EXACT'}, 'end': {'value': 88, 'modifier': 'EXACT'}}, 'description': 'Low complexity', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}], 'references': [{'referenceNumber': 1, 'citation': {'id': 'CI-33EK5LUFA44A1', 'citationType': 'submission', 'authors': ['Norman P.J.', 'Norberg S.J.', 'Nemat-Gorgani N.', 'Ronaghi M.', 'Parham P.'], 'title': 'CDS alleles of MHC region genes derived from homozygous individuals.', 'publicationDate': 'APR-2017', 'submissionDatabase': 'EMBL/GenBank/DDBJ databases'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'AQY76834.1'}]}], 
'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'KY500344', 'properties': [{'key': 'ProteinId', 'value': 'AQY76832.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'KY500345', 'properties': [{'key': 'ProteinId', 'value': 'AQY76833.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'KY500346', 'properties': [{'key': 'ProteinId', 'value': 'AQY76834.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'KY500351', 'properties': [{'key': 'ProteinId', 'value': 'AQY76838.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'RefSeq', 'id': 'NP_001186166.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199237.1'}]}, {'database': 'RefSeq', 'id': 'NP_001186167.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199238.1'}]}, {'database': 'RefSeq', 'id': 'NP_001186168.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199239.1'}]}, {'database': 'RefSeq', 'id': 'NP_001186169.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199240.1'}]}, {'database': 'RefSeq', 'id': 'NP_149417.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_033177.3'}]}, {'database': 'RefSeq', 'id': 'XP_005249460.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_005249403.3'}]}, {'database': 'RefSeq', 'id': 'XP_006715267.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_006715204.1'}]}, {'database': 'RefSeq', 'id': 'XP_011513211.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_011514909.1'}]}, {'database': 'RefSeq', 'id': 'XP_011513212.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_011514910.1'}]}, {'database': 'AlphaFoldDB', 'id': 'A0A024RCU2', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SMR', 'id': 
'A0A024RCU2', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Antibodypedia', 'id': '27457', 'properties': [{'key': 'antibodies', 'value': '71 antibodies from 22 providers'}]}, {'database': 'DNASU', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneID', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'KEGG', 'id': 'hsa:7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CTD', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'VEuPathDB', 'id': 'HostDB:ENSG00000204438', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HOGENOM', 'id': 'CLU_048068_0_0_1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OMA', 'id': 'QGWDQEH', 'properties': [{'key': 'Fingerprint', 'value': '-'}]}, {'database': 'OrthoDB', 'id': '4735278at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID-ORCS', 'id': '7918', 'properties': [{'key': 'hits', 'value': '11 hits in 1141 CRISPR screens'}]}, {'database': 'ChiTaRS', 'id': 'GPANK1', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'ExpressionAtlas', 'id': 'A0A024RCU2', 'properties': [{'key': 'ExpressionPatterns', 'value': 'baseline and differential'}]}, {'database': 'GO', 'id': 'GO:0003676', 'properties': [{'key': 'GoTerm', 'value': 'F:nucleic acid binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:InterPro'}]}, {'database': 'Gene3D', 'id': '1.25.40.20', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat-containing domain'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR002110', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt'}]}, {'database': 'InterPro', 'id': 'IPR036770', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt-contain_sf'}]}, {'database': 
'InterPro', 'id': 'IPR000467', 'properties': [{'key': 'EntryName', 'value': 'G_patch_dom'}]}, {'database': 'InterPro', 'id': 'IPR039146', 'properties': [{'key': 'EntryName', 'value': 'GPANK1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923', 'properties': [{'key': 'EntryName', 'value': 'BAT4 PROTEIN-RELATED'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923:SF1', 'properties': [{'key': 'EntryName', 'value': 'G PATCH DOMAIN AND ANKYRIN REPEAT-CONTAINING PROTEIN 1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF13637', 'properties': [{'key': 'EntryName', 'value': 'Ank_4'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF01585', 'properties': [{'key': 'EntryName', 'value': 'G-patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SMART', 'id': 'SM00443', 'properties': [{'key': 'EntryName', 'value': 'G_patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SUPFAM', 'id': 'SSF48403', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50174', 'properties': [{'key': 'EntryName', 'value': 'G_PATCH'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': {'value': 'MSRPLLITFTPATDPSDLWKDGQQQPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRERPPRVATLSWREERRREEKDRAWERDLRTYMNLEF', 'length': 356, 'molWeight': 39314, 'crc64': 'AA57AA938A8C10E5', 'md5': 'B0A190D7F490ABC19AE961D96CE57026'}, 'extraAttributes': {'countByFeatureType': {'Domain': 1, 'Region': 1, 'Compositional bias': 3}, 'uniParcId': 'UPI00001267A1'}}}, {'from': '7918', 'to': {'entryType': 'UniProtKB unreviewed (TrEMBL)', 'primaryAccession': 'A0A0G2JHL1', 'uniProtkbId': 'A0A0G2JHL1_HUMAN', 'entryAudit': 
{'firstPublicDate': '2015-07-22', 'lastAnnotationUpdateDate': '2025-04-02', 'lastSequenceUpdateDate': '2015-07-22', 'entryVersion': 55, 'sequenceVersion': 1}, 'annotationScore': 1.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000390293.1'}, {'evidenceCode': 'ECO:0000313', 'source': 'Proteomes', 'id': 'UP000005640'}], 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '1: Evidence at protein level', 'proteinDescription': {'submissionNames': [{'fullName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000390293.1'}], 'value': 'G-patch domain and ankyrin repeats 1'}}, {'fullName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'AQY76836.1'}], 'value': 'GPANK1'}}]}, 'genes': [{'geneName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000390293.1'}], 'value': 'GPANK1'}}], 'features': [{'type': 'Domain', 'location': {'start': {'value': 255, 'modifier': 'EXACT'}, 'end': {'value': 301, 'modifier': 'EXACT'}}, 'description': 'G-patch', 'evidences': [{'evidenceCode': 'ECO:0000259', 'source': 'PROSITE', 'id': 'PS50174'}]}, {'type': 'Region', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 98, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 54, 'modifier': 'EXACT'}, 'end': {'value': 63, 'modifier': 'EXACT'}}, 'description': 'Polar residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 67, 'modifier': 'EXACT'}, 'end': 
{'value': 76, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 77, 'modifier': 'EXACT'}, 'end': {'value': 88, 'modifier': 'EXACT'}}, 'description': 'Low complexity', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}], 'keywords': [{'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PeptideAtlas', 'id': 'A0A0G2JHL1'}, {'evidenceCode': 'ECO:0007829', 'source': 'ProteomicsDB', 'id': 'A0A0G2JHL1'}], 'id': 'KW-1267', 'category': 'Technical term', 'name': 'Proteomics identification'}, {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Proteomes', 'id': 'UP000005640'}], 'id': 'KW-1185', 'category': 'Technical term', 'name': 'Reference proteome'}], 'references': [{'referenceNumber': 1, 'citation': {'id': '18669648', 'citationType': 'journal article', 'authors': ['Dephoure N.', 'Zhou C.', 'Villen J.', 'Beausoleil S.A.', 'Bakalarski C.E.', 'Elledge S.J.', 'Gygi S.P.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '18669648'}, {'database': 'DOI', 'id': '10.1073/pnas.0805139105'}], 'title': 'A quantitative atlas of mitotic phosphorylation.', 'publicationDate': '2008', 'journal': 'Proc. Natl. Acad. Sci. U.S.A.', 'firstPage': '10762', 'lastPage': '10767', 'volume': '105'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PubMed', 'id': '18669648'}]}, {'referenceNumber': 2, 'citation': {'id': '28112733', 'citationType': 'journal article', 'authors': ['Hendriks I.A.', 'Lyon D.', 'Young C.', 'Jensen L.J.', 'Vertegaal A.C.', 'Nielsen M.L.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '28112733'}], 'title': 'Site-specific mapping of the human SUMO proteome reveals co-modification with phosphorylation.', 'publicationDate': '2017', 'journal': 'Nat. Struct. Mol. 
Biol.', 'firstPage': '325', 'lastPage': '336', 'volume': '24'}, 'referencePositions': ['IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PubMed', 'id': '28112733'}]}, {'referenceNumber': 3, 'citation': {'id': 'CI-33EK5LUFA44A1', 'citationType': 'submission', 'authors': ['Norman P.J.', 'Norberg S.J.', 'Nemat-Gorgani N.', 'Ronaghi M.', 'Parham P.'], 'title': 'CDS alleles of MHC region genes derived from homozygous individuals.', 'publicationDate': 'APR-2017', 'submissionDatabase': 'EMBL/GenBank/DDBJ databases'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'AQY76836.1'}]}, {'referenceNumber': 4, 'citation': {'id': 'CI-20IMBF1U3E5V5', 'citationType': 'submission', 'authoringGroup': ['Ensembl'], 'publicationDate': 'DEC-2024', 'submissionDatabase': 'UniProtKB'}, 'referencePositions': ['IDENTIFICATION'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000390293.1'}]}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'KY500349', 'properties': [{'key': 'ProteinId', 'value': 'AQY76836.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'RefSeq', 'id': 'XP_054186367.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330392.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186368.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330393.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186369.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330394.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186370.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330395.1'}]}, {'database': 'RefSeq', 'id': 'XP_054186371.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054330396.1'}]}, {'database': 'SMR', 'id': 'A0A0G2JHL1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 
'Ensembl', 'id': 'ENST00000419139.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000390293.1'}, {'key': 'GeneId', 'value': 'ENSG00000233210.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000419261.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000388586.1'}, {'key': 'GeneId', 'value': 'ENSG00000233210.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000423886.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000409855.1'}, {'key': 'GeneId', 'value': 'ENSG00000233210.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000426206.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000414105.2'}, {'key': 'GeneId', 'value': 'ENSG00000233210.6'}]}, {'database': 'Ensembl', 'id': 'ENST00000430994.6', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000388687.2'}, {'key': 'GeneId', 'value': 'ENSG00000233210.6'}]}, {'database': 'GeneID', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HGNC', 'id': 'HGNC:13920', 'properties': [{'key': 'GeneName', 'value': 'GPANK1'}]}, {'database': 'OrthoDB', 'id': '4735278at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'ChiTaRS', 'id': 'GPANK1', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'Proteomes', 'id': 'UP000005640', 'properties': [{'key': 'Component', 'value': 'Unplaced'}]}, {'database': 'GO', 'id': 'GO:0003676', 'properties': [{'key': 'GoTerm', 'value': 'F:nucleic acid binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:InterPro'}]}, {'database': 'Gene3D', 'id': '1.25.40.20', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat-containing domain'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR002110', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt'}]}, {'database': 'InterPro', 'id': 'IPR036770', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt-contain_sf'}]}, {'database': 'InterPro', 'id': 'IPR000467', 'properties': [{'key': 'EntryName', 'value': 'G_patch_dom'}]}, 
{'database': 'InterPro', 'id': 'IPR039146', 'properties': [{'key': 'EntryName', 'value': 'GPANK1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923', 'properties': [{'key': 'EntryName', 'value': 'BAT4 PROTEIN-RELATED'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923:SF1', 'properties': [{'key': 'EntryName', 'value': 'G PATCH DOMAIN AND ANKYRIN REPEAT-CONTAINING PROTEIN 1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF13637', 'properties': [{'key': 'EntryName', 'value': 'Ank_4'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF01585', 'properties': [{'key': 'EntryName', 'value': 'G-patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SMART', 'id': 'SM00443', 'properties': [{'key': 'EntryName', 'value': 'G_patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SUPFAM', 'id': 'SSF48403', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50174', 'properties': [{'key': 'EntryName', 'value': 'G_PATCH'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': {'value': 'MSRPLLITFTPATDPSDLWKDGQQQPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRRGWEPGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRERPPRVATLSWREERRREEKDRAWERDLRTYMNLEF', 'length': 356, 'molWeight': 39413, 'crc64': 'C127BED69EDD55F1', 'md5': '468ECB9618759253C435021E77600D95'}, 'extraAttributes': {'countByFeatureType': {'Domain': 1, 'Region': 1, 'Compositional bias': 3}, 'uniParcId': 'UPI000165DCFB'}}}, {'from': '7918', 'to': {'entryType': 'UniProtKB unreviewed (TrEMBL)', 'primaryAccession': 'A0A1U9X7R4', 'uniProtkbId': 'A0A1U9X7R4_HUMAN', 'entryAudit': {'firstPublicDate': '2017-06-07', 'lastAnnotationUpdateDate': '2025-04-02', 
'lastSequenceUpdateDate': '2017-06-07', 'entryVersion': 24, 'sequenceVersion': 1}, 'annotationScore': 1.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'AQY76837.1'}], 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '4: Predicted', 'proteinDescription': {'submissionNames': [{'fullName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'AQY76837.1'}], 'value': 'GPANK1'}}]}, 'features': [{'type': 'Domain', 'location': {'start': {'value': 255, 'modifier': 'EXACT'}, 'end': {'value': 301, 'modifier': 'EXACT'}}, 'description': 'G-patch', 'evidences': [{'evidenceCode': 'ECO:0000259', 'source': 'PROSITE', 'id': 'PS50174'}]}, {'type': 'Region', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 98, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 54, 'modifier': 'EXACT'}, 'end': {'value': 63, 'modifier': 'EXACT'}}, 'description': 'Polar residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 67, 'modifier': 'EXACT'}, 'end': {'value': 76, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 77, 'modifier': 'EXACT'}, 'end': {'value': 88, 'modifier': 'EXACT'}}, 'description': 'Low complexity', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}], 'references': [{'referenceNumber': 1, 'citation': {'id': 
'CI-33EK5LUFA44A1', 'citationType': 'submission', 'authors': ['Norman P.J.', 'Norberg S.J.', 'Nemat-Gorgani N.', 'Ronaghi M.', 'Parham P.'], 'title': 'CDS alleles of MHC region genes derived from homozygous individuals.', 'publicationDate': 'APR-2017', 'submissionDatabase': 'EMBL/GenBank/DDBJ databases'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'AQY76837.1'}]}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'KY500348', 'properties': [{'key': 'ProteinId', 'value': 'AQY76835.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'KY500350', 'properties': [{'key': 'ProteinId', 'value': 'AQY76837.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'RefSeq', 'id': 'XP_054185875.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054329900.1'}]}, {'database': 'RefSeq', 'id': 'XP_054185876.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054329901.1'}]}, {'database': 'RefSeq', 'id': 'XP_054185877.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054329902.1'}]}, {'database': 'RefSeq', 'id': 'XP_054185878.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054329903.1'}]}, {'database': 'RefSeq', 'id': 'XP_054185879.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054329904.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187143.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331168.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187144.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331169.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187145.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331170.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187146.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331171.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187147.1', 
'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331172.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187382.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331407.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187383.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331408.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187384.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331409.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187385.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331410.1'}]}, {'database': 'RefSeq', 'id': 'XP_054187386.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_054331411.1'}]}, {'database': 'AlphaFoldDB', 'id': 'A0A1U9X7R4', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SMR', 'id': 'A0A1U9X7R4', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PeptideAtlas', 'id': 'A0A1U9X7R4', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneID', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GO', 'id': 'GO:0003676', 'properties': [{'key': 'GoTerm', 'value': 'F:nucleic acid binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:InterPro'}]}, {'database': 'Gene3D', 'id': '1.25.40.20', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat-containing domain'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR002110', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt'}]}, {'database': 'InterPro', 'id': 'IPR036770', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt-contain_sf'}]}, {'database': 'InterPro', 'id': 'IPR000467', 'properties': [{'key': 'EntryName', 'value': 'G_patch_dom'}]}, {'database': 'InterPro', 'id': 'IPR039146', 'properties': [{'key': 'EntryName', 'value': 'GPANK1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923', 'properties': [{'key': 'EntryName', 'value': 'BAT4 PROTEIN-RELATED'}, {'key': 'MatchStatus', 'value': 
'1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923:SF1', 'properties': [{'key': 'EntryName', 'value': 'G PATCH DOMAIN AND ANKYRIN REPEAT-CONTAINING PROTEIN 1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF13637', 'properties': [{'key': 'EntryName', 'value': 'Ank_4'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF01585', 'properties': [{'key': 'EntryName', 'value': 'G-patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SMART', 'id': 'SM00443', 'properties': [{'key': 'EntryName', 'value': 'G_patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SUPFAM', 'id': 'SSF48403', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50174', 'properties': [{'key': 'EntryName', 'value': 'G_PATCH'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': {'value': 'MSRPLLITFTPATDPSDLWKDGQQQPQPEKPESTLDGAAALAFYEALIGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRERPPRVATLSWREERRREEKDRAWERDLRTYMNLEF', 'length': 356, 'molWeight': 39271, 'crc64': '1131A058AD6A1709', 'md5': 'ECF63F1772A883C03C79ACCF1840B0FB'}, 'extraAttributes': {'countByFeatureType': {'Domain': 1, 'Region': 1, 'Compositional bias': 3}, 'uniParcId': 'UPI00001AFEB8'}}}, {'from': '7918', 'to': {'entryType': 'UniProtKB unreviewed (TrEMBL)', 'primaryAccession': 'B2RA66', 'uniProtkbId': 'B2RA66_HUMAN', 'entryAudit': {'firstPublicDate': '2008-07-01', 'lastAnnotationUpdateDate': '2025-04-02', 'lastSequenceUpdateDate': '2008-07-01', 'entryVersion': 77, 'sequenceVersion': 1}, 'annotationScore': 1.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 
'BAG36763.1'}], 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '2: Evidence at transcript level', 'proteinDescription': {'submissionNames': [{'fullName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'BAG36763.1'}], 'value': 'cDNA, FLJ94721, highly similar to Homo sapiens HLA-B associated transcript 4 (BAT4), mRNA'}}]}, 'features': [{'type': 'Domain', 'location': {'start': {'value': 255, 'modifier': 'EXACT'}, 'end': {'value': 301, 'modifier': 'EXACT'}}, 'description': 'G-patch', 'evidences': [{'evidenceCode': 'ECO:0000259', 'source': 'PROSITE', 'id': 'PS50174'}]}, {'type': 'Region', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 34, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Region', 'location': {'start': {'value': 49, 'modifier': 'EXACT'}, 'end': {'value': 98, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 54, 'modifier': 'EXACT'}, 'end': {'value': 63, 'modifier': 'EXACT'}}, 'description': 'Polar residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 67, 'modifier': 'EXACT'}, 'end': {'value': 76, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 77, 'modifier': 'EXACT'}, 'end': {'value': 88, 'modifier': 'EXACT'}}, 'description': 'Low complexity', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}], 
'references': [{'referenceNumber': 1, 'citation': {'id': 'CI-6MEA4GF1I55C2', 'citationType': 'submission', 'authors': ['Wakamatsu A.', 'Yamamoto J.', 'Kimura K.', 'Kaida T.', 'Tsuchiya K.', 'Iida Y.', 'Takayama Y.', 'Murakawa K.', 'Kanehori K.', 'Andoh T.', 'Kagawa N.', 'Sato R.', 'Kawamura Y.', 'Tanaka S.', 'Kisu Y.', 'Sugano S.', 'Goshima N.', 'Nomura N.', 'Isogai T.'], 'title': 'NEDO functional analysis of protein and research application project.', 'publicationDate': 'JAN-2008', 'submissionDatabase': 'EMBL/GenBank/DDBJ databases'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE'], 'referenceComments': [{'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'BAG36763.1'}], 'value': 'Thymus', 'type': 'TISSUE'}], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'EMBL', 'id': 'BAG36763.1'}]}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'AK314056', 'properties': [{'key': 'ProteinId', 'value': 'BAG36763.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'RefSeq', 'id': 'NP_001186167.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199238.1'}]}, {'database': 'RefSeq', 'id': 'NP_001186168.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199239.1'}]}, {'database': 'RefSeq', 'id': 'NP_001186169.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_001199240.1'}]}, {'database': 'AlphaFoldDB', 'id': 'B2RA66', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PeptideAtlas', 'id': 'B2RA66', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DNASU', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneID', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CTD', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '7918', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OrthoDB', 'id': 
'4735278at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID-ORCS', 'id': '7918', 'properties': [{'key': 'hits', 'value': '11 hits in 1141 CRISPR screens'}]}, {'database': 'GO', 'id': 'GO:0003676', 'properties': [{'key': 'GoTerm', 'value': 'F:nucleic acid binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:InterPro'}]}, {'database': 'Gene3D', 'id': '1.25.40.20', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat-containing domain'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR002110', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt'}]}, {'database': 'InterPro', 'id': 'IPR036770', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin_rpt-contain_sf'}]}, {'database': 'InterPro', 'id': 'IPR000467', 'properties': [{'key': 'EntryName', 'value': 'G_patch_dom'}]}, {'database': 'InterPro', 'id': 'IPR039146', 'properties': [{'key': 'EntryName', 'value': 'GPANK1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923', 'properties': [{'key': 'EntryName', 'value': 'BAT4 PROTEIN-RELATED'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PANTHER', 'id': 'PTHR20923:SF1', 'properties': [{'key': 'EntryName', 'value': 'G PATCH DOMAIN AND ANKYRIN REPEAT-CONTAINING PROTEIN 1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF13637', 'properties': [{'key': 'EntryName', 'value': 'Ank_4'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF01585', 'properties': [{'key': 'EntryName', 'value': 'G-patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SMART', 'id': 'SM00443', 'properties': [{'key': 'EntryName', 'value': 'G_patch'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SUPFAM', 'id': 'SSF48403', 'properties': [{'key': 'EntryName', 'value': 'Ankyrin repeat'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50174', 'properties': [{'key': 'EntryName', 'value': 'G_PATCH'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': 
{'value': 'MSRPLLITFTPATDPSDLWKDGQQQLQPEKPESTLDGAAALAFYEALIGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPGVARMVRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRERPPRVATLSWREERRREEKDRAWERDLRTYMNLEF', 'length': 356, 'molWeight': 39215, 'crc64': '998E45ADE23B6DA6', 'md5': '109226E5E3C0918B36B99441BC299B92'}, 'extraAttributes': {'countByFeatureType': {'Domain': 1, 'Region': 2, 'Compositional bias': 3}, 'uniParcId': 'UPI0001750696'}}}, {'from': '9240', 'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)', 'primaryAccession': 'Q8ND90', 'secondaryAccessions': ['A8K4L5', 'O95144', 'Q8NG07'], 'uniProtkbId': 'PNMA1_HUMAN', 'entryAudit': {'firstPublicDate': '2003-10-24', 'lastAnnotationUpdateDate': '2025-04-09', 'lastSequenceUpdateDate': '2003-10-24', 'entryVersion': 165, 'sequenceVersion': 2}, 'annotationScore': 5.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '1: Evidence at protein level', 'proteinDescription': {'recommendedName': {'fullName': {'value': 'Paraneoplastic antigen Ma1'}}, 'alternativeNames': [{'fullName': {'value': '37 kDa neuronal protein'}}, {'fullName': {'value': 'Neuron- and testis-specific protein 1'}}]}, 'genes': [{'geneName': {'value': 'PNMA1'}, 'synonyms': [{'value': 'MA1'}]}], 'comments': [{'commentType': 'INTERACTION', 'interactions': [{'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NX04', 'geneName': 'AIRIM', 'intActId': 'EBI-8643161'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 
'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96IX9', 'geneName': 'ANKRD36BP1', 'intActId': 'EBI-744859'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q7Z2E3', 'geneName': 'APTX', 'intActId': 'EBI-847814'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q7Z2E3-7', 'geneName': 'APTX', 'intActId': 'EBI-12298187'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q15052', 'geneName': 'ARHGEF6', 'intActId': 'EBI-1642523'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q03989', 'geneName': 'ARID5A', 'intActId': 'EBI-948603'}, 'numberOfExperiments': 5, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O95671', 'geneName': 'ASMTL', 'intActId': 'EBI-743231'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P48047', 'geneName': 'ATP5PO', 'intActId': 'EBI-355815'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O75348', 'geneName': 'ATP6V1G1', 'intActId': 'EBI-711802'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BXY8', 'geneName': 'BEX2', 'intActId': 'EBI-745073'}, 'numberOfExperiments': 3, 
'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q13895', 'geneName': 'BYSL', 'intActId': 'EBI-358049'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q0VAL7', 'geneName': 'C21orf58', 'intActId': 'EBI-10226774'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H7E9', 'geneName': 'C8orf33', 'intActId': 'EBI-715389'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y2V2', 'geneName': 'CARHSP1', 'intActId': 'EBI-718719'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9HC52', 'geneName': 'CBX8', 'intActId': 'EBI-712912'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8IYE0-2', 'geneName': 'CCDC146', 'intActId': 'EBI-10247802'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'E9PSE9', 'geneName': 'CCDC198', 'intActId': 'EBI-11748295'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NVL8', 'geneName': 'CCDC198', 'intActId': 'EBI-10238351'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': 
{'uniProtKBAccession': 'P51959', 'geneName': 'CCNG1', 'intActId': 'EBI-3905829'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q07002', 'geneName': 'CDK18', 'intActId': 'EBI-746238'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96M91', 'geneName': 'CFAP53', 'intActId': 'EBI-742422'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UKJ5', 'geneName': 'CHIC2', 'intActId': 'EBI-741528'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8NE01', 'geneName': 'CNNM3', 'intActId': 'EBI-741032'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8NE01-3', 'geneName': 'CNNM3', 'intActId': 'EBI-10269984'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P10606', 'geneName': 'COX5B', 'intActId': 'EBI-1053725'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9GZU7', 'geneName': 'CTDSP1', 'intActId': 'EBI-751587'}, 'numberOfExperiments': 2, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UI47-2', 'geneName': 'CTNNA3', 'intActId': 'EBI-11962928'}, 'numberOfExperiments': 3, 'organismDiffer': False}, 
{'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UER7', 'geneName': 'DAXX', 'intActId': 'EBI-77321'}, 'numberOfExperiments': 5, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UJW0', 'geneName': 'DCTN4', 'intActId': 'EBI-2134033'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q08426', 'geneName': 'EHHADH', 'intActId': 'EBI-2339219'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H0I2', 'geneName': 'ENKD1', 'intActId': 'EBI-744099'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O95990-3', 'geneName': 'FAM107A', 'intActId': 'EBI-10192902'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BQ89', 'geneName': 'FAM110A', 'intActId': 'EBI-1752811'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q3B820', 'geneName': 'FAM161A', 'intActId': 'EBI-719941'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96MY7', 'geneName': 'FAM161B', 'intActId': 'EBI-7225287'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8TES7-6', 'geneName': 
'FBF1', 'intActId': 'EBI-10244131'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96D16', 'geneName': 'FBXL18', 'intActId': 'EBI-744419'}, 'numberOfExperiments': 5, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96JP0', 'geneName': 'FEM1C', 'intActId': 'EBI-2515330'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NU39', 'geneName': 'FOXD4L1', 'intActId': 'EBI-11320806'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q6VB84', 'geneName': 'FOXD4L3', 'intActId': 'EBI-11961494'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8TAE8', 'geneName': 'GADD45GIP1', 'intActId': 'EBI-372506'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P55040', 'geneName': 'GEM', 'intActId': 'EBI-744104'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O95872', 'geneName': 'GPANK1', 'intActId': 'EBI-751540'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O00155', 'geneName': 'GPR25', 'intActId': 'EBI-10178951'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 
'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'A4D1E9', 'geneName': 'GTPBP10', 'intActId': 'EBI-5453796'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9GZV7', 'geneName': 'HAPLN2', 'intActId': 'EBI-11956675'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P08631-2', 'geneName': 'HCK', 'intActId': 'EBI-9834454'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P56524', 'geneName': 'HDAC4', 'intActId': 'EBI-308629'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'A0A0S2Z4Q4', 'geneName': 'HGS', 'intActId': 'EBI-16429135'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P17482', 'geneName': 'HOXB9', 'intActId': 'EBI-745290'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q86VF2-5', 'geneName': 'IGFN1', 'intActId': 'EBI-11955401'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9C086', 'geneName': 'INO80B', 'intActId': 'EBI-715611'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q7Z3B3', 'geneName': 'KANSL1', 'intActId': 'EBI-740244'}, 
'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q14657', 'geneName': 'LAGE3', 'intActId': 'EBI-1052105'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96GY3', 'geneName': 'LIN37', 'intActId': 'EBI-748884'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8TAP4-4', 'geneName': 'LMO3', 'intActId': 'EBI-11742507'}, 'numberOfExperiments': 5, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8TBB1', 'geneName': 'LNX1', 'intActId': 'EBI-739832'}, 'numberOfExperiments': 7, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y586', 'geneName': 'MAB21L2', 'intActId': 'EBI-6659161'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96GV9', 'geneName': 'MACIR', 'intActId': 'EBI-2350695'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96L34', 'geneName': 'MARK4', 'intActId': 'EBI-302319'}, 'numberOfExperiments': 2, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P33993', 'geneName': 'MCM7', 'intActId': 'EBI-355924'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': 
{'uniProtKBAccession': 'Q96G25-2', 'geneName': 'MED8', 'intActId': 'EBI-10286267'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H7H0', 'geneName': 'METTL17', 'intActId': 'EBI-749353'}, 'numberOfExperiments': 7, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H7H0-2', 'geneName': 'METTL17', 'intActId': 'EBI-11098807'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8IVT4', 'geneName': 'MGC50722', 'intActId': 'EBI-14086479'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q7Z7H8', 'geneName': 'MRPL10', 'intActId': 'EBI-723524'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y3B7', 'geneName': 'MRPL11', 'intActId': 'EBI-5453723'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q16540', 'geneName': 'MRPL23', 'intActId': 'EBI-1046141'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96JP2', 'geneName': 'MYO15B', 'intActId': 'EBI-7950783'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NP98', 'geneName': 'MYOZ1', 'intActId': 'EBI-744402'}, 'numberOfExperiments': 3, 'organismDiffer': False}, 
{'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P48146', 'geneName': 'NPBWR2', 'intActId': 'EBI-10210114'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q6X4W1-2', 'geneName': 'NSMF', 'intActId': 'EBI-12028784'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96HA8', 'geneName': 'NTAQ1', 'intActId': 'EBI-741158'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BRJ7', 'geneName': 'NUDT16L1', 'intActId': 'EBI-2949792'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8TEW0', 'geneName': 'PARD3', 'intActId': 'EBI-81968'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NPB6', 'geneName': 'PARD6A', 'intActId': 'EBI-81876'}, 'numberOfExperiments': 2, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BYG5', 'geneName': 'PARD6B', 'intActId': 'EBI-295391'}, 'numberOfExperiments': 2, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'J3QSH9', 'geneName': 'PER1', 'intActId': 'EBI-10178671'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q13526', 'geneName': 
'PIN1', 'intActId': 'EBI-714158'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y237', 'geneName': 'PIN4', 'intActId': 'EBI-714599'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q16512', 'geneName': 'PKN1', 'intActId': 'EBI-602382'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q6P5Z2', 'geneName': 'PKN3', 'intActId': 'EBI-1384335'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96PV4', 'geneName': 'PNMA5', 'intActId': 'EBI-10171633'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P0CW24', 'geneName': 'PNMA6A', 'intActId': 'EBI-721270'}, 'numberOfExperiments': 10, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8WUT1', 'geneName': 'POLDIP3', 'intActId': 'EBI-10276663'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UGP5-2', 'geneName': 'POLL', 'intActId': 'EBI-10320765'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q6PIY2', 'geneName': 'POLM', 'intActId': 'EBI-10253863'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 
'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P41743', 'geneName': 'PRKCI', 'intActId': 'EBI-286199'}, 'numberOfExperiments': 2, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q99633', 'geneName': 'PRPF18', 'intActId': 'EBI-2798416'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8WWY3', 'geneName': 'PRPF31', 'intActId': 'EBI-1567797'}, 'numberOfExperiments': 11, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P25786', 'geneName': 'PSMA1', 'intActId': 'EBI-359352'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O75771', 'geneName': 'RAD51D', 'intActId': 'EBI-1055693'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96IZ5', 'geneName': 'RBM41', 'intActId': 'EBI-740773'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BUL9', 'geneName': 'RPP25', 'intActId': 'EBI-366570'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8N5L8', 'geneName': 'RPP25L', 'intActId': 'EBI-10189722'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H1X1', 'geneName': 'RSPH9', 'intActId': 'EBI-10305303'}, 
'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q7L4I2-2', 'geneName': 'RSRC2', 'intActId': 'EBI-10256202'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q14D33', 'geneName': 'RTP5', 'intActId': 'EBI-10217913'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BVN2', 'geneName': 'RUSC1', 'intActId': 'EBI-6257312'}, 'numberOfExperiments': 8, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P57086', 'geneName': 'SCAND1', 'intActId': 'EBI-745846'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BWG6', 'geneName': 'SCNM1', 'intActId': 'EBI-748391'}, 'numberOfExperiments': 8, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O00560', 'geneName': 'SDCBP', 'intActId': 'EBI-727004'}, 'numberOfExperiments': 8, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9C0C4', 'geneName': 'SEMA4C', 'intActId': 'EBI-10303490'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H788', 'geneName': 'SH2D4A', 'intActId': 'EBI-747035'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': 
{'uniProtKBAccession': 'Q92529', 'geneName': 'SHC3', 'intActId': 'EBI-79084'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'A0PJX4', 'geneName': 'SHISA3', 'intActId': 'EBI-10171518'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q6ZT89', 'geneName': 'SLC25A48', 'intActId': 'EBI-10255185'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9GZT3', 'geneName': 'SLIRP', 'intActId': 'EBI-1050793'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9BZL3', 'geneName': 'SMIM3', 'intActId': 'EBI-741850'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P14678-2', 'geneName': 'SNRPB', 'intActId': 'EBI-372475'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P08579', 'geneName': 'SNRPB2', 'intActId': 'EBI-1053651'}, 'numberOfExperiments': 5, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H0A9', 'geneName': 'SPATC1L', 'intActId': 'EBI-372911'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H0A9-2', 'geneName': 'SPATC1L', 'intActId': 'EBI-11995806'}, 'numberOfExperiments': 3, 'organismDiffer': False}, 
{'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UQ90', 'geneName': 'SPG7', 'intActId': 'EBI-717201'}, 'numberOfExperiments': 7, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96FJ0', 'geneName': 'STAMBPL1', 'intActId': 'EBI-745021'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H5I1', 'geneName': 'SUV39H2', 'intActId': 'EBI-723127'}, 'numberOfExperiments': 5, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q5VWN6', 'geneName': 'TASOR2', 'intActId': 'EBI-745958'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q15560', 'geneName': 'TCEA2', 'intActId': 'EBI-710310'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8N8B7', 'geneName': 'TCEANC', 'intActId': 'EBI-954696'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8N8B7-2', 'geneName': 'TCEANC', 'intActId': 'EBI-11955057'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O95988', 'geneName': 'TCL1B', 'intActId': 'EBI-727338'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'A0A0S2Z4F2', 'geneName': 
'TEAD4', 'intActId': 'EBI-16429215'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'D3DUQ6', 'geneName': 'TEAD4', 'intActId': 'EBI-10176734'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q15561', 'geneName': 'TEAD4', 'intActId': 'EBI-747736'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q0P5Q0', 'geneName': 'TMSB4X', 'intActId': 'EBI-10226570'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P19237', 'geneName': 'TNNI1', 'intActId': 'EBI-746692'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8IWZ5', 'geneName': 'TRIM42', 'intActId': 'EBI-5235829'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q86UV6-2', 'geneName': 'TRIM74', 'intActId': 'EBI-10259086'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9NRE2', 'geneName': 'TSHZ2', 'intActId': 'EBI-10687282'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UJ04', 'geneName': 'TSPYL4', 'intActId': 'EBI-308511'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 
'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q5W5X9-3', 'geneName': 'TTC23', 'intActId': 'EBI-9090990'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8N5M4', 'geneName': 'TTC9C', 'intActId': 'EBI-2851213'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O14530', 'geneName': 'TXNDC9', 'intActId': 'EBI-707554'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UMX0', 'geneName': 'UBQLN1', 'intActId': 'EBI-741480'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UMX0-2', 'geneName': 'UBQLN1', 'intActId': 'EBI-10173939'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96HZ7', 'geneName': 'URB1-AS1', 'intActId': 'EBI-10288943'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'O75604', 'geneName': 'USP2', 'intActId': 'EBI-743272'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P61758', 'geneName': 'VBP1', 'intActId': 'EBI-357430'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y2B5', 'geneName': 'VPS9D1', 'intActId': 'EBI-9031083'}, 
'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q64LD2', 'geneName': 'WDR25', 'intActId': 'EBI-744560'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q06250', 'geneName': 'WT1-AS', 'intActId': 'EBI-10223946'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q53FD0', 'geneName': 'ZC2HC1C', 'intActId': 'EBI-740767'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q6PEW1', 'geneName': 'ZCCHC12', 'intActId': 'EBI-748373'}, 'numberOfExperiments': 12, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y260', 'geneName': 'ZFAB', 'intActId': 'EBI-750052'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q68DK2-5', 'geneName': 'ZFYVE26', 'intActId': 'EBI-8656416'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q96DA0', 'geneName': 'ZG16B', 'intActId': 'EBI-953824'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UQR1-2', 'geneName': 'ZNF148', 'intActId': 'EBI-11742222'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 
'interactantTwo': {'uniProtKBAccession': 'P13682', 'geneName': 'ZNF35', 'intActId': 'EBI-11041653'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q7Z4V0', 'geneName': 'ZNF438', 'intActId': 'EBI-11962468'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q8TBZ8', 'geneName': 'ZNF564', 'intActId': 'EBI-10273713'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q9P0T4', 'geneName': 'ZNF581', 'intActId': 'EBI-745520'}, 'numberOfExperiments': 8, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'A0A0S2Z5X4', 'geneName': 'ZNF688', 'intActId': 'EBI-16429014'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'Q6ZN96', 'intActId': 'EBI-10255097'}, 'numberOfExperiments': 6, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q8ND90', 'intActId': 'EBI-302345'}, 'interactantTwo': {'uniProtKBAccession': 'P0DTD1', 'geneName': 'rep', 'chainId': 'PRO_0000449633', 'intActId': 'EBI-25492395'}, 'numberOfExperiments': 3, 'organismDiffer': True}]}, {'commentType': 'SUBCELLULAR LOCATION', 'note': {'texts': [{'value': 'In tumor cells, it is cytoplasmic'}]}, 'subcellularLocations': [{'location': {'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '10050892'}], 'value': 'Nucleus, nucleolus', 'id': 'SL-0188'}}]}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '10050892'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': 
'19366867'}], 'value': 'Testis- and brain-specific. In some cancer patients, specifically expressed by paraneoplastic tumor cells'}], 'commentType': 'TISSUE SPECIFICITY'}, {'texts': [{'value': 'Antibodies against PNMA1 are present in sera from patients suffering of paraneoplastic neurological disorders'}], 'commentType': 'MISCELLANEOUS'}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000305'}], 'value': 'Belongs to the PNMA family'}], 'commentType': 'SIMILARITY'}], 'features': [{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 353, 'modifier': 'EXACT'}}, 'description': 'Paraneoplastic antigen Ma1', 'featureId': 'PRO_0000155199'}, {'type': 'Natural variant', 'location': {'start': {'value': 54, 'modifier': 'EXACT'}, 'end': {'value': 54, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs35129712', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs35129712'}], 'featureId': 'VAR_053595', 'alternativeSequence': {'originalSequence': 'M', 'alternativeSequences': ['V']}}, {'type': 'Natural variant', 'location': {'start': {'value': 215, 'modifier': 'EXACT'}, 'end': {'value': 215, 'modifier': 'EXACT'}}, 'description': 'in dbSNP:rs34413931', 'featureCrossReferences': [{'database': 'dbSNP', 'id': 'rs34413931'}], 'featureId': 'VAR_053596', 'alternativeSequence': {'originalSequence': 'R', 'alternativeSequences': ['P']}}], 'keywords': [{'id': 'KW-0539', 'category': 'Cellular component', 'name': 'Nucleus'}, {'id': 'KW-1267', 'category': 'Technical term', 'name': 'Proteomics identification'}, {'id': 'KW-1185', 'category': 'Technical term', 'name': 'Reference proteome'}, {'id': 'KW-0825', 'category': 'Molecular function', 'name': 'Tumor antigen'}], 'references': [{'referenceNumber': 1, 'citation': {'id': '10050892', 'citationType': 'journal article', 'authors': ['Dalmau J.', 'Gultekin S.H.', 'Voltz R.', 'Hoard R.', 'DesChamps T.', 'Balmaceda C.', 'Batchelor T.', 'Gerstner E.', 'Eichen J.', 'Frennier J.', 'Posner J.B.', 'Rosenfeld 
M.R.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '10050892'}, {'database': 'DOI', 'id': '10.1093/brain/122.1.27'}], 'title': 'Ma1, a novel neuron- and testis-specific protein, is recognized by the serum of patients with paraneoplastic neurological disorders.', 'publicationDate': '1999', 'journal': 'Brain', 'firstPage': '27', 'lastPage': '39', 'volume': '122'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [MRNA]', 'SUBCELLULAR LOCATION', 'TISSUE SPECIFICITY'], 'referenceComments': [{'value': 'Cerebellum', 'type': 'TISSUE'}]}, {'referenceNumber': 2, 'citation': {'id': '16214224', 'citationType': 'journal article', 'authors': ['Schueller M.', 'Jenne D.E.', 'Voltz R.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '16214224'}, {'database': 'DOI', 'id': '10.1016/j.jneuroim.2005.08.019'}], 'title': 'The human PNMA family: novel neuronal proteins implicated in paraneoplastic neurological disease.', 'publicationDate': '2005', 'journal': 'J. Neuroimmunol.', 'firstPage': '172', 'lastPage': '176', 'volume': '169'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [MRNA]']}, {'referenceNumber': 3, 'citation': {'id': '14702039', 'citationType': 'journal article', 'authors': ['Ota T.', 'Suzuki Y.', 'Nishikawa T.', 'Otsuki T.', 'Sugiyama T.', 'Irie R.', 'Wakamatsu A.', 'Hayashi K.', 'Sato H.', 'Nagai K.', 'Kimura K.', 'Makita H.', 'Sekine M.', 'Obayashi M.', 'Nishi T.', 'Shibahara T.', 'Tanaka T.', 'Ishii S.', 'Yamamoto J.', 'Saito K.', 'Kawai Y.', 'Isono Y.', 'Nakamura Y.', 'Nagahari K.', 'Murakami K.', 'Yasuda T.', 'Iwayanagi T.', 'Wagatsuma M.', 'Shiratori A.', 'Sudo H.', 'Hosoiri T.', 'Kaku Y.', 'Kodaira H.', 'Kondo H.', 'Sugawara M.', 'Takahashi M.', 'Kanda K.', 'Yokoi T.', 'Furuya T.', 'Kikkawa E.', 'Omura Y.', 'Abe K.', 'Kamihara K.', 'Katsuta N.', 'Sato K.', 'Tanikawa M.', 'Yamazaki M.', 'Ninomiya K.', 'Ishibashi T.', 'Yamashita H.', 'Murakawa K.', 'Fujimori K.', 'Tanai H.', 'Kimata M.', 'Watanabe M.', 'Hiraoka S.', 'Chiba Y.', 'Ishida S.', 'Ono Y.', 
'Takiguchi S.', 'Watanabe S.', 'Yosida M.', 'Hotuta T.', 'Kusano J.', 'Kanehori K.', 'Takahashi-Fujii A.', 'Hara H.', 'Tanase T.-O.', 'Nomura Y.', 'Togiya S.', 'Komai F.', 'Hara R.', 'Takeuchi K.', 'Arita M.', 'Imose N.', 'Musashino K.', 'Yuuki H.', 'Oshima A.', 'Sasaki N.', 'Aotsuka S.', 'Yoshikawa Y.', 'Matsunawa H.', 'Ichihara T.', 'Shiohata N.', 'Sano S.', 'Moriya S.', 'Momiyama H.', 'Satoh N.', 'Takami S.', 'Terashima Y.', 'Suzuki O.', 'Nakagawa S.', 'Senoh A.', 'Mizoguchi H.', 'Goto Y.', 'Shimizu F.', 'Wakebe H.', 'Hishigaki H.', 'Watanabe T.', 'Sugiyama A.', 'Takemoto M.', 'Kawakami B.', 'Yamazaki M.', 'Watanabe K.', 'Kumagai A.', 'Itakura S.', 'Fukuzumi Y.', 'Fujimori Y.', 'Komiyama M.', 'Tashiro H.', 'Tanigami A.', 'Fujiwara T.', 'Ono T.', 'Yamada K.', 'Fujii Y.', 'Ozaki K.', 'Hirao M.', 'Ohmori Y.', 'Kawabata A.', 'Hikiji T.', 'Kobatake N.', 'Inagaki H.', 'Ikema Y.', 'Okamoto S.', 'Okitani R.', 'Kawakami T.', 'Noguchi S.', 'Itoh T.', 'Shigeta K.', 'Senba T.', 'Matsumura K.', 'Nakajima Y.', 'Mizuno T.', 'Morinaga M.', 'Sasaki M.', 'Togashi T.', 'Oyama M.', 'Hata H.', 'Watanabe M.', 'Komatsu T.', 'Mizushima-Sugano J.', 'Satoh T.', 'Shirai Y.', 'Takahashi Y.', 'Nakagawa K.', 'Okumura K.', 'Nagase T.', 'Nomura N.', 'Kikuchi H.', 'Masuho Y.', 'Yamashita R.', 'Nakai K.', 'Yada T.', 'Nakamura Y.', 'Ohara O.', 'Isogai T.', 'Sugano S.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '14702039'}, {'database': 'DOI', 'id': '10.1038/ng1285'}], 'title': 'Complete sequencing and characterization of 21,243 full-length human cDNAs.', 'publicationDate': '2004', 'journal': 'Nat. 
Genet.', 'firstPage': '40', 'lastPage': '45', 'volume': '36'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA]']}, {'referenceNumber': 4, 'citation': {'id': 'CI-5GBDQ6B103N1E', 'citationType': 'submission', 'authors': ['Mural R.J.', 'Istrail S.', 'Sutton G.G.', 'Florea L.', 'Halpern A.L.', 'Mobarry C.M.', 'Lippert R.', 'Walenz B.', 'Shatkay H.', 'Dew I.', 'Miller J.R.', 'Flanigan M.J.', 'Edwards N.J.', 'Bolanos R.', 'Fasulo D.', 'Halldorsson B.V.', 'Hannenhalli S.', 'Turner R.', 'Yooseph S.', 'Lu F.', 'Nusskern D.R.', 'Shue B.C.', 'Zheng X.H.', 'Zhong F.', 'Delcher A.L.', 'Huson D.H.', 'Kravitz S.A.', 'Mouchard L.', 'Reinert K.', 'Remington K.A.', 'Clark A.G.', 'Waterman M.S.', 'Eichler E.E.', 'Adams M.D.', 'Hunkapiller M.W.', 'Myers E.W.', 'Venter J.C.'], 'publicationDate': 'JUL-2005', 'submissionDatabase': 'EMBL/GenBank/DDBJ databases'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]']}, {'referenceNumber': 5, 'citation': {'id': '15489334', 'citationType': 'journal article', 'authoringGroup': ['The MGC Project Team'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15489334'}, {'database': 'DOI', 'id': '10.1101/gr.2596504'}], 'title': 'The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).', 'publicationDate': '2004', 'journal': 'Genome Res.', 'firstPage': '2121', 'lastPage': '2127', 'volume': '14'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA]'], 'referenceComments': [{'value': 'Prostate', 'type': 'TISSUE'}]}, {'referenceNumber': 6, 'citation': {'id': '17974005', 'citationType': 'journal article', 'authors': ['Bechtel S.', 'Rosenfelder H.', 'Duda A.', 'Schmidt C.P.', 'Ernst U.', 'Wellenreuther R.', 'Mehrle A.', 'Schuster C.', 'Bahr A.', 'Bloecker H.', 'Heubner D.', 'Hoerlein A.', 'Michel G.', 'Wedler H.', 'Koehrer K.', 'Ottenwaelder B.', 'Poustka A.', 'Wiemann S.', 'Schupp I.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': 
'17974005'}, {'database': 'DOI', 'id': '10.1186/1471-2164-8-399'}], 'title': 'The full-ORF clone resource of the German cDNA consortium.', 'publicationDate': '2007', 'journal': 'BMC Genomics', 'firstPage': '399', 'lastPage': '399', 'volume': '8'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA] OF 162-353'], 'referenceComments': [{'value': 'Testis', 'type': 'TISSUE'}]}, {'referenceNumber': 7, 'citation': {'id': '19366867', 'citationType': 'journal article', 'authors': ['Takaji M.', 'Komatsu Y.', 'Watakabe A.', 'Hashikawa T.', 'Yamamori T.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '19366867'}, {'database': 'DOI', 'id': '10.1093/cercor/bhp062'}], 'title': 'Paraneoplastic antigen-like 5 gene (PNMA5) is preferentially expressed in the association areas in a primate specific manner.', 'publicationDate': '2009', 'journal': 'Cereb. Cortex', 'firstPage': '2865', 'lastPage': '2879', 'volume': '19'}, 'referencePositions': ['TISSUE SPECIFICITY']}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'AF037364', 'properties': [{'key': 'ProteinId', 'value': 'AAD13810.3'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'AF320308', 'properties': [{'key': 'ProteinId', 'value': 'AAN05100.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'AK290980', 'properties': [{'key': 'ProteinId', 'value': 'BAF83669.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'CH471061', 'properties': [{'key': 'ProteinId', 'value': 'EAW81126.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'BC039577', 'properties': [{'key': 'ProteinId', 'value': 'AAH39577.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'AL834327', 'properties': [{'key': 'ProteinId', 'value': 'CAD38995.1'}, 
{'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'CCDS', 'id': 'CCDS9818.1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'RefSeq', 'id': 'NP_006020.4', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_006029.5'}]}, {'database': 'AlphaFoldDB', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SMR', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID', 'id': '114667', 'properties': [{'key': 'Interactions', 'value': '188'}]}, {'database': 'IntAct', 'id': 'Q8ND90', 'properties': [{'key': 'Interactions', 'value': '194'}]}, {'database': 'MINT', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'STRING', 'id': '9606.ENSP00000318914', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'iPTMnet', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhosphoSitePlus', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioMuta', 'id': 'PNMA1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DMDM', 'id': '37999715', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'jPOST', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MassIVE', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PaxDb', 'id': '9606-ENSP00000318914', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PeptideAtlas', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'ProteomicsDB', 'id': '72992', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Pumba', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Antibodypedia', 'id': '26', 'properties': [{'key': 'antibodies', 'value': '231 antibodies from 26 providers'}]}, {'database': 'DNASU', 
'id': '9240', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Ensembl', 'id': 'ENST00000316836.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000318914.3'}, {'key': 'GeneId', 'value': 'ENSG00000176903.5'}]}, {'database': 'GeneID', 'id': '9240', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'KEGG', 'id': 'hsa:9240', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MANE-Select', 'id': 'ENST00000316836.5', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000318914.3'}, {'key': 'RefSeqNucleotideId', 'value': 'NM_006029.5'}, {'key': 'RefSeqProteinId', 'value': 'NP_006020.4'}]}, {'database': 'UCSC', 'id': 'uc001xor.2', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'AGR', 'id': 'HGNC:9158', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CTD', 'id': '9240', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '9240', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneCards', 'id': 'PNMA1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HGNC', 'id': 'HGNC:9158', 'properties': [{'key': 'GeneName', 'value': 'PNMA1'}]}, {'database': 'HPA', 'id': 'ENSG00000176903', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Low tissue specificity'}]}, {'database': 'MIM', 'id': '604010', 'properties': [{'key': 'Type', 'value': 'gene'}]}, {'database': 'neXtProt', 'id': 'NX_Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OpenTargets', 'id': 'ENSG00000176903', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PharmGKB', 'id': 'PA33481', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'VEuPathDB', 'id': 'HostDB:ENSG00000176903', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'eggNOG', 'id': 'ENOG502SPHT', 'properties': [{'key': 'ToxonomicScope', 'value': 'Eukaryota'}]}, {'database': 'GeneTree', 'id': 
'ENSGT01030000234522', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HOGENOM', 'id': 'CLU_014694_0_0_1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'InParanoid', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OMA', 'id': 'EHTNEVM', 'properties': [{'key': 'Fingerprint', 'value': '-'}]}, {'database': 'OrthoDB', 'id': '115435at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhylomeDB', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'TreeFam', 'id': 'TF335054', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PathwayCommons', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SignaLink', 'id': 'Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID-ORCS', 'id': '9240', 'properties': [{'key': 'hits', 'value': '18 hits in 1148 CRISPR screens'}]}, {'database': 'CD-CODE', 'id': '91857CE7', 'properties': [{'key': 'EntryName', 'value': 'Nucleolus'}]}, {'database': 'ChiTaRS', 'id': 'PNMA1', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'GenomeRNAi', 'id': '9240', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Pharos', 'id': 'Q8ND90', 'properties': [{'key': 'DevelopmentLevel', 'value': 'Tbio'}]}, {'database': 'PRO', 'id': 'PR:Q8ND90', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Proteomes', 'id': 'UP000005640', 'properties': [{'key': 'Component', 'value': 'Chromosome 14'}]}, {'database': 'RNAct', 'id': 'Q8ND90', 'properties': [{'key': 'moleculeType', 'value': 'protein'}]}, {'database': 'Bgee', 'id': 'ENSG00000176903', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Expressed in cortical plate and 100 other cell types or tissues'}]}, {'database': 'GO', 'id': 'GO:0005737', 'properties': [{'key': 'GoTerm', 'value': 'C:cytoplasm'}, {'key': 'GoEvidenceType', 'value': 
'IDA:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000314', 'source': 'PubMed', 'id': '10050892'}]}, {'database': 'GO', 'id': 'GO:0005730', 'properties': [{'key': 'GoTerm', 'value': 'C:nucleolus'}, {'key': 'GoEvidenceType', 'value': 'IDA:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000314', 'source': 'PubMed', 'id': '10050892'}]}, {'database': 'GO', 'id': 'GO:0002437', 'properties': [{'key': 'GoTerm', 'value': 'P:inflammatory response to antigenic stimulus'}, {'key': 'GoEvidenceType', 'value': 'ISS:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000250', 'source': 'PubMed', 'id': '15201193'}]}, {'database': 'GO', 'id': 'GO:0043065', 'properties': [{'key': 'GoTerm', 'value': 'P:positive regulation of apoptotic process'}, {'key': 'GoEvidenceType', 'value': 'ISS:UniProtKB'}]}, {'database': 'InterPro', 'id': 'IPR026523', 'properties': [{'key': 'EntryName', 'value': 'PNMA'}]}, {'database': 'InterPro', 'id': 'IPR048270', 'properties': [{'key': 'EntryName', 'value': 'PNMA_C'}]}, {'database': 'InterPro', 'id': 'IPR048271', 'properties': [{'key': 'EntryName', 'value': 'PNMA_N'}]}, {'database': 'PANTHER', 'id': 'PTHR23095', 'properties': [{'key': 'EntryName', 'value': 'PARANEOPLASTIC ANTIGEN'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PANTHER', 'id': 'PTHR23095:SF17', 'properties': [{'key': 'EntryName', 'value': 'PARANEOPLASTIC ANTIGEN MA1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF14893', 'properties': [{'key': 'EntryName', 'value': 'PNMA'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF20846', 'properties': [{'key': 'EntryName', 'value': 'PNMA_N'}, {'key': 'MatchStatus', 'value': '1'}]}], 'sequence': {'value': 
'MAMTLLEDWCRGMDVNSQRALLVWGIPVNCDEAEIEETLQAAMPQVSYRMLGRMFWREENAKAALLELTGAVDYAAIPREMPGKGGVWKVLFKPPTSDAEFLERLHLFLAREGWTVQDVARVLGFQNPTPTPGPEMPAEMLNYILDNVIQPLVESIWYKRLTLFSGRDIPGPGEETFDPWLEHTNEVLEEWQVSDVEKRRRLMESLRGPAADVIRILKSNNPAITTAECLKALEQVFGSVESSRDAQIKFLNTYQNPGEKLSAYVIRLEPLLQKVVEKGAIDKDNVNQARLEQVIAGANHSGAIRRQLWLTGAGEGPAPNLFQLLVQIREEEAKEEEEEAEATLLQLGLEGHF', 'length': 353, 'molWeight': 39761, 'crc64': 'EB7F5B6AEDA25961', 'md5': 'F38EF6628256BF10939F84F4973927D7'}, 'extraAttributes': {'countByCommentType': {'INTERACTION': 154, 'SUBCELLULAR LOCATION': 1, 'TISSUE SPECIFICITY': 1, 'MISCELLANEOUS': 1, 'SIMILARITY': 1}, 'countByFeatureType': {'Chain': 1, 'Natural variant': 2}, 'uniParcId': 'UPI000003779C'}}}, {'from': '8233', 'to': {'entryType': 'UniProtKB reviewed (Swiss-Prot)', 'primaryAccession': 'Q15696', 'secondaryAccessions': ['Q14D69'], 'uniProtkbId': 'U2AFM_HUMAN', 'entryAudit': {'firstPublicDate': '1997-11-01', 'lastAnnotationUpdateDate': '2025-04-09', 'lastSequenceUpdateDate': '1997-05-01', 'entryVersion': 202, 'sequenceVersion': 2}, 'annotationScore': 5.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '1: Evidence at protein level', 'proteinDescription': {'recommendedName': {'fullName': {'value': 'U2 small nuclear ribonucleoprotein auxiliary factor 35 kDa subunit-related protein 2'}}, 'alternativeNames': [{'fullName': {'value': 'CCCH type zinc finger, RNA-binding motif and serine/arginine rich protein 2'}}, {'fullName': {'value': 'Renal carcinoma antigen NY-REN-20'}}, {'fullName': {'value': 'U2(RNU2) small nuclear RNA auxiliary factor 1-like 2'}}, {'fullName': {'value': 'U2AF35-related protein'}, 'shortNames': [{'value': 'URP'}]}]}, 'genes': [{'geneName': {'value': 'ZRSR2'}, 'synonyms': [{'value': 
'U2AF1-RS2'}, {'value': 'U2AF1L2'}, {'value': 'U2AF1RS2'}, {'value': 'URP'}]}], 'comments': [{'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '21041408'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '9237760'}], 'value': "Pre-mRNA-binding protein required for splicing of both U2- and U12-type introns. Selectively interacts with the 3'-splice site of U2- and U12-type pre-mRNAs and promotes different steps in U2 and U12 intron splicing. Recruited to U12 pre-mRNAs in an ATP-dependent manner and is required for assembly of the pre-spliceosome, a precursor to other spliceosomal complexes. For U2-type introns, it is selectively and specifically required for the second step of splicing"}], 'commentType': 'FUNCTION'}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '15146077'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '21041408'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '9237760'}], 'value': 'Component of the U11/U12 snRNPs that are part of the U12-type spliceosome. Interacts (via RS domain) with SRSF1 and SRSF2. 
Interacts with U2AF2/U2AF65'}], 'commentType': 'SUBUNIT'}, {'commentType': 'INTERACTION', 'interactions': [{'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q9UK58', 'geneName': 'CCNL1', 'intActId': 'EBI-2836773'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'P49760', 'geneName': 'CLK2', 'intActId': 'EBI-750020'}, 'numberOfExperiments': 8, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q92997', 'geneName': 'DVL3', 'intActId': 'EBI-739789'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q7Z7F0-4', 'geneName': 'KHDC4', 'intActId': 'EBI-9089060'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y383', 'geneName': 'LUC7L2', 'intActId': 'EBI-352851'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q6BDI9', 'geneName': 'REP15', 'intActId': 'EBI-12048237'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q9H190', 'geneName': 'SDCBP2', 'intActId': 'EBI-742426'}, 'numberOfExperiments': 7, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q96SB4', 'geneName': 'SRPK1', 'intActId': 'EBI-539478'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': 
{'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'P78362', 'geneName': 'SRPK2', 'intActId': 'EBI-593303'}, 'numberOfExperiments': 8, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'A7MD48', 'geneName': 'SRRM4', 'intActId': 'EBI-3867173'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q13243', 'geneName': 'SRSF5', 'intActId': 'EBI-720503'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q9Y2D8', 'geneName': 'SSX2IP', 'intActId': 'EBI-2212028'}, 'numberOfExperiments': 4, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q96N21', 'geneName': 'TEPSIN', 'intActId': 'EBI-11139477'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q8WVP5', 'geneName': 'TNFAIP8L1', 'intActId': 'EBI-752102'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q15696', 'geneName': 'ZRSR2', 'intActId': 'EBI-6657923'}, 'numberOfExperiments': 3, 'organismDiffer': False}, {'interactantOne': {'uniProtKBAccession': 'Q15696', 'intActId': 'EBI-6657923'}, 'interactantTwo': {'uniProtKBAccession': 'Q15695', 'geneName': 'ZRSR2P1', 'intActId': 'EBI-12270264'}, 'numberOfExperiments': 6, 'organismDiffer': False}]}, {'commentType': 'SUBCELLULAR LOCATION', 'subcellularLocations': [{'location': {'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': 
'15146077'}], 'value': 'Nucleus', 'id': 'SL-0191'}}]}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '9237760'}], 'value': 'Widely expressed'}], 'commentType': 'TISSUE SPECIFICITY'}, {'texts': [{'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '9237760'}], 'value': 'Phosphorylated in the RS domain by SRPK1'}], 'commentType': 'PTM'}, {'commentType': 'DISEASE', 'disease': {'diseaseId': 'Orofaciodigital syndrome 21', 'diseaseAccession': 'DI-06953', 'acronym': 'OFD21', 'description': 'A form of orofaciodigital syndrome, a group of heterogeneous disorders characterized by malformations of the oral cavity, face and digits, and associated phenotypic abnormalities that lead to the delineation of various subtypes. OFD21 is an X-linked recessive form characterized by postaxial polydactyly of the hands, hallux duplication, palatal defects, fused incisors, accessory oral frenula and tongue nodules, in association with brain anomalies that range from pituitary anomalies to alobar holoprosencephaly.', 'diseaseCrossReference': {'database': 'MIM', 'id': '301132'}, 'evidences': [{'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '31680349'}, {'evidenceCode': 'ECO:0000269', 'source': 'PubMed', 'id': '38158857'}]}, 'note': {'texts': [{'value': 'The disease is caused by variants affecting the gene represented in this entry'}]}}], 'features': [{'type': 'Chain', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 482, 'modifier': 'EXACT'}}, 'description': 'U2 small nuclear ribonucleoprotein auxiliary factor 35 kDa subunit-related protein 2', 'featureId': 'PRO_0000082001'}, {'type': 'Domain', 'location': {'start': {'value': 198, 'modifier': 'EXACT'}, 'end': {'value': 304, 'modifier': 'EXACT'}}, 'description': 'RRM', 'evidences': [{'evidenceCode': 'ECO:0000255', 'source': 'PROSITE-ProRule', 'id': 'PRU00176'}]}, {'type': 'Zinc finger', 'location': {'start': {'value': 166, 'modifier': 'EXACT'}, 'end': 
{'value': 194, 'modifier': 'EXACT'}}, 'description': 'C3H1-type 1', 'evidences': [{'evidenceCode': 'ECO:0000255', 'source': 'PROSITE-ProRule', 'id': 'PRU00723'}]}, {'type': 'Zinc finger', 'location': {'start': {'value': 306, 'modifier': 'EXACT'}, 'end': {'value': 333, 'modifier': 'EXACT'}}, 'description': 'C3H1-type 2', 'evidences': [{'evidenceCode': 'ECO:0000255', 'source': 'PROSITE-ProRule', 'id': 'PRU00723'}]}, {'type': 'Region', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 59, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Region', 'location': {'start': {'value': 115, 'modifier': 'EXACT'}, 'end': {'value': 135, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Region', 'location': {'start': {'value': 351, 'modifier': 'EXACT'}, 'end': {'value': 482, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 14, 'modifier': 'EXACT'}, 'end': {'value': 31, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 46, 'modifier': 'EXACT'}, 'end': {'value': 58, 'modifier': 'EXACT'}}, 'description': 'Acidic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 360, 'modifier': 'EXACT'}, 'end': {'value': 375, 'modifier': 'EXACT'}}, 'description': 'Basic and acidic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 383, 'modifier': 'EXACT'}, 
'end': {'value': 398, 'modifier': 'EXACT'}}, 'description': 'Basic and acidic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 399, 'modifier': 'EXACT'}, 'end': {'value': 412, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 413, 'modifier': 'EXACT'}, 'end': {'value': 435, 'modifier': 'EXACT'}}, 'description': 'Basic and acidic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 436, 'modifier': 'EXACT'}, 'end': {'value': 454, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Modified residue', 'location': {'start': {'value': 349, 'modifier': 'EXACT'}, 'end': {'value': 349, 'modifier': 'EXACT'}}, 'description': 'Phosphoserine', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '18669648'}, {'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '20068231'}, {'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '23186163'}]}, {'type': 'Modified residue', 'location': {'start': {'value': 384, 'modifier': 'EXACT'}, 'end': {'value': 384, 'modifier': 'EXACT'}}, 'description': 'Phosphoserine', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '17081983'}, {'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '21406692'}]}, {'type': 'Cross-link', 'location': {'start': {'value': 45, 'modifier': 'EXACT'}, 'end': {'value': 45, 'modifier': 'EXACT'}}, 'description': 'Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '28112733'}]}, {'type': 'Cross-link', 
'location': {'start': {'value': 62, 'modifier': 'EXACT'}, 'end': {'value': 62, 'modifier': 'EXACT'}}, 'description': 'Glycyl lysine isopeptide (Lys-Gly) (interchain with G-Cter in SUMO2)', 'evidences': [{'evidenceCode': 'ECO:0007744', 'source': 'PubMed', 'id': '28112733'}]}], 'keywords': [{'id': 'KW-1186', 'category': 'Disease', 'name': 'Ciliopathy'}, {'id': 'KW-1017', 'category': 'PTM', 'name': 'Isopeptide bond'}, {'id': 'KW-0479', 'category': 'Ligand', 'name': 'Metal-binding'}, {'id': 'KW-0507', 'category': 'Biological process', 'name': 'mRNA processing'}, {'id': 'KW-0508', 'category': 'Biological process', 'name': 'mRNA splicing'}, {'id': 'KW-0539', 'category': 'Cellular component', 'name': 'Nucleus'}, {'id': 'KW-0597', 'category': 'PTM', 'name': 'Phosphoprotein'}, {'id': 'KW-1267', 'category': 'Technical term', 'name': 'Proteomics identification'}, {'id': 'KW-1185', 'category': 'Technical term', 'name': 'Reference proteome'}, {'id': 'KW-0677', 'category': 'Domain', 'name': 'Repeat'}, {'id': 'KW-0687', 'category': 'Molecular function', 'name': 'Ribonucleoprotein'}, {'id': 'KW-0694', 'category': 'Molecular function', 'name': 'RNA-binding'}, {'id': 'KW-0747', 'category': 'Cellular component', 'name': 'Spliceosome'}, {'id': 'KW-0832', 'category': 'PTM', 'name': 'Ubl conjugation'}, {'id': 'KW-0862', 'category': 'Ligand', 'name': 'Zinc'}, {'id': 'KW-0863', 'category': 'Domain', 'name': 'Zinc-finger'}], 'references': [{'referenceNumber': 1, 'citation': {'id': '8586425', 'citationType': 'journal article', 'authors': ['Kitagawa K.', 'Wang X.', 'Hatada I.', 'Yamaoka T.', 'Nojima H.', 'Inazawa J.', 'Abe T.', 'Mitsuya K.', 'Oshimura M.', 'Murata A.', 'Monden M.', 'Mukai T.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '8586425'}, {'database': 'DOI', 'id': '10.1006/geno.1995.9879'}], 'title': 'Isolation and mapping of human homologues of an imprinted mouse gene U2af1-rs1.', 'publicationDate': '1995', 'journal': 'Genomics', 'firstPage': '257', 'lastPage': 
'263', 'volume': '30'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [MRNA]'], 'referenceComments': [{'value': 'Brain', 'type': 'TISSUE'}]}, {'referenceNumber': 2, 'citation': {'id': '15772651', 'citationType': 'journal article', 'authors': ['Ross M.T.', 'Grafham D.V.', 'Coffey A.J.', 'Scherer S.', 'McLay K.', 'Muzny D.', 'Platzer M.', 'Howell G.R.', 'Burrows C.', 'Bird C.P.', 'Frankish A.', 'Lovell F.L.', 'Howe K.L.', 'Ashurst J.L.', 'Fulton R.S.', 'Sudbrak R.', 'Wen G.', 'Jones M.C.', 'Hurles M.E.', 'Andrews T.D.', 'Scott C.E.', 'Searle S.', 'Ramser J.', 'Whittaker A.', 'Deadman R.', 'Carter N.P.', 'Hunt S.E.', 'Chen R.', 'Cree A.', 'Gunaratne P.', 'Havlak P.', 'Hodgson A.', 'Metzker M.L.', 'Richards S.', 'Scott G.', 'Steffen D.', 'Sodergren E.', 'Wheeler D.A.', 'Worley K.C.', 'Ainscough R.', 'Ambrose K.D.', 'Ansari-Lari M.A.', 'Aradhya S.', 'Ashwell R.I.', 'Babbage A.K.', 'Bagguley C.L.', 'Ballabio A.', 'Banerjee R.', 'Barker G.E.', 'Barlow K.F.', 'Barrett I.P.', 'Bates K.N.', 'Beare D.M.', 'Beasley H.', 'Beasley O.', 'Beck A.', 'Bethel G.', 'Blechschmidt K.', 'Brady N.', 'Bray-Allen S.', 'Bridgeman A.M.', 'Brown A.J.', 'Brown M.J.', 'Bonnin D.', 'Bruford E.A.', 'Buhay C.', 'Burch P.', 'Burford D.', 'Burgess J.', 'Burrill W.', 'Burton J.', 'Bye J.M.', 'Carder C.', 'Carrel L.', 'Chako J.', 'Chapman J.C.', 'Chavez D.', 'Chen E.', 'Chen G.', 'Chen Y.', 'Chen Z.', 'Chinault C.', 'Ciccodicola A.', 'Clark S.Y.', 'Clarke G.', 'Clee C.M.', 'Clegg S.', 'Clerc-Blankenburg K.', 'Clifford K.', 'Cobley V.', 'Cole C.G.', 'Conquer J.S.', 'Corby N.', 'Connor R.E.', 'David R.', 'Davies J.', 'Davis C.', 'Davis J.', 'Delgado O.', 'Deshazo D.', 'Dhami P.', 'Ding Y.', 'Dinh H.', 'Dodsworth S.', 'Draper H.', 'Dugan-Rocha S.', 'Dunham A.', 'Dunn M.', 'Durbin K.J.', 'Dutta I.', 'Eades T.', 'Ellwood M.', 'Emery-Cohen A.', 'Errington H.', 'Evans K.L.', 'Faulkner L.', 'Francis F.', 'Frankland J.', 'Fraser A.E.', 'Galgoczy P.', 'Gilbert J.', 'Gill R.', 'Gloeckner G.', 'Gregory S.G.', 
'Gribble S.', 'Griffiths C.', 'Grocock R.', 'Gu Y.', 'Gwilliam R.', 'Hamilton C.', 'Hart E.A.', 'Hawes A.', 'Heath P.D.', 'Heitmann K.', 'Hennig S.', 'Hernandez J.', 'Hinzmann B.', 'Ho S.', 'Hoffs M.', 'Howden P.J.', 'Huckle E.J.', 'Hume J.', 'Hunt P.J.', 'Hunt A.R.', 'Isherwood J.', 'Jacob L.', 'Johnson D.', 'Jones S.', 'de Jong P.J.', 'Joseph S.S.', 'Keenan S.', 'Kelly S.', 'Kershaw J.K.', 'Khan Z.', 'Kioschis P.', 'Klages S.', 'Knights A.J.', 'Kosiura A.', 'Kovar-Smith C.', 'Laird G.K.', 'Langford C.', 'Lawlor S.', 'Leversha M.', 'Lewis L.', 'Liu W.', 'Lloyd C.', 'Lloyd D.M.', 'Loulseged H.', 'Loveland J.E.', 'Lovell J.D.', 'Lozado R.', 'Lu J.', 'Lyne R.', 'Ma J.', 'Maheshwari M.', 'Matthews L.H.', 'McDowall J.', 'McLaren S.', 'McMurray A.', 'Meidl P.', 'Meitinger T.', 'Milne S.', 'Miner G.', 'Mistry S.L.', 'Morgan M.', 'Morris S.', 'Mueller I.', 'Mullikin J.C.', 'Nguyen N.', 'Nordsiek G.', 'Nyakatura G.', "O'dell C.N.", 'Okwuonu G.', 'Palmer S.', 'Pandian R.', 'Parker D.', 'Parrish J.', 'Pasternak S.', 'Patel D.', 'Pearce A.V.', 'Pearson D.M.', 'Pelan S.E.', 'Perez L.', 'Porter K.M.', 'Ramsey Y.', 'Reichwald K.', 'Rhodes S.', 'Ridler K.A.', 'Schlessinger D.', 'Schueler M.G.', 'Sehra H.K.', 'Shaw-Smith C.', 'Shen H.', 'Sheridan E.M.', 'Shownkeen R.', 'Skuce C.D.', 'Smith M.L.', 'Sotheran E.C.', 'Steingruber H.E.', 'Steward C.A.', 'Storey R.', 'Swann R.M.', 'Swarbreck D.', 'Tabor P.E.', 'Taudien S.', 'Taylor T.', 'Teague B.', 'Thomas K.', 'Thorpe A.', 'Timms K.', 'Tracey A.', 'Trevanion S.', 'Tromans A.C.', "d'Urso M.", 'Verduzco D.', 'Villasana D.', 'Waldron L.', 'Wall M.', 'Wang Q.', 'Warren J.', 'Warry G.L.', 'Wei X.', 'West A.', 'Whitehead S.L.', 'Whiteley M.N.', 'Wilkinson J.E.', 'Willey D.L.', 'Williams G.', 'Williams L.', 'Williamson A.', 'Williamson H.', 'Wilming L.', 'Woodmansey R.L.', 'Wray P.W.', 'Yen J.', 'Zhang J.', 'Zhou J.', 'Zoghbi H.', 'Zorilla S.', 'Buck D.', 'Reinhardt R.', 'Poustka A.', 'Rosenthal A.', 'Lehrach H.', 'Meindl A.', 'Minx P.J.', 
'Hillier L.W.', 'Willard H.F.', 'Wilson R.K.', 'Waterston R.H.', 'Rice C.M.', 'Vaudin M.', 'Coulson A.', 'Nelson D.L.', 'Weinstock G.', 'Sulston J.E.', 'Durbin R.M.', 'Hubbard T.', 'Gibbs R.A.', 'Beck S.', 'Rogers J.', 'Bentley D.R.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15772651'}, {'database': 'DOI', 'id': '10.1038/nature03440'}], 'title': 'The DNA sequence of the human X chromosome.', 'publicationDate': '2005', 'journal': 'Nature', 'firstPage': '325', 'lastPage': '337', 'volume': '434'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]']}, {'referenceNumber': 3, 'citation': {'id': '15489334', 'citationType': 'journal article', 'authoringGroup': ['The MGC Project Team'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15489334'}, {'database': 'DOI', 'id': '10.1101/gr.2596504'}], 'title': 'The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC).', 'publicationDate': '2004', 'journal': 'Genome Res.', 'firstPage': '2121', 'lastPage': '2127', 'volume': '14'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE MRNA]']}, {'referenceNumber': 4, 'citation': {'id': '9237760', 'citationType': 'journal article', 'authors': ['Tronchere H.', 'Wang J.', 'Fu X.D.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '9237760'}, {'database': 'DOI', 'id': '10.1038/41137'}], 'title': 'A protein related to splicing factor U2AF35 that interacts with U2AF65 and SR proteins in splicing of pre-mRNA.', 'publicationDate': '1997', 'journal': 'Nature', 'firstPage': '397', 'lastPage': '400', 'volume': '388'}, 'referencePositions': ['FUNCTION', 'TISSUE SPECIFICITY', 'PHOSPHORYLATION', 'INTERACTION WITH SRSF1; SRSF2; SRPK1 AND U2AF2']}, {'referenceNumber': 5, 'citation': {'id': '10508479', 'citationType': 'journal article', 'authors': ['Scanlan M.J.', 'Gordan J.D.', 'Williamson B.', 'Stockert E.', 'Bander N.H.', 'Jongeneel C.V.', 'Gure A.O.', 'Jaeger D.', 'Jaeger E.', 'Knuth 
A.', 'Chen Y.-T.', 'Old L.J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '10508479'}, {'database': 'DOI', 'id': '10.1002/(sici)1097-0215(19991112)83:4<456::aid-ijc4>3.0.co;2-5'}], 'title': 'Antigens recognized by autologous antibody in patients with renal-cell carcinoma.', 'publicationDate': '1999', 'journal': 'Int. J. Cancer', 'firstPage': '456', 'lastPage': '464', 'volume': '83'}, 'referencePositions': ['IDENTIFICATION AS A RENAL CANCER ANTIGEN'], 'referenceComments': [{'value': 'Renal cell carcinoma', 'type': 'TISSUE'}]}, {'referenceNumber': 6, 'citation': {'id': '15146077', 'citationType': 'journal article', 'authors': ['Will C.L.', 'Schneider C.', 'Hossbach M.', 'Urlaub H.', 'Rauhut R.', 'Elbashir S.', 'Tuschl T.', 'Luehrmann R.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15146077'}, {'database': 'DOI', 'id': '10.1261/rna.7320604'}], 'title': 'The human 18S U11/U12 snRNP contains a set of novel proteins not found in the U2-dependent spliceosome.', 'publicationDate': '2004', 'journal': 'RNA', 'firstPage': '929', 'lastPage': '941', 'volume': '10'}, 'referencePositions': ['IDENTIFICATION IN THE U11/U12 SPLICEOSOME COMPLEX', 'IDENTIFICATION BY MASS SPECTROMETRY', 'SUBCELLULAR LOCATION']}, {'referenceNumber': 7, 'citation': {'id': '17081983', 'citationType': 'journal article', 'authors': ['Olsen J.V.', 'Blagoev B.', 'Gnad F.', 'Macek B.', 'Kumar C.', 'Mortensen P.', 'Mann M.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '17081983'}, {'database': 'DOI', 'id': '10.1016/j.cell.2006.09.026'}], 'title': 'Global, in vivo, and site-specific phosphorylation dynamics in signaling networks.', 'publicationDate': '2006', 'journal': 'Cell', 'firstPage': '635', 'lastPage': '648', 'volume': '127'}, 'referencePositions': ['PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-384', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'referenceComments': [{'value': 'Cervix carcinoma', 'type': 'TISSUE'}]}, {'referenceNumber': 8, 
'citation': {'id': '18669648', 'citationType': 'journal article', 'authors': ['Dephoure N.', 'Zhou C.', 'Villen J.', 'Beausoleil S.A.', 'Bakalarski C.E.', 'Elledge S.J.', 'Gygi S.P.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '18669648'}, {'database': 'DOI', 'id': '10.1073/pnas.0805139105'}], 'title': 'A quantitative atlas of mitotic phosphorylation.', 'publicationDate': '2008', 'journal': 'Proc. Natl. Acad. Sci. U.S.A.', 'firstPage': '10762', 'lastPage': '10767', 'volume': '105'}, 'referencePositions': ['PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-349', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'referenceComments': [{'value': 'Cervix carcinoma', 'type': 'TISSUE'}]}, {'referenceNumber': 9, 'citation': {'id': '21041408', 'citationType': 'journal article', 'authors': ['Shen H.', 'Zheng X.', 'Luecke S.', 'Green M.R.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '21041408'}, {'database': 'DOI', 'id': '10.1101/gad.1974810'}], 'title': "The U2AF35-related protein Urp contacts the 3' splice site to promote U12-type intron splicing and the second step of U2-type intron splicing.", 'publicationDate': '2010', 'journal': 'Genes Dev.', 'firstPage': '2389', 'lastPage': '2394', 'volume': '24'}, 'referencePositions': ['FUNCTION', 'IDENTIFICATION IN THE U11/U12 SPLICEOSOME COMPLEX', 'RNA-BINDING']}, {'referenceNumber': 10, 'citation': {'id': '20068231', 'citationType': 'journal article', 'authors': ['Olsen J.V.', 'Vermeulen M.', 'Santamaria A.', 'Kumar C.', 'Miller M.L.', 'Jensen L.J.', 'Gnad F.', 'Cox J.', 'Jensen T.S.', 'Nigg E.A.', 'Brunak S.', 'Mann M.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '20068231'}, {'database': 'DOI', 'id': '10.1126/scisignal.2000475'}], 'title': 'Quantitative phosphoproteomics reveals widespread full phosphorylation site occupancy during mitosis.', 'publicationDate': '2010', 'journal': 'Sci. 
Signal.', 'firstPage': 'RA3', 'lastPage': 'RA3', 'volume': '3'}, 'referencePositions': ['PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-349', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'referenceComments': [{'value': 'Cervix carcinoma', 'type': 'TISSUE'}]}, {'referenceNumber': 11, 'citation': {'id': '21406692', 'citationType': 'journal article', 'authors': ['Rigbolt K.T.', 'Prokhorova T.A.', 'Akimov V.', 'Henningsen J.', 'Johansen P.T.', 'Kratchmarova I.', 'Kassem M.', 'Mann M.', 'Olsen J.V.', 'Blagoev B.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '21406692'}, {'database': 'DOI', 'id': '10.1126/scisignal.2001570'}], 'title': 'System-wide temporal characterization of the proteome and phosphoproteome of human embryonic stem cell differentiation.', 'publicationDate': '2011', 'journal': 'Sci. Signal.', 'firstPage': 'RS3', 'lastPage': 'RS3', 'volume': '4'}, 'referencePositions': ['PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-384', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]']}, {'referenceNumber': 12, 'citation': {'id': '23186163', 'citationType': 'journal article', 'authors': ['Zhou H.', 'Di Palma S.', 'Preisinger C.', 'Peng M.', 'Polat A.N.', 'Heck A.J.', 'Mohammed S.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '23186163'}, {'database': 'DOI', 'id': '10.1021/pr300630k'}], 'title': 'Toward a comprehensive characterization of a human cancer cell phosphoproteome.', 'publicationDate': '2013', 'journal': 'J. 
Proteome Res.', 'firstPage': '260', 'lastPage': '271', 'volume': '12'}, 'referencePositions': ['PHOSPHORYLATION [LARGE SCALE ANALYSIS] AT SER-349', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]'], 'referenceComments': [{'value': 'Cervix carcinoma', 'type': 'TISSUE'}, {'value': 'Erythroleukemia', 'type': 'TISSUE'}]}, {'referenceNumber': 13, 'citation': {'id': '28112733', 'citationType': 'journal article', 'authors': ['Hendriks I.A.', 'Lyon D.', 'Young C.', 'Jensen L.J.', 'Vertegaal A.C.', 'Nielsen M.L.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '28112733'}, {'database': 'DOI', 'id': '10.1038/nsmb.3366'}], 'title': 'Site-specific mapping of the human SUMO proteome reveals co-modification with phosphorylation.', 'publicationDate': '2017', 'journal': 'Nat. Struct. Mol. Biol.', 'firstPage': '325', 'lastPage': '336', 'volume': '24'}, 'referencePositions': ['SUMOYLATION [LARGE SCALE ANALYSIS] AT LYS-45 AND LYS-62', 'IDENTIFICATION BY MASS SPECTROMETRY [LARGE SCALE ANALYSIS]']}, {'referenceNumber': 14, 'citation': {'id': '31680349', 'citationType': 'journal article', 'authors': ['Guo W.', 'Lai Y.', 'Yan Z.', 'Wang Y.', 'Nie Y.', 'Guan S.', 'Kuo Y.', 'Zhang W.', 'Zhu X.', 'Peng M.', 'Zhi X.', 'Wei Y.', 'Yan L.', 'Qiao J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '31680349'}, {'database': 'DOI', 'id': '10.1002/humu.23935'}], 'title': 'Trio-whole-exome sequencing and preimplantation genetic diagnosis for unexplained recurrent fetal malformations.', 'publicationDate': '2020', 'journal': 'Hum. 
Mutat.', 'firstPage': '432', 'lastPage': '448', 'volume': '41'}, 'referencePositions': ['INVOLVEMENT IN OFD21']}, {'referenceNumber': 15, 'citation': {'id': '38158857', 'citationType': 'journal article', 'authors': ['Hannes L.', 'Atzori M.', 'Goldenberg A.', 'Argente J.', 'Attie-Bitach T.', 'Amiel J.', 'Attanasio C.', 'Braslavsky D.G.', 'Bruel A.L.', 'Castanet M.', 'Dubourg C.', 'Jacobs A.', 'Lyonnet S.', 'Martinez-Mayer J.', 'Perez Millan M.I.', 'Pezzella N.', 'Pelgrims E.', 'Aerden M.', 'Bauters M.', 'Rochtus A.', 'Scaglia P.', 'Swillen A.', 'Sifrim A.', 'Tammaro R.', 'Mau-Them F.T.', 'Odent S.', 'Thauvin-Robinet C.', 'Franco B.', 'Breckpot J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '38158857'}, {'database': 'DOI', 'id': '10.1016/j.gim.2023.101059'}], 'title': 'Differential alternative splicing analysis links variation in ZRSR2 to a novel type of oral-facial-digital syndrome.', 'publicationDate': '2024', 'journal': 'Genet. Med.', 'firstPage': '101059', 'lastPage': '101059', 'volume': '26'}, 'referencePositions': ['INVOLVEMENT IN OFD21']}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'D49677', 'properties': [{'key': 'ProteinId', 'value': 'BAA08533.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'AC004106', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'AC096510', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'BC113454', 'properties': [{'key': 'ProteinId', 'value': 'AAI13455.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 'value': 'mRNA'}]}, {'database': 'EMBL', 'id': 'BC113480', 'properties': [{'key': 'ProteinId', 'value': 'AAI13481.1'}, {'key': 'Status', 'value': '-'}, {'key': 'MoleculeType', 
'value': 'mRNA'}]}, {'database': 'CCDS', 'id': 'CCDS14172.1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'RefSeq', 'id': 'NP_005080.1', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'NM_005089.4'}]}, {'database': 'AlphaFoldDB', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SMR', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID', 'id': '113865', 'properties': [{'key': 'Interactions', 'value': '80'}]}, {'database': 'CORUM', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DIP', 'id': 'DIP-62117N', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'IntAct', 'id': 'Q15696', 'properties': [{'key': 'Interactions', 'value': '60'}]}, {'database': 'STRING', 'id': '9606.ENSP00000303015', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GlyGen', 'id': 'Q15696', 'properties': [{'key': 'glycosylation', 'value': '1 site, 1 O-linked glycan (1 site)'}]}, {'database': 'iPTMnet', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhosphoSitePlus', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioMuta', 'id': 'ZRSR2', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DMDM', 'id': '2833266', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'jPOST', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MassIVE', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PaxDb', 'id': '9606-ENSP00000303015', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PeptideAtlas', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'ProteomicsDB', 'id': '60704', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Pumba', 'id': 'Q15696', 'properties': [{'key': 'Description', 
'value': '-'}]}, {'database': 'Antibodypedia', 'id': '8950', 'properties': [{'key': 'antibodies', 'value': '142 antibodies from 22 providers'}]}, {'database': 'DNASU', 'id': '8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Ensembl', 'id': 'ENST00000307771.8', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000303015.7'}, {'key': 'GeneId', 'value': 'ENSG00000169249.14'}]}, {'database': 'GeneID', 'id': '8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'KEGG', 'id': 'hsa:8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MANE-Select', 'id': 'ENST00000307771.8', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000303015.7'}, {'key': 'RefSeqNucleotideId', 'value': 'NM_005089.4'}, {'key': 'RefSeqProteinId', 'value': 'NP_005080.1'}]}, {'database': 'UCSC', 'id': 'uc004cxg.5', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'AGR', 'id': 'HGNC:23019', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CTD', 'id': '8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GeneCards', 'id': 'ZRSR2', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HGNC', 'id': 'HGNC:23019', 'properties': [{'key': 'GeneName', 'value': 'ZRSR2'}]}, {'database': 'HPA', 'id': 'ENSG00000169249', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Low tissue specificity'}]}, {'database': 'MalaCards', 'id': 'ZRSR2', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'MIM', 'id': '300028', 'properties': [{'key': 'Type', 'value': 'gene'}]}, {'database': 'MIM', 'id': '301132', 'properties': [{'key': 'Type', 'value': 'phenotype'}]}, {'database': 'neXtProt', 'id': 'NX_Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OpenTargets', 'id': 'ENSG00000169249', 'properties': [{'key': 'Description', 'value': 
'-'}]}, {'database': 'PharmGKB', 'id': 'PA162410930', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'VEuPathDB', 'id': 'HostDB:ENSG00000169249', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'eggNOG', 'id': 'KOG2202', 'properties': [{'key': 'ToxonomicScope', 'value': 'Eukaryota'}]}, {'database': 'GeneTree', 'id': 'ENSGT00950000183152', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HOGENOM', 'id': 'CLU_029117_1_0_1', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'InParanoid', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OMA', 'id': 'MDLRIME', 'properties': [{'key': 'Fingerprint', 'value': '-'}]}, {'database': 'OrthoDB', 'id': '75923at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PhylomeDB', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'TreeFam', 'id': 'TF324447', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PathwayCommons', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Reactome', 'id': 'R-HSA-72165', 'properties': [{'key': 'PathwayName', 'value': 'mRNA Splicing - Minor Pathway'}]}, {'database': 'SignaLink', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'SIGNOR', 'id': 'Q15696', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'BioGRID-ORCS', 'id': '8233', 'properties': [{'key': 'hits', 'value': '193 hits in 789 CRISPR screens'}]}, {'database': 'ChiTaRS', 'id': 'ZRSR2', 'properties': [{'key': 'OrganismName', 'value': 'human'}]}, {'database': 'GeneWiki', 'id': 'ZRSR2', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'GenomeRNAi', 'id': '8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Pharos', 'id': 'Q15696', 'properties': [{'key': 'DevelopmentLevel', 'value': 'Tbio'}]}, {'database': 'PRO', 'id': 'PR:Q15696', 
'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Proteomes', 'id': 'UP000005640', 'properties': [{'key': 'Component', 'value': 'Chromosome X'}]}, {'database': 'RNAct', 'id': 'Q15696', 'properties': [{'key': 'moleculeType', 'value': 'protein'}]}, {'database': 'Bgee', 'id': 'ENSG00000169249', 'properties': [{'key': 'ExpressionPatterns', 'value': 'Expressed in sural nerve and 208 other cell types or tissues'}]}, {'database': 'ExpressionAtlas', 'id': 'Q15696', 'properties': [{'key': 'ExpressionPatterns', 'value': 'baseline and differential'}]}, {'database': 'GO', 'id': 'GO:0005654', 'properties': [{'key': 'GoTerm', 'value': 'C:nucleoplasm'}, {'key': 'GoEvidenceType', 'value': 'TAS:Reactome'}]}, {'database': 'GO', 'id': 'GO:0005681', 'properties': [{'key': 'GoTerm', 'value': 'C:spliceosomal complex'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0005689', 'properties': [{'key': 'GoTerm', 'value': 'C:U12-type spliceosomal complex'}, {'key': 'GoEvidenceType', 'value': 'IDA:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000314', 'source': 'PubMed', 'id': '21041408'}]}, {'database': 'GO', 'id': 'GO:0089701', 'properties': [{'key': 'GoTerm', 'value': 'C:U2AF complex'}, {'key': 'GoEvidenceType', 'value': 'IBA:GO_Central'}]}, {'database': 'GO', 'id': 'GO:0042802', 'properties': [{'key': 'GoTerm', 'value': 'F:identical protein binding'}, {'key': 'GoEvidenceType', 'value': 'IPI:IntAct'}], 'evidences': [{'evidenceCode': 'ECO:0000353', 'source': 'PubMed', 'id': '32296183'}]}, {'database': 'GO', 'id': 'GO:0030628', 'properties': [{'key': 'GoTerm', 'value': "F:pre-mRNA 3'-splice site binding"}, {'key': 'GoEvidenceType', 'value': 'IDA:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000314', 'source': 'PubMed', 'id': '21041408'}]}, {'database': 'GO', 'id': 'GO:0008270', 'properties': [{'key': 'GoTerm', 'value': 'F:zinc ion binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:UniProtKB-KW'}]}, {'database': 'GO', 'id': 
'GO:0000398', 'properties': [{'key': 'GoTerm', 'value': 'P:mRNA splicing, via spliceosome'}, {'key': 'GoEvidenceType', 'value': 'IMP:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000315', 'source': 'PubMed', 'id': '9237760'}]}, {'database': 'GO', 'id': 'GO:0008380', 'properties': [{'key': 'GoTerm', 'value': 'P:RNA splicing'}, {'key': 'GoEvidenceType', 'value': 'IC:HGNC-UCL'}], 'evidences': [{'evidenceCode': 'ECO:0000305', 'source': 'PubMed', 'id': '15146077'}]}, {'database': 'GO', 'id': 'GO:0000245', 'properties': [{'key': 'GoTerm', 'value': 'P:spliceosomal complex assembly'}, {'key': 'GoEvidenceType', 'value': 'IMP:UniProtKB'}], 'evidences': [{'evidenceCode': 'ECO:0000315', 'source': 'PubMed', 'id': '21041408'}]}, {'database': 'CDD', 'id': 'cd12540', 'properties': [{'key': 'EntryName', 'value': 'RRM_U2AFBPL'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'FunFam', 'id': '3.30.70.330:FF:000209', 'properties': [{'key': 'EntryName', 'value': 'U2 small nuclear ribonucleoprotein auxiliary factor 35 kDa subunit-related protein 2'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Gene3D', 'id': '3.30.70.330', 'properties': [{'key': 'EntryName', 'value': '-'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR012677', 'properties': [{'key': 'EntryName', 'value': 'Nucleotide-bd_a/b_plait_sf'}]}, {'database': 'InterPro', 'id': 'IPR035979', 'properties': [{'key': 'EntryName', 'value': 'RBD_domain_sf'}]}, {'database': 'InterPro', 'id': 'IPR000504', 'properties': [{'key': 'EntryName', 'value': 'RRM_dom'}]}, {'database': 'InterPro', 'id': 'IPR003954', 'properties': [{'key': 'EntryName', 'value': 'RRM_dom_euk'}]}, {'database': 'InterPro', 'id': 'IPR009145', 'properties': [{'key': 'EntryName', 'value': 'U2AF_small'}]}, {'database': 'InterPro', 'id': 'IPR000571', 'properties': [{'key': 'EntryName', 'value': 'Znf_CCCH'}]}, {'database': 'PANTHER', 'id': 'PTHR12620', 'properties': [{'key': 'EntryName', 'value': 'U2 SNRNP AUXILIARY FACTOR, 
SMALL SUBUNIT'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF00076', 'properties': [{'key': 'EntryName', 'value': 'RRM_1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF00642', 'properties': [{'key': 'EntryName', 'value': 'zf-CCCH'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PRINTS', 'id': 'PR01848', 'properties': [{'key': 'EntryName', 'value': 'U2AUXFACTOR'}]}, {'database': 'SMART', 'id': 'SM00361', 'properties': [{'key': 'EntryName', 'value': 'RRM_1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SMART', 'id': 'SM00356', 'properties': [{'key': 'EntryName', 'value': 'ZnF_C3H1'}, {'key': 'MatchStatus', 'value': '2'}]}, {'database': 'SUPFAM', 'id': 'SSF54928', 'properties': [{'key': 'EntryName', 'value': 'RNA-binding domain, RBD'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50102', 'properties': [{'key': 'EntryName', 'value': 'RRM'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50103', 'properties': [{'key': 'EntryName', 'value': 'ZF_C3H1'}, {'key': 'MatchStatus', 'value': '2'}]}], 'sequence': {'value': 'MAAPEKMTFPEKPSHKKYRAALKKEKRKKRRQELARLRDSGLSQKEEEEDTFIEEQQLEEEKLLERERQRLHEEWLLREQKAQEEFRIKKEKEEAAKKRQEEQERKLKEQWEEQQRKEREEEEQKRQEKKEKEEALQKMLDQAENELENGTTWQNPEPPVDFRVMEKDRANCPFYSKTGACRFGDRCSRKHNFPTSSPTLLIKSMFTTFGMEQCRRDDYDPDASLEYSEEETYQQFLDFYEDVLPEFKNVGKVIQFKVSCNLEPHLRGNVYVQYQSEEECQAALSLFNGRWYAGRQLQCEFCPVTRWKMAICGLFEIQQCPRGKHCNFLHVFRNPNNEFWEANRDIYLSPDRTGSSFGKNSERRERMGHHDDYYSRLRGRRNPSPDHSYKRNGESERKSSRHRGKKSHKRTSKSRERHNSRSRGRNRDRSRDRSRGRGSRSRSRSRSRRSRRSRSQSSSRSRSRGRRRSGNRDRTVQSPKSK', 'length': 482, 'molWeight': 58045, 'crc64': '1DACC8A6CA4727A6', 'md5': '8516F0DAAFB48D7A45D609C8F686F8F2'}, 'extraAttributes': {'countByCommentType': {'FUNCTION': 1, 'SUBUNIT': 1, 'INTERACTION': 16, 'SUBCELLULAR LOCATION': 1, 'TISSUE SPECIFICITY': 1, 'PTM': 1, 'DISEASE': 1}, 'countByFeatureType': {'Chain': 1, 'Domain': 1, 'Zinc finger': 2, 'Region': 3, 
'Compositional bias': 7, 'Modified residue': 2, 'Cross-link': 2}, 'uniParcId': 'UPI0000137929'}}}, {'from': '8233', 'to': {'entryType': 'UniProtKB unreviewed (TrEMBL)', 'primaryAccession': 'A0A8I5KSD0', 'uniProtkbId': 'A0A8I5KSD0_HUMAN', 'entryAudit': {'firstPublicDate': '2022-05-25', 'lastAnnotationUpdateDate': '2025-04-02', 'lastSequenceUpdateDate': '2022-05-25', 'entryVersion': 17, 'sequenceVersion': 1}, 'annotationScore': 2.0, 'organism': {'scientificName': 'Homo sapiens', 'commonName': 'Human', 'taxonId': 9606, 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000510773.1'}, {'evidenceCode': 'ECO:0000313', 'source': 'Proteomes', 'id': 'UP000005640'}], 'lineage': ['Eukaryota', 'Metazoa', 'Chordata', 'Craniata', 'Vertebrata', 'Euteleostomi', 'Mammalia', 'Eutheria', 'Euarchontoglires', 'Primates', 'Haplorrhini', 'Catarrhini', 'Hominidae', 'Homo']}, 'proteinExistence': '1: Evidence at protein level', 'proteinDescription': {'submissionNames': [{'fullName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000510773.1'}], 'value': 'Zinc finger CCCH-type, RNA binding motif and serine/arginine rich 2'}}]}, 'genes': [{'geneName': {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000510773.1'}], 'value': 'ZRSR2'}}], 'features': [{'type': 'Signal', 'location': {'start': {'value': 1, 'modifier': 'EXACT'}, 'end': {'value': 16, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'SignalP'}]}, {'type': 'Chain', 'location': {'start': {'value': 17, 'modifier': 'EXACT'}, 'end': {'value': 460, 'modifier': 'EXACT'}}, 'description': '', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'SignalP'}], 'featureId': 'PRO_5035241810'}, {'type': 'Domain', 'location': {'start': {'value': 140, 'modifier': 'EXACT'}, 'end': {'value': 168, 'modifier': 'EXACT'}}, 'description': 'C3H1-type', 'evidences': [{'evidenceCode': 
'ECO:0000259', 'source': 'PROSITE', 'id': 'PS50103'}]}, {'type': 'Domain', 'location': {'start': {'value': 172, 'modifier': 'EXACT'}, 'end': {'value': 278, 'modifier': 'EXACT'}}, 'description': 'RRM', 'evidences': [{'evidenceCode': 'ECO:0000259', 'source': 'PROSITE', 'id': 'PS50102'}]}, {'type': 'Domain', 'location': {'start': {'value': 280, 'modifier': 'EXACT'}, 'end': {'value': 307, 'modifier': 'EXACT'}}, 'description': 'C3H1-type', 'evidences': [{'evidenceCode': 'ECO:0000259', 'source': 'PROSITE', 'id': 'PS50103'}]}, {'type': 'Zinc finger', 'location': {'start': {'value': 140, 'modifier': 'EXACT'}, 'end': {'value': 168, 'modifier': 'EXACT'}}, 'description': 'C3H1-type', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'PROSITE-ProRule', 'id': 'PRU00723'}]}, {'type': 'Zinc finger', 'location': {'start': {'value': 280, 'modifier': 'EXACT'}, 'end': {'value': 307, 'modifier': 'EXACT'}}, 'description': 'C3H1-type', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'PROSITE-ProRule', 'id': 'PRU00723'}]}, {'type': 'Region', 'location': {'start': {'value': 89, 'modifier': 'EXACT'}, 'end': {'value': 109, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Region', 'location': {'start': {'value': 325, 'modifier': 'EXACT'}, 'end': {'value': 460, 'modifier': 'EXACT'}}, 'description': 'Disordered', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 334, 'modifier': 'EXACT'}, 'end': {'value': 349, 'modifier': 'EXACT'}}, 'description': 'Basic and acidic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 357, 'modifier': 'EXACT'}, 'end': {'value': 372, 'modifier': 'EXACT'}}, 'description': 'Basic and acidic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 
'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 373, 'modifier': 'EXACT'}, 'end': {'value': 386, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 387, 'modifier': 'EXACT'}, 'end': {'value': 409, 'modifier': 'EXACT'}}, 'description': 'Basic and acidic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}, {'type': 'Compositional bias', 'location': {'start': {'value': 410, 'modifier': 'EXACT'}, 'end': {'value': 428, 'modifier': 'EXACT'}}, 'description': 'Basic residues', 'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'MobiDB-lite'}]}], 'keywords': [{'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00022723'}, {'evidenceCode': 'ECO:0000256', 'source': 'PROSITE-ProRule', 'id': 'PRU00723'}], 'id': 'KW-0479', 'category': 'Ligand', 'name': 'Metal-binding'}, {'evidences': [{'evidenceCode': 'ECO:0007829', 'source': 'PeptideAtlas', 'id': 'A0A8I5KSD0'}, {'evidenceCode': 'ECO:0007829', 'source': 'ProteomicsDB', 'id': 'A0A8I5KSD0'}], 'id': 'KW-1267', 'category': 'Technical term', 'name': 'Proteomics identification'}, {'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Proteomes', 'id': 'UP000005640'}], 'id': 'KW-1185', 'category': 'Technical term', 'name': 'Reference proteome'}, {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00022737'}], 'id': 'KW-0677', 'category': 'Domain', 'name': 'Repeat'}, {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'PROSITE-ProRule', 'id': 'PRU00176'}], 'id': 'KW-0694', 'category': 'Molecular function', 'name': 'RNA-binding'}, {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'SAM', 'id': 'SignalP'}], 'id': 'KW-0732', 'category': 'Domain', 'name': 'Signal'}, {'evidences': [{'evidenceCode': 
'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00022833'}, {'evidenceCode': 'ECO:0000256', 'source': 'PROSITE-ProRule', 'id': 'PRU00723'}], 'id': 'KW-0862', 'category': 'Ligand', 'name': 'Zinc'}, {'evidences': [{'evidenceCode': 'ECO:0000256', 'source': 'ARBA', 'id': 'ARBA00022771'}, {'evidenceCode': 'ECO:0000256', 'source': 'PROSITE-ProRule', 'id': 'PRU00723'}], 'id': 'KW-0863', 'category': 'Domain', 'name': 'Zinc-finger'}], 'references': [{'referenceNumber': 1, 'citation': {'id': '11237011', 'citationType': 'journal article', 'authoringGroup': ['International Human Genome Sequencing Consortium'], 'authors': ['Lander E.S.', 'Linton L.M.', 'Birren B.', 'Nusbaum C.', 'Zody M.C.', 'Baldwin J.', 'Devon K.', 'Dewar K.', 'Doyle M.', 'FitzHugh W.', 'Funke R.', 'Gage D.', 'Harris K.', 'Heaford A.', 'Howland J.', 'Kann L.', 'Lehoczky J.', 'LeVine R.', 'McEwan P.', 'McKernan K.', 'Meldrim J.', 'Mesirov J.P.', 'Miranda C.', 'Morris W.', 'Naylor J.', 'Raymond C.', 'Rosetti M.', 'Santos R.', 'Sheridan A.', 'Sougnez C.', 'Stange-Thomann N.', 'Stojanovic N.', 'Subramanian A.', 'Wyman D.', 'Rogers J.', 'Sulston J.', 'Ainscough R.', 'Beck S.', 'Bentley D.', 'Burton J.', 'Clee C.', 'Carter N.', 'Coulson A.', 'Deadman R.', 'Deloukas P.', 'Dunham A.', 'Dunham I.', 'Durbin R.', 'French L.', 'Grafham D.', 'Gregory S.', 'Hubbard T.', 'Humphray S.', 'Hunt A.', 'Jones M.', 'Lloyd C.', 'McMurray A.', 'Matthews L.', 'Mercer S.', 'Milne S.', 'Mullikin J.C.', 'Mungall A.', 'Plumb R.', 'Ross M.', 'Shownkeen R.', 'Sims S.', 'Waterston R.H.', 'Wilson R.K.', 'Hillier L.W.', 'McPherson J.D.', 'Marra M.A.', 'Mardis E.R.', 'Fulton L.A.', 'Chinwalla A.T.', 'Pepin K.H.', 'Gish W.R.', 'Chissoe S.L.', 'Wendl M.C.', 'Delehaunty K.D.', 'Miner T.L.', 'Delehaunty A.', 'Kramer J.B.', 'Cook L.L.', 'Fulton R.S.', 'Johnson D.L.', 'Minx P.J.', 'Clifton S.W.', 'Hawkins T.', 'Branscomb E.', 'Predki P.', 'Richardson P.', 'Wenning S.', 'Slezak T.', 'Doggett N.', 'Cheng J.F.', 'Olsen A.', 'Lucas S.', 'Elkin C.', 
'Uberbacher E.', 'Frazier M.', 'Gibbs R.A.', 'Muzny D.M.', 'Scherer S.E.', 'Bouck J.B.', 'Sodergren E.J.', 'Worley K.C.', 'Rives C.M.', 'Gorrell J.H.', 'Metzker M.L.', 'Naylor S.L.', 'Kucherlapati R.S.', 'Nelson D.L.', 'Weinstock G.M.', 'Sakaki Y.', 'Fujiyama A.', 'Hattori M.', 'Yada T.', 'Toyoda A.', 'Itoh T.', 'Kawagoe C.', 'Watanabe H.', 'Totoki Y.', 'Taylor T.', 'Weissenbach J.', 'Heilig R.', 'Saurin W.', 'Artiguenave F.', 'Brottier P.', 'Bruls T.', 'Pelletier E.', 'Robert C.', 'Wincker P.', 'Smith D.R.', 'Doucette-Stamm L.', 'Rubenfield M.', 'Weinstock K.', 'Lee H.M.', 'Dubois J.', 'Rosenthal A.', 'Platzer M.', 'Nyakatura G.', 'Taudien S.', 'Rump A.', 'Yang H.', 'Yu J.', 'Wang J.', 'Huang G.', 'Gu J.', 'Hood L.', 'Rowen L.', 'Madan A.', 'Qin S.', 'Davis R.W.', 'Federspiel N.A.', 'Abola A.P.', 'Proctor M.J.', 'Myers R.M.', 'Schmutz J.', 'Dickson M.', 'Grimwood J.', 'Cox D.R.', 'Olson M.V.', 'Kaul R.', 'Raymond C.', 'Shimizu N.', 'Kawasaki K.', 'Minoshima S.', 'Evans G.A.', 'Athanasiou M.', 'Schultz R.', 'Roe B.A.', 'Chen F.', 'Pan H.', 'Ramser J.', 'Lehrach H.', 'Reinhardt R.', 'McCombie W.R.', 'de la Bastide M.', 'Dedhia N.', 'Blocker H.', 'Hornischer K.', 'Nordsiek G.', 'Agarwala R.', 'Aravind L.', 'Bailey J.A.', 'Bateman A.', 'Batzoglou S.', 'Birney E.', 'Bork P.', 'Brown D.G.', 'Burge C.B.', 'Cerutti L.', 'Chen H.C.', 'Church D.', 'Clamp M.', 'Copley R.R.', 'Doerks T.', 'Eddy S.R.', 'Eichler E.E.', 'Furey T.S.', 'Galagan J.', 'Gilbert J.G.', 'Harmon C.', 'Hayashizaki Y.', 'Haussler D.', 'Hermjakob H.', 'Hokamp K.', 'Jang W.', 'Johnson L.S.', 'Jones T.A.', 'Kasif S.', 'Kaspryzk A.', 'Kennedy S.', 'Kent W.J.', 'Kitts P.', 'Koonin E.V.', 'Korf I.', 'Kulp D.', 'Lancet D.', 'Lowe T.M.', 'McLysaght A.', 'Mikkelsen T.', 'Moran J.V.', 'Mulder N.', 'Pollara V.J.', 'Ponting C.P.', 'Schuler G.', 'Schultz J.', 'Slater G.', 'Smit A.F.', 'Stupka E.', 'Szustakowski J.', 'Thierry-Mieg D.', 'Thierry-Mieg J.', 'Wagner L.', 'Wallis J.', 'Wheeler R.', 'Williams A.', 'Wolf 
Y.I.', 'Wolfe K.H.', 'Yang S.P.', 'Yeh R.F.', 'Collins F.', 'Guyer M.S.', 'Peterson J.', 'Felsenfeld A.', 'Wetterstrand K.A.', 'Patrinos A.', 'Morgan M.J.', 'de Jong P.', 'Catanese J.J.', 'Osoegawa K.', 'Shizuya H.', 'Choi S.', 'Chen Y.J.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '11237011'}, {'database': 'DOI', 'id': '10.1038/35057062'}], 'title': 'Initial sequencing and analysis of the human genome.', 'publicationDate': '2001', 'journal': 'Nature', 'firstPage': '860', 'lastPage': '921', 'volume': '409'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000510773.1'}]}, {'referenceNumber': 2, 'citation': {'id': '15496913', 'citationType': 'journal article', 'authoringGroup': ['International Human Genome Sequencing Consortium'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15496913'}, {'database': 'DOI', 'id': '10.1038/nature03001'}], 'title': 'Finishing the euchromatic sequence of the human genome.', 'publicationDate': '2004', 'journal': 'Nature', 'firstPage': '931', 'lastPage': '945', 'volume': '431'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000510773.1'}]}, {'referenceNumber': 3, 'citation': {'id': '15772651', 'citationType': 'journal article', 'authors': ['Ross M.T.', 'Grafham D.V.', 'Coffey A.J.', 'Scherer S.', 'McLay K.', 'Muzny D.', 'Platzer M.', 'Howell G.R.', 'Burrows C.', 'Bird C.P.', 'Frankish A.', 'Lovell F.L.', 'Howe K.L.', 'Ashurst J.L.', 'Fulton R.S.', 'Sudbrak R.', 'Wen G.', 'Jones M.C.', 'Hurles M.E.', 'Andrews T.D.', 'Scott C.E.', 'Searle S.', 'Ramser J.', 'Whittaker A.', 'Deadman R.', 'Carter N.P.', 'Hunt S.E.', 'Chen R.', 'Cree A.', 'Gunaratne P.', 'Havlak P.', 'Hodgson A.', 'Metzker M.L.', 'Richards S.', 'Scott G.', 'Steffen D.', 'Sodergren E.', 'Wheeler D.A.', 'Worley K.C.', 'Ainscough R.', 'Ambrose 
K.D.', 'Ansari-Lari M.A.', 'Aradhya S.', 'Ashwell R.I.', 'Babbage A.K.', 'Bagguley C.L.', 'Ballabio A.', 'Banerjee R.', 'Barker G.E.', 'Barlow K.F.', 'Barrett I.P.', 'Bates K.N.', 'Beare D.M.', 'Beasley H.', 'Beasley O.', 'Beck A.', 'Bethel G.', 'Blechschmidt K.', 'Brady N.', 'Bray-Allen S.', 'Bridgeman A.M.', 'Brown A.J.', 'Brown M.J.', 'Bonnin D.', 'Bruford E.A.', 'Buhay C.', 'Burch P.', 'Burford D.', 'Burgess J.', 'Burrill W.', 'Burton J.', 'Bye J.M.', 'Carder C.', 'Carrel L.', 'Chako J.', 'Chapman J.C.', 'Chavez D.', 'Chen E.', 'Chen G.', 'Chen Y.', 'Chen Z.', 'Chinault C.', 'Ciccodicola A.', 'Clark S.Y.', 'Clarke G.', 'Clee C.M.', 'Clegg S.', 'Clerc-Blankenburg K.', 'Clifford K.', 'Cobley V.', 'Cole C.G.', 'Conquer J.S.', 'Corby N.', 'Connor R.E.', 'David R.', 'Davies J.', 'Davis C.', 'Davis J.', 'Delgado O.', 'Deshazo D.', 'Dhami P.', 'Ding Y.', 'Dinh H.', 'Dodsworth S.', 'Draper H.', 'Dugan-Rocha S.', 'Dunham A.', 'Dunn M.', 'Durbin K.J.', 'Dutta I.', 'Eades T.', 'Ellwood M.', 'Emery-Cohen A.', 'Errington H.', 'Evans K.L.', 'Faulkner L.', 'Francis F.', 'Frankland J.', 'Fraser A.E.', 'Galgoczy P.', 'Gilbert J.', 'Gill R.', 'Glockner G.', 'Gregory S.G.', 'Gribble S.', 'Griffiths C.', 'Grocock R.', 'Gu Y.', 'Gwilliam R.', 'Hamilton C.', 'Hart E.A.', 'Hawes A.', 'Heath P.D.', 'Heitmann K.', 'Hennig S.', 'Hernandez J.', 'Hinzmann B.', 'Ho S.', 'Hoffs M.', 'Howden P.J.', 'Huckle E.J.', 'Hume J.', 'Hunt P.J.', 'Hunt A.R.', 'Isherwood J.', 'Jacob L.', 'Johnson D.', 'Jones S.', 'de Jong P.J.', 'Joseph S.S.', 'Keenan S.', 'Kelly S.', 'Kershaw J.K.', 'Khan Z.', 'Kioschis P.', 'Klages S.', 'Knights A.J.', 'Kosiura A.', 'Kovar-Smith C.', 'Laird G.K.', 'Langford C.', 'Lawlor S.', 'Leversha M.', 'Lewis L.', 'Liu W.', 'Lloyd C.', 'Lloyd D.M.', 'Loulseged H.', 'Loveland J.E.', 'Lovell J.D.', 'Lozado R.', 'Lu J.', 'Lyne R.', 'Ma J.', 'Maheshwari M.', 'Matthews L.H.', 'McDowall J.', 'McLaren S.', 'McMurray A.', 'Meidl P.', 'Meitinger T.', 'Milne S.', 'Miner G.', 'Mistry S.L.', 
'Morgan M.', 'Morris S.', 'Muller I.', 'Mullikin J.C.', 'Nguyen N.', 'Nordsiek G.', 'Nyakatura G.', "O'Dell C.N.", 'Okwuonu G.', 'Palmer S.', 'Pandian R.', 'Parker D.', 'Parrish J.', 'Pasternak S.', 'Patel D.', 'Pearce A.V.', 'Pearson D.M.', 'Pelan S.E.', 'Perez L.', 'Porter K.M.', 'Ramsey Y.', 'Reichwald K.', 'Rhodes S.', 'Ridler K.A.', 'Schlessinger D.', 'Schueler M.G.', 'Sehra H.K.', 'Shaw-Smith C.', 'Shen H.', 'Sheridan E.M.', 'Shownkeen R.', 'Skuce C.D.', 'Smith M.L.', 'Sotheran E.C.', 'Steingruber H.E.', 'Steward C.A.', 'Storey R.', 'Swann R.M.', 'Swarbreck D.', 'Tabor P.E.', 'Taudien S.', 'Taylor T.', 'Teague B.', 'Thomas K.', 'Thorpe A.', 'Timms K.', 'Tracey A.', 'Trevanion S.', 'Tromans A.C.', "d'Urso M.", 'Verduzco D.', 'Villasana D.', 'Waldron L.', 'Wall M.', 'Wang Q.', 'Warren J.', 'Warry G.L.', 'Wei X.', 'West A.', 'Whitehead S.L.', 'Whiteley M.N.', 'Wilkinson J.E.', 'Willey D.L.', 'Williams G.', 'Williams L.', 'Williamson A.', 'Williamson H.', 'Wilming L.', 'Woodmansey R.L.', 'Wray P.W.', 'Yen J.', 'Zhang J.', 'Zhou J.', 'Zoghbi H.', 'Zorilla S.', 'Buck D.', 'Reinhardt R.', 'Poustka A.', 'Rosenthal A.', 'Lehrach H.', 'Meindl A.', 'Minx P.J.', 'Hillier L.W.', 'Willard H.F.', 'Wilson R.K.', 'Waterston R.H.', 'Rice C.M.', 'Vaudin M.', 'Coulson A.', 'Nelson D.L.', 'Weinstock G.', 'Sulston J.E.', 'Durbin R.', 'Hubbard T.', 'Gibbs R.A.', 'Beck S.', 'Rogers J.', 'Bentley D.R.'], 'citationCrossReferences': [{'database': 'PubMed', 'id': '15772651'}, {'database': 'DOI', 'id': '10.1038/nature03440'}], 'title': 'The DNA sequence of the human X chromosome.', 'publicationDate': '2005', 'journal': 'Nature', 'firstPage': '325', 'lastPage': '337', 'volume': '434'}, 'referencePositions': ['NUCLEOTIDE SEQUENCE [LARGE SCALE GENOMIC DNA]'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000510773.1'}, {'evidenceCode': 'ECO:0000313', 'source': 'Proteomes', 'id': 'UP000005640'}]}, {'referenceNumber': 4, 'citation': {'id': 'CI-20IMBF1U3E5V5', 
'citationType': 'submission', 'authoringGroup': ['Ensembl'], 'publicationDate': 'DEC-2024', 'submissionDatabase': 'UniProtKB'}, 'referencePositions': ['IDENTIFICATION'], 'evidences': [{'evidenceCode': 'ECO:0000313', 'source': 'Ensembl', 'id': 'ENSP00000510773.1'}]}], 'uniProtKBCrossReferences': [{'database': 'EMBL', 'id': 'AC004106', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'AC096510', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'EMBL', 'id': 'KF458988', 'properties': [{'key': 'ProteinId', 'value': '-'}, {'key': 'Status', 'value': 'NOT_ANNOTATED_CDS'}, {'key': 'MoleculeType', 'value': 'Genomic_DNA'}]}, {'database': 'RefSeq', 'id': 'XP_011543891.2', 'properties': [{'key': 'NucleotideSequenceId', 'value': 'XM_011545589.2'}]}, {'database': 'SMR', 'id': 'A0A8I5KSD0', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'PeptideAtlas', 'id': 'A0A8I5KSD0', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Ensembl', 'id': 'ENST00000684799.1', 'properties': [{'key': 'ProteinId', 'value': 'ENSP00000510773.1'}, {'key': 'GeneId', 'value': 'ENSG00000169249.14'}]}, {'database': 'GeneID', 'id': '8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'CTD', 'id': '8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'DisGeNET', 'id': '8233', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'HGNC', 'id': 'HGNC:23019', 'properties': [{'key': 'GeneName', 'value': 'ZRSR2'}]}, {'database': 'GeneTree', 'id': 'ENSGT00950000183152', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'OrthoDB', 'id': '75923at2759', 'properties': [{'key': 'Description', 'value': '-'}]}, {'database': 'Proteomes', 'id': 'UP000005640', 
'properties': [{'key': 'Component', 'value': 'Chromosome X'}]}, {'database': 'GO', 'id': 'GO:0089701', 'properties': [{'key': 'GoTerm', 'value': 'C:U2AF complex'}, {'key': 'GoEvidenceType', 'value': 'IEA:InterPro'}]}, {'database': 'GO', 'id': 'GO:0003723', 'properties': [{'key': 'GoTerm', 'value': 'F:RNA binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:UniProtKB-UniRule'}]}, {'database': 'GO', 'id': 'GO:0008270', 'properties': [{'key': 'GoTerm', 'value': 'F:zinc ion binding'}, {'key': 'GoEvidenceType', 'value': 'IEA:UniProtKB-KW'}]}, {'database': 'GO', 'id': 'GO:0000398', 'properties': [{'key': 'GoTerm', 'value': 'P:mRNA splicing, via spliceosome'}, {'key': 'GoEvidenceType', 'value': 'IEA:InterPro'}]}, {'database': 'CDD', 'id': 'cd12540', 'properties': [{'key': 'EntryName', 'value': 'RRM_U2AFBPL'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'FunFam', 'id': '3.30.70.330:FF:000209', 'properties': [{'key': 'EntryName', 'value': 'U2 small nuclear ribonucleoprotein auxiliary factor 35 kDa subunit-related protein 2'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Gene3D', 'id': '3.30.70.330', 'properties': [{'key': 'EntryName', 'value': '-'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'InterPro', 'id': 'IPR012677', 'properties': [{'key': 'EntryName', 'value': 'Nucleotide-bd_a/b_plait_sf'}]}, {'database': 'InterPro', 'id': 'IPR035979', 'properties': [{'key': 'EntryName', 'value': 'RBD_domain_sf'}]}, {'database': 'InterPro', 'id': 'IPR000504', 'properties': [{'key': 'EntryName', 'value': 'RRM_dom'}]}, {'database': 'InterPro', 'id': 'IPR003954', 'properties': [{'key': 'EntryName', 'value': 'RRM_dom_euk'}]}, {'database': 'InterPro', 'id': 'IPR009145', 'properties': [{'key': 'EntryName', 'value': 'U2AF_small'}]}, {'database': 'InterPro', 'id': 'IPR000571', 'properties': [{'key': 'EntryName', 'value': 'Znf_CCCH'}]}, {'database': 'PANTHER', 'id': 'PTHR12620', 'properties': [{'key': 'EntryName', 'value': 'U2 SNRNP AUXILIARY FACTOR, SMALL SUBUNIT'}, 
{'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF00076', 'properties': [{'key': 'EntryName', 'value': 'RRM_1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'Pfam', 'id': 'PF00642', 'properties': [{'key': 'EntryName', 'value': 'zf-CCCH'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PRINTS', 'id': 'PR01848', 'properties': [{'key': 'EntryName', 'value': 'U2AUXFACTOR'}]}, {'database': 'SMART', 'id': 'SM00361', 'properties': [{'key': 'EntryName', 'value': 'RRM_1'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'SMART', 'id': 'SM00356', 'properties': [{'key': 'EntryName', 'value': 'ZnF_C3H1'}, {'key': 'MatchStatus', 'value': '2'}]}, {'database': 'SUPFAM', 'id': 'SSF54928', 'properties': [{'key': 'EntryName', 'value': 'RNA-binding domain, RBD'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50102', 'properties': [{'key': 'EntryName', 'value': 'RRM'}, {'key': 'MatchStatus', 'value': '1'}]}, {'database': 'PROSITE', 'id': 'PS50103', 'properties': [{'key': 'EntryName', 'value': 'ZF_C3H1'}, {'key': 'MatchStatus', 'value': '2'}]}], 'sequence': {'value': 'MTFLCFLRALPASLVALCKEEEEDTFIEEQQLEEEKLLERERQRLHEEWLLREQKAQEEFRIKKEKEEAAKKRQEEQERKLKEQWEEQQRKEREEEEQKRQEKKEKEEALQKMLDQAENELENGTTWQNPEPPVDFRVMEKDRANCPFYSKTGACRFGDRCSRKHNFPTSSPTLLIKSMFTTFGMEQCRRDDYDPDASLEYSEEETYQQFLDFYEDVLPEFKNVGKVIQFKVSCNLEPHLRGNVYVQYQSEEECQAALSLFNGRWYAGRQLQCEFCPVTRWKMAICGLFEIQQCPRGKHCNFLHVFRNPNNEFWEANRDIYLSPDRTGSSFGKNSERRERMGHHDDYYSRLRGRRNPSPDHSYKRNGESERKSSRHRGKKSHKRTSKSRERHNSRSRGRNRDRSRDRSRGRGSRSRSRSRSRRSRRSRSQSSSRSRSRGRRRSETGSCYVAQTGGQWLFT', 'length': 460, 'molWeight': 55164, 'crc64': '767508C34EFA4466', 'md5': '778986CF61A1A953A5B1D7F6E9248A50'}, 'extraAttributes': {'countByFeatureType': {'Signal': 1, 'Chain': 1, 'Domain': 3, 'Zinc finger': 2, 'Region': 2, 'Compositional bias': 5}, 'uniParcId': 'UPI0007DC7599'}}}]}
Store the mapping results in a dictionary¶
Each key is a gene ID and each value is a nested dictionary with the keys "description" and "sequence".
dic_gene_id_to_descp_seq = {}
for result in mapping_results['results']:
    # Keep only the manually reviewed (Swiss-Prot) entries
    if result['to']['entryType'] == 'UniProtKB reviewed (Swiss-Prot)':
        # Default to an empty description so that an entry without a FUNCTION
        # comment does not reuse the description of the previous entry
        description = ''
        for comment in result['to'].get('comments', []):
            if comment['commentType'] == 'FUNCTION':
                for text in comment['texts']:
                    description = text['value']
        dic_gene_id_to_descp_seq[result['from']] = {
            'description': description,
            'sequence': result['to']['sequence']['value'],
        }
# Display the contents of the dictionary
for gene_id, descp_seq in dic_gene_id_to_descp_seq.items():
    print(f"Gene ID: {gene_id}")
    print(f"Description: {descp_seq['description']}")
    print(f"Sequence: {descp_seq['sequence']}")
    print()
Gene ID: 9796 Description: Its interaction with PHYH suggests a role in the development of the central system Sequence: MELLSTPHSIEINNITCDSFRISWAMEDSDLERVTHYFIDLNKKENKNSNKFKHRDVPTKLVAKAVPLPMTVRGHWFLSPRTEYSVAVQTAVKQSDGEYLVSGWSETVEFCTGDYAKEHLAQLQEKAEQIAGRMLRFSVFYRNHHKEYFQHARTHCGNMLQPYLKDNSGSHGSPTSGMLHGVFFSCNTEFNTGQPPQDSPYGRWRFQIPAQRLFNPSTNLYFADFYCMYTAYHYAILVLAPKGSLGDRFCRDRLPLLDIACNKFLTCSVEDGELVFRHAQDLILEIIYTEPVDLSLGTLGEISGHQLMSLSTADAKKDPSCKTCNISVGR Gene ID: 56992 Description: Plus-end directed kinesin-like motor enzyme involved in mitotic spindle assembly Sequence: MAPGCKTELRSVTNGQSNQPSNEGDAIKVFVRIRPPAERSGSADGEQNLCLSVLSSTSLRLHSNPEPKTFTFDHVADVDTTQESVFATVAKSIVESCMSGYNGTIFAYGQTGSGKTFTMMGPSESDNFSHNLRGVIPRSFEYLFSLIDREKEKAGAGKSFLCKCSFIEIYNEQIYDLLDSASAGLYLREHIKKGVFVVGAVEQVVTSAAEAYQVLSGGWRNRRVASTSMNRESSRSHAVFTITIESMEKSNEIVNIRTSLLNLVDLAGSERQKDTHAEGMRLKEAGNINRSLSCLGQVITALVDVGNGKQRHVCYRDSKLTFLLRDSLGGNAKTAIIANVHPGSRCFGETLSTLNFAQRAKLIKNKAVVNEDTQGNVSQLQAEVKRLKEQLAELASGQTPPESFLTRDKKKTNYMEYFQEAMLFFKKSEQEKKSLIEKVTQLEDLTLKKEKFIQSNKMIVKFREDQIIRLEKLHKESRGGFLPEEQDRLLSELRNEIQTLREQIEHHPRVAKYAMENHSLREENRRLRLLEPVKRAQEMDAQTIAKLEKAFSEISGMEKSDKNQQGFSPKAQKEPCLFANTEKLKAQLLQIQTELNNSKQEYEEFKELTRKRQLELESELQSLQKANLNLENLLEATKACKRQEVSQLNKIHAETLKIITTPTKAYQLHSRPVPKLSPEMGSFGSLYTQNSSILDNDILNEPVPPEMNEQAFEAISEELRTVQEQMSALQAKLDEEEHKNLKLQQHVDKLEHHSTQMQELFSSERIDWTKQQEELLSQLNVLEKQLQETQTKNDFLKSEVHDLRVVLHSADKELSSVKLEYSSFKTNQEKEFNKLSERHMHVQLQLDNLRLENEKLLESKACLQDSYDNLQEIMKFEIDQLSRNLQNFKKENETLKSDLNNLMELLEAEKERNNKLSLQFEEDKENSSKEILKVLEAVRQEKQKETAKCEQQMAKVQKLEESLLATEKVISSLEKSRDSDKKVVADLMNQIQELRTSVCEKTETIDTLKQELKDINCKYNSALVDREESRVLIKKQEVDILDLKETLRLRILSEDIERDMLCEDLAHATEQLNMLTEASKKHSGLLQSAQEELTKKEALIQELQHKLNQKKEEVEQKKNEYNFKMRQLEHVMDSAAEDPQSPKTPPHFQTHLAKLLETQEQEIEDGRASKTSLEHLVTKLNEDREVKNAEILRMKEQLREMENLRLESQQLIEKNWLLQGQLDDIKRQKENSDQNHPDNQQLKNEQEESIKERLAKSKIVEEMLKMKADLEEVQSALYNKEMECLRMTDEVERTQTLESKAFQEKEQLRSKLEEMYEERERTSQEMEMLRKQVECLAEENGKLVGHQNLHQKIQYVVRLKKENVRLAEETEKLRAENVFLKEKKRSES Gene ID: 7918 Description: Plus-end 
directed kinesin-like motor enzyme involved in mitotic spindle assembly Sequence: MSRPLLITFTPATDPSDLWKDGQQQPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRERPPRVATLSWREERRREEKDRAWERDLRTYMNLEF Gene ID: 9240 Description: Plus-end directed kinesin-like motor enzyme involved in mitotic spindle assembly Sequence: MAMTLLEDWCRGMDVNSQRALLVWGIPVNCDEAEIEETLQAAMPQVSYRMLGRMFWREENAKAALLELTGAVDYAAIPREMPGKGGVWKVLFKPPTSDAEFLERLHLFLAREGWTVQDVARVLGFQNPTPTPGPEMPAEMLNYILDNVIQPLVESIWYKRLTLFSGRDIPGPGEETFDPWLEHTNEVLEEWQVSDVEKRRRLMESLRGPAADVIRILKSNNPAITTAECLKALEQVFGSVESSRDAQIKFLNTYQNPGEKLSAYVIRLEPLLQKVVEKGAIDKDNVNQARLEQVIAGANHSGAIRRQLWLTGAGEGPAPNLFQLLVQIREEEAKEEEEEAEATLLQLGLEGHF Gene ID: 8233 Description: Pre-mRNA-binding protein required for splicing of both U2- and U12-type introns. Selectively interacts with the 3'-splice site of U2- and U12-type pre-mRNAs and promotes different steps in U2 and U12 intron splicing. Recruited to U12 pre-mRNAs in an ATP-dependent manner and is required for assembly of the pre-spliceosome, a precursor to other spliceosomal complexes. For U2-type introns, it is selectively and specifically required for the second step of splicing Sequence: MAAPEKMTFPEKPSHKKYRAALKKEKRKKRRQELARLRDSGLSQKEEEEDTFIEEQQLEEEKLLERERQRLHEEWLLREQKAQEEFRIKKEKEEAAKKRQEEQERKLKEQWEEQQRKEREEEEQKRQEKKEKEEALQKMLDQAENELENGTTWQNPEPPVDFRVMEKDRANCPFYSKTGACRFGDRCSRKHNFPTSSPTLLIKSMFTTFGMEQCRRDDYDPDASLEYSEEETYQQFLDFYEDVLPEFKNVGKVIQFKVSCNLEPHLRGNVYVQYQSEEECQAALSLFNGRWYAGRQLQCEFCPVTRWKMAICGLFEIQQCPRGKHCNFLHVFRNPNNEFWEANRDIYLSPDRTGSSFGKNSERRERMGHHDDYYSRLRGRRNPSPDHSYKRNGESERKSSRHRGKKSHKRTSKSRERHNSRSRGRNRDRSRDRSRGRGSRSRSRSRSRRSRRSRSQSSSRSRSRGRRRSGNRDRTVQSPKSK
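The extraction step above can be sketched as a standalone function and tried on a small mock response, without calling the UniProt service. The helper name `extract_description_and_sequence` and the mock values are hypothetical; only the field names (`entryType`, `comments`, `commentType`, `texts`, `sequence`) are taken from the mapping results shown above. This sketch chooses to return an empty description when an entry has no FUNCTION comment.

```python
def extract_description_and_sequence(entry):
    """Return (description, sequence) for one UniProtKB entry dictionary.

    The description is the text of the last FUNCTION comment, or an empty
    string when the entry has no FUNCTION comment.
    """
    description = ""
    for comment in entry.get("comments", []):
        if comment.get("commentType") == "FUNCTION":
            for text in comment.get("texts", []):
                description = text["value"]
    sequence = entry["sequence"]["value"]
    return description, sequence


# Mock response that mimics only the fields used above (all values are made up)
mock_results = {
    "results": [
        {
            "from": "1",
            "to": {
                "entryType": "UniProtKB reviewed (Swiss-Prot)",
                "comments": [
                    {"commentType": "FUNCTION",
                     "texts": [{"value": "Example function"}]}
                ],
                "sequence": {"value": "MKT"},
            },
        },
        {
            "from": "2",
            "to": {
                "entryType": "UniProtKB reviewed (Swiss-Prot)",
                "comments": [],
                "sequence": {"value": "MAA"},
            },
        },
        {
            "from": "2",
            "to": {
                "entryType": "UniProtKB unreviewed (TrEMBL)",
                "comments": [],
                "sequence": {"value": "MTF"},
            },
        },
    ]
}

gene_to_info = {}
for result in mock_results["results"]:
    entry = result["to"]
    # Skip the unreviewed entries, as in the cell above
    if entry["entryType"] != "UniProtKB reviewed (Swiss-Prot)":
        continue
    description, sequence = extract_description_and_sequence(entry)
    gene_to_info[result["from"]] = {"description": description,
                                    "sequence": sequence}

print(gene_to_info)
```

Keeping the extraction in one function makes it easy to check how entries without a FUNCTION comment are handled before running it on the full mapping results.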
Most biomedical graphs provide gene names, so you can also query sequences and descriptions by gene name using the EnrichmentWithUniProt class¶
# Create a single instance of the EnrichmentWithUniProt class and reuse it
enrich_uniprot = EnrichmentWithUniProt()
for gene_id in inputs:
    # Get the gene name of the gene ID
    gene_name = dic_gene_ids[gene_id].name
    print (f"Gene name: {gene_name}")
    # Get the description and sequence for the gene name
    # (each is returned as a list with one element per queried name)
    description, sequence = enrich_uniprot.enrich_documents([gene_name])
    # setdefault also covers gene IDs that had no reviewed entry above
    entry = dic_gene_id_to_descp_seq.setdefault(gene_id, {})
    entry['description'] = description
    entry['sequence'] = sequence
    print (f"Gene name: {gene_name}\nDescription: {description}\nSequence: {sequence}")
Gene name: PHYHIP Gene name: PHYHIP Description: ['Its interaction with PHYH suggests a role in the development of the central system'] Sequence: ['MELLSTPHSIEINNITCDSFRISWAMEDSDLERVTHYFIDLNKKENKNSNKFKHRDVPTKLVAKAVPLPMTVRGHWFLSPRTEYSVAVQTAVKQSDGEYLVSGWSETVEFCTGDYAKEHLAQLQEKAEQIAGRMLRFSVFYRNHHKEYFQHARTHCGNMLQPYLKDNSGSHGSPTSGMLHGVFFSCNTEFNTGQPPQDSPYGRWRFQIPAQRLFNPSTNLYFADFYCMYTAYHYAILVLAPKGSLGDRFCRDRLPLLDIACNKFLTCSVEDGELVFRHAQDLILEIIYTEPVDLSLGTLGEISGHQLMSLSTADAKKDPSCKTCNISVGR'] Gene name: KIF15 Gene name: KIF15 Description: ['Plus-end directed kinesin-like motor enzyme involved in mitotic spindle assembly'] Sequence: ['MAPGCKTELRSVTNGQSNQPSNEGDAIKVFVRIRPPAERSGSADGEQNLCLSVLSSTSLRLHSNPEPKTFTFDHVADVDTTQESVFATVAKSIVESCMSGYNGTIFAYGQTGSGKTFTMMGPSESDNFSHNLRGVIPRSFEYLFSLIDREKEKAGAGKSFLCKCSFIEIYNEQIYDLLDSASAGLYLREHIKKGVFVVGAVEQVVTSAAEAYQVLSGGWRNRRVASTSMNRESSRSHAVFTITIESMEKSNEIVNIRTSLLNLVDLAGSERQKDTHAEGMRLKEAGNINRSLSCLGQVITALVDVGNGKQRHVCYRDSKLTFLLRDSLGGNAKTAIIANVHPGSRCFGETLSTLNFAQRAKLIKNKAVVNEDTQGNVSQLQAEVKRLKEQLAELASGQTPPESFLTRDKKKTNYMEYFQEAMLFFKKSEQEKKSLIEKVTQLEDLTLKKEKFIQSNKMIVKFREDQIIRLEKLHKESRGGFLPEEQDRLLSELRNEIQTLREQIEHHPRVAKYAMENHSLREENRRLRLLEPVKRAQEMDAQTIAKLEKAFSEISGMEKSDKNQQGFSPKAQKEPCLFANTEKLKAQLLQIQTELNNSKQEYEEFKELTRKRQLELESELQSLQKANLNLENLLEATKACKRQEVSQLNKIHAETLKIITTPTKAYQLHSRPVPKLSPEMGSFGSLYTQNSSILDNDILNEPVPPEMNEQAFEAISEELRTVQEQMSALQAKLDEEEHKNLKLQQHVDKLEHHSTQMQELFSSERIDWTKQQEELLSQLNVLEKQLQETQTKNDFLKSEVHDLRVVLHSADKELSSVKLEYSSFKTNQEKEFNKLSERHMHVQLQLDNLRLENEKLLESKACLQDSYDNLQEIMKFEIDQLSRNLQNFKKENETLKSDLNNLMELLEAEKERNNKLSLQFEEDKENSSKEILKVLEAVRQEKQKETAKCEQQMAKVQKLEESLLATEKVISSLEKSRDSDKKVVADLMNQIQELRTSVCEKTETIDTLKQELKDINCKYNSALVDREESRVLIKKQEVDILDLKETLRLRILSEDIERDMLCEDLAHATEQLNMLTEASKKHSGLLQSAQEELTKKEALIQELQHKLNQKKEEVEQKKNEYNFKMRQLEHVMDSAAEDPQSPKTPPHFQTHLAKLLETQEQEIEDGRASKTSLEHLVTKLNEDREVKNAEILRMKEQLREMENLRLESQQLIEKNWLLQGQLDDIKRQKENSDQNHPDNQQLKNEQEESIKERLAKSKIVEEMLKMKADLEEVQSALYNKEMECLRMTDEVERTQTLESKAFQEKEQLRSKLEEMYEERERTSQEMEMLRKQVECLAEENGKLVGHQNLHQKIQYVVRLKKENVRLAEETEKLRA
ENVFLKEKKRSES'] Gene name: GPANK1 Gene name: GPANK1 Description: [''] Sequence: ['MSRPLLITFTPATDPSDLWKDGQQQPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRERPPRVATLSWREERRREEKDRAWERDLRTYMNLEF'] Gene name: PNMA1 Gene name: PNMA1 Description: [''] Sequence: ['MAMTLLEDWCRGMDVNSQRALLVWGIPVNCDEAEIEETLQAAMPQVSYRMLGRMFWREENAKAALLELTGAVDYAAIPREMPGKGGVWKVLFKPPTSDAEFLERLHLFLAREGWTVQDVARVLGFQNPTPTPGPEMPAEMLNYILDNVIQPLVESIWYKRLTLFSGRDIPGPGEETFDPWLEHTNEVLEEWQVSDVEKRRRLMESLRGPAADVIRILKSNNPAITTAECLKALEQVFGSVESSRDAQIKFLNTYQNPGEKLSAYVIRLEPLLQKVVEKGAIDKDNVNQARLEQVIAGANHSGAIRRQLWLTGAGEGPAPNLFQLLVQIREEEAKEEEEEAEATLLQLGLEGHF'] Gene name: ZRSR2 Gene name: ZRSR2 Description: ["Pre-mRNA-binding protein required for splicing of both U2- and U12-type introns. Selectively interacts with the 3'-splice site of U2- and U12-type pre-mRNAs and promotes different steps in U2 and U12 intron splicing. Recruited to U12 pre-mRNAs in an ATP-dependent manner and is required for assembly of the pre-spliceosome, a precursor to other spliceosomal complexes. For U2-type introns, it is selectively and specifically required for the second step of splicing"] Sequence: ['MAAPEKMTFPEKPSHKKYRAALKKEKRKKRRQELARLRDSGLSQKEEEEDTFIEEQQLEEEKLLERERQRLHEEWLLREQKAQEEFRIKKEKEEAAKKRQEEQERKLKEQWEEQQRKEREEEEQKRQEKKEKEEALQKMLDQAENELENGTTWQNPEPPVDFRVMEKDRANCPFYSKTGACRFGDRCSRKHNFPTSSPTLLIKSMFTTFGMEQCRRDDYDPDASLEYSEEETYQQFLDFYEDVLPEFKNVGKVIQFKVSCNLEPHLRGNVYVQYQSEEECQAALSLFNGRWYAGRQLQCEFCPVTRWKMAICGLFEIQQCPRGKHCNFLHVFRNPNNEFWEANRDIYLSPDRTGSSFGKNSERRERMGHHDDYYSRLRGRRNPSPDHSYKRNGESERKSSRHRGKKSHKRTSKSRERHNSRSRGRNRDRSRDRSRGRGSRSRSRSRSRRSRRSRSQSSSRSRSRGRRRSGNRDRTVQSPKSK']
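As the output above shows, the description and the sequence come back as lists with one element per queried gene name. If plain strings are preferred as node attributes, the lists can be unwrapped before they are stored. The helper `first_or_default` below is a hypothetical name used only for illustration, and the values are made up.

```python
def first_or_default(values, default=""):
    """Return the first element of a list, or `default` for an empty list."""
    return values[0] if values else default


# Lists shaped like the output above: one element per queried gene name
description = ["Example function text"]
sequence = ["MKT"]

record = {
    "description": first_or_default(description),
    "sequence": first_or_default(sequence),
}
print(record)

# An empty result falls back to the default instead of raising IndexError
print(repr(first_or_default([])))
```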
Map the description and sequence from the dictionary to their corresponding nodes in the graph¶
from tqdm import tqdm
for node in tqdm(kg.nodes):
    if kg.nodes[node].get('node_type') != 'gene/protein':
        continue
    gene_id = kg.nodes[node].get('node_id')
    # Ignore the genes/proteins without description
    if gene_id not in dic_gene_id_to_descp_seq:
        continue
    description = dic_gene_id_to_descp_seq[gene_id]['description']
    sequence = dic_gene_id_to_descp_seq[gene_id]['sequence']
    print (f"node: {node}, gene ID: {gene_id}, description: {description}, sequence: {sequence}")
    G.add_nodes_from([(node, {'description': description, 'sequence': sequence})])
# Recompose the graph
kg = nx.compose(G, kg)
100%|██████████| 129262/129262 [00:00<00:00, 1531104.56it/s]
node: PHYHIP, gene ID: 9796, description: ['Its interaction with PHYH suggests a role in the development of the central system'], sequence: ['MELLSTPHSIEINNITCDSFRISWAMEDSDLERVTHYFIDLNKKENKNSNKFKHRDVPTKLVAKAVPLPMTVRGHWFLSPRTEYSVAVQTAVKQSDGEYLVSGWSETVEFCTGDYAKEHLAQLQEKAEQIAGRMLRFSVFYRNHHKEYFQHARTHCGNMLQPYLKDNSGSHGSPTSGMLHGVFFSCNTEFNTGQPPQDSPYGRWRFQIPAQRLFNPSTNLYFADFYCMYTAYHYAILVLAPKGSLGDRFCRDRLPLLDIACNKFLTCSVEDGELVFRHAQDLILEIIYTEPVDLSLGTLGEISGHQLMSLSTADAKKDPSCKTCNISVGR'] node: KIF15, gene ID: 56992, description: ['Plus-end directed kinesin-like motor enzyme involved in mitotic spindle assembly'], sequence: ['MAPGCKTELRSVTNGQSNQPSNEGDAIKVFVRIRPPAERSGSADGEQNLCLSVLSSTSLRLHSNPEPKTFTFDHVADVDTTQESVFATVAKSIVESCMSGYNGTIFAYGQTGSGKTFTMMGPSESDNFSHNLRGVIPRSFEYLFSLIDREKEKAGAGKSFLCKCSFIEIYNEQIYDLLDSASAGLYLREHIKKGVFVVGAVEQVVTSAAEAYQVLSGGWRNRRVASTSMNRESSRSHAVFTITIESMEKSNEIVNIRTSLLNLVDLAGSERQKDTHAEGMRLKEAGNINRSLSCLGQVITALVDVGNGKQRHVCYRDSKLTFLLRDSLGGNAKTAIIANVHPGSRCFGETLSTLNFAQRAKLIKNKAVVNEDTQGNVSQLQAEVKRLKEQLAELASGQTPPESFLTRDKKKTNYMEYFQEAMLFFKKSEQEKKSLIEKVTQLEDLTLKKEKFIQSNKMIVKFREDQIIRLEKLHKESRGGFLPEEQDRLLSELRNEIQTLREQIEHHPRVAKYAMENHSLREENRRLRLLEPVKRAQEMDAQTIAKLEKAFSEISGMEKSDKNQQGFSPKAQKEPCLFANTEKLKAQLLQIQTELNNSKQEYEEFKELTRKRQLELESELQSLQKANLNLENLLEATKACKRQEVSQLNKIHAETLKIITTPTKAYQLHSRPVPKLSPEMGSFGSLYTQNSSILDNDILNEPVPPEMNEQAFEAISEELRTVQEQMSALQAKLDEEEHKNLKLQQHVDKLEHHSTQMQELFSSERIDWTKQQEELLSQLNVLEKQLQETQTKNDFLKSEVHDLRVVLHSADKELSSVKLEYSSFKTNQEKEFNKLSERHMHVQLQLDNLRLENEKLLESKACLQDSYDNLQEIMKFEIDQLSRNLQNFKKENETLKSDLNNLMELLEAEKERNNKLSLQFEEDKENSSKEILKVLEAVRQEKQKETAKCEQQMAKVQKLEESLLATEKVISSLEKSRDSDKKVVADLMNQIQELRTSVCEKTETIDTLKQELKDINCKYNSALVDREESRVLIKKQEVDILDLKETLRLRILSEDIERDMLCEDLAHATEQLNMLTEASKKHSGLLQSAQEELTKKEALIQELQHKLNQKKEEVEQKKNEYNFKMRQLEHVMDSAAEDPQSPKTPPHFQTHLAKLLETQEQEIEDGRASKTSLEHLVTKLNEDREVKNAEILRMKEQLREMENLRLESQQLIEKNWLLQGQLDDIKRQKENSDQNHPDNQQLKNEQEESIKERLAKSKIVEEMLKMKADLEEVQSALYNKEMECLRMTDEVERTQTLESKAFQEKEQLRSKLEEMYEERERTSQEMEMLRKQVECLAEENGKLVGHQNLHQKIQYVVRLKKENVRLAEETEKLRAENVFLKEKKR
SES'] node: GPANK1, gene ID: 7918, description: [''], sequence: ['MSRPLLITFTPATDPSDLWKDGQQQPQPEKPESTLDGAAARAFYEALIGDESSAPDSQRSQTEPARERKRKKRRIMKAPAAEAVAEGASGRHGQGRSLEAEDKMTHRILRAAQEGDLPELRRLLEPHEAGGAGGNINARDAFWWTPLMCAARAGQGAAVSYLLGRGAAWVGVCELSGRDAAQLAEEAGFPEVARMVRESHGETRSPENRSPTPSLQYCENCDTHFQDSNHRTSTAHLLSLSQGPQPPNLPLGVPISSPGFKLLLRGGWEPGMGLGPRGEGRANPIPTVLKRDQEGLGYRSAPQPRVTHFPAWDTRAVAGRERPPRVATLSWREERRREEKDRAWERDLRTYMNLEF'] node: PNMA1, gene ID: 9240, description: [''], sequence: ['MAMTLLEDWCRGMDVNSQRALLVWGIPVNCDEAEIEETLQAAMPQVSYRMLGRMFWREENAKAALLELTGAVDYAAIPREMPGKGGVWKVLFKPPTSDAEFLERLHLFLAREGWTVQDVARVLGFQNPTPTPGPEMPAEMLNYILDNVIQPLVESIWYKRLTLFSGRDIPGPGEETFDPWLEHTNEVLEEWQVSDVEKRRRLMESLRGPAADVIRILKSNNPAITTAECLKALEQVFGSVESSRDAQIKFLNTYQNPGEKLSAYVIRLEPLLQKVVEKGAIDKDNVNQARLEQVIAGANHSGAIRRQLWLTGAGEGPAPNLFQLLVQIREEEAKEEEEEAEATLLQLGLEGHF'] node: ZRSR2, gene ID: 8233, description: ["Pre-mRNA-binding protein required for splicing of both U2- and U12-type introns. Selectively interacts with the 3'-splice site of U2- and U12-type pre-mRNAs and promotes different steps in U2 and U12 intron splicing. Recruited to U12 pre-mRNAs in an ATP-dependent manner and is required for assembly of the pre-spliceosome, a precursor to other spliceosomal complexes. For U2-type introns, it is selectively and specifically required for the second step of splicing"], sequence: ['MAAPEKMTFPEKPSHKKYRAALKKEKRKKRRQELARLRDSGLSQKEEEEDTFIEEQQLEEEKLLERERQRLHEEWLLREQKAQEEFRIKKEKEEAAKKRQEEQERKLKEQWEEQQRKEREEEEQKRQEKKEKEEALQKMLDQAENELENGTTWQNPEPPVDFRVMEKDRANCPFYSKTGACRFGDRCSRKHNFPTSSPTLLIKSMFTTFGMEQCRRDDYDPDASLEYSEEETYQQFLDFYEDVLPEFKNVGKVIQFKVSCNLEPHLRGNVYVQYQSEEECQAALSLFNGRWYAGRQLQCEFCPVTRWKMAICGLFEIQQCPRGKHCNFLHVFRNPNNEFWEANRDIYLSPDRTGSSFGKNSERRERMGHHDDYYSRLRGRRNPSPDHSYKRNGESERKSSRHRGKKSHKRTSKSRERHNSRSRGRNRDRSRDRSRGRGSRSRSRSRSRRSRRSRSQSSSRSRSRGRRRSGNRDRTVQSPKSK']
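The recompose step relies on how `nx.compose(G, H)` merges node attributes: the result contains the nodes and edges of both graphs, and for a node present in both, the attribute dictionaries are merged, with the values from the second graph taking precedence on conflicts. A minimal sketch on a toy graph (node names and attribute values are made up):

```python
import networkx as nx

# Toy knowledge graph with the original node attributes
kg = nx.DiGraph()
kg.add_node("A", node_type="gene/protein", node_id="1")
kg.add_node("B", node_type="disease", node_id="2")
kg.add_edge("A", "B")

# Graph of the same class that carries the new attributes,
# plus one attribute that conflicts with the knowledge graph
G = nx.DiGraph()
G.add_nodes_from([("A", {"description": "Example function",
                         "sequence": "MKT",
                         "node_id": "stale"})])

merged = nx.compose(G, kg)

# New attributes are added; on conflict the second argument (kg) wins
print(merged.nodes["A"])
print(list(merged.edges))
```

One consequence worth keeping in mind: because the second argument wins, an attribute that already exists in `kg` is not overwritten by a different value in `G`. If overwriting is intended, `nx.set_node_attributes(kg, {...})` updates the attributes in place.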
Check device availability¶
device = "cuda:0" if torch.cuda.is_available() else "cpu"
device
'cuda:0'
Load the ESM2 model¶
emb_model = EmbeddingWithHuggingFace(model_name='facebook/esm2_t6_8M_UR50D',
                                     model_cache_dir="../../../../data/facebook/esm2_t6_8M_UR50D/",
                                     truncation=False,
                                     device=device)
Some weights of EsmModel were not initialized from the model checkpoint at facebook/esm2_t6_8M_UR50D and are newly initialized: ['esm.pooler.dense.bias', 'esm.pooler.dense.weight'] You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Generate sequence embeddings and add them to the graph as a new attribute "sequence_embedding"¶
# Embeddings using 1 sample at a time
for node in tqdm(kg.nodes):
    if kg.nodes[node].get('node_type') != 'gene/protein':
        continue
    seq = kg.nodes[node].get('sequence')
    # Skip the nodes without a sequence
    if seq is None:
        continue
    # A single sequence is embedded, so only the first output is stored
    outputs = emb_model.embed_documents([seq])
    G.add_nodes_from([(node, {'sequence_embedding': outputs[0]})])
    # Free the GPU memory after each sample (only when CUDA is available)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
        torch.cuda.empty_cache()
# Recompose the graph
kg = nx.compose(G, kg)
100%|██████████| 129262/129262 [00:05<00:00, 21864.09it/s]
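The loop above embeds one sequence per call. When many nodes carry a sequence, grouping the inputs into small batches is a common alternative. The sketch below shows only the batching logic; `fake_embed` is a hypothetical stand-in for `emb_model.embed_documents`, and it is an assumption that the real method accepts several documents per call and returns one embedding per document.

```python
def batched(items, batch_size):
    """Yield successive lists of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_in_batches(texts, embed_fn, batch_size=8):
    """Embed `texts` batch by batch and return one embedding per text."""
    embeddings = []
    for batch in batched(texts, batch_size):
        embeddings.extend(embed_fn(batch))
    return embeddings


# Hypothetical stand-in for the embedding model: one vector per input,
# here simply the length of the sequence
def fake_embed(batch):
    return [[float(len(text))] for text in batch]


sequences = ["MKT", "MAAPE", "M", "MELLST", "MA"]
embeddings = embed_in_batches(sequences, fake_embed, batch_size=2)
print(embeddings)
```

The batch size trades speed for memory: long protein sequences may still need a batch size of 1 on a small GPU, which is what the loop above does.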
Protein embedding¶
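Load the BioBERT model (the microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract checkpoint) to embed the textual descriptions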
# Using MSFT's BioBERT
emb_model = EmbeddingWithHuggingFace(model_name='microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract',
                                     model_cache_dir="../../../../data/microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract/",
                                     truncation=False,
                                     device=device)
Generate description embeddings and add them to the graph as a new attribute "description_embedding"¶
for i, node in tqdm(enumerate(kg.nodes)):
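    if kg.nodes[node].get('node_type') != 'gene/protein':
        continue
    desc = kg.nodes[node].get('description')
    # Skip the nodes without a description
    if desc is None:
        continue
    outputs = emb_model.embed_documents([desc])
    # Note: the whole list returned by embed_documents is stored here,
    # whereas the sequence embedding stores only its first element
    G.add_nodes_from([(node, {'description_embedding': outputs})])
    # Free the GPU memory after each sample (only when CUDA is available)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
        torch.cuda.empty_cache()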
# Recompose the graph
kg = nx.compose(G, kg)
129262it [00:00, 372240.20it/s]
Collect all the results so far in a DataFrame¶
import pandas as pd
dic = {'gene': [],
       'description': [],
       'sequence': [],
       'description_embedding': [],
       'sequence_embedding': []}
for node in tqdm(kg.nodes):
    if kg.nodes[node].get('node_type') != 'gene/protein':
        continue
    # Keep only the nodes that were enriched with a description
    if kg.nodes[node].get('description') is None:
        continue
    dic['gene'].append(node)
    dic['description'].append(kg.nodes[node].get('description'))
    dic['sequence'].append(kg.nodes[node].get('sequence'))
    dic['description_embedding'].append(kg.nodes[node].get('description_embedding'))
    dic['sequence_embedding'].append(kg.nodes[node].get('sequence_embedding'))
df = pd.DataFrame(dic)
df
100%|██████████| 129262/129262 [00:00<00:00, 1170116.51it/s]
| | gene | description | sequence | description_embedding | sequence_embedding |
---|---|---|---|---|---|
0 | PABPC1 | (Microbial infection) Positively regulates the... | MNPSAPSYPMASLYVGDLHPDVTEAMLYEKFSPAGPILSIRVCRDM... | [[tensor(-0.1369), tensor(-0.0554), tensor(0.0... | [tensor(0.0133), tensor(-0.0413), tensor(0.192... |
1 | GTPBP3 | GTPase component of the GTPBP3-MTO1 complex th... | MWRGLWTLAAQAARGPRRLCTRRSSGAPAPGSGATIFALSSGQGRC... | [[tensor(-0.0017), tensor(0.2555), tensor(0.45... | [tensor(-0.1717), tensor(-0.2150), tensor(-0.0... |
2 | SHOX2 | May be a growth regulator and have a role in s... | MEELTAFVSKSFDQKVKEKKEAITYREVLESGPLRGAKEPTGCTEA... | [[tensor(0.0310), tensor(0.3633), tensor(0.744... | [tensor(-0.0969), tensor(-0.2075), tensor(0.01... |
3 | ALDH16A1 | May be a growth regulator and have a role in s... | MAATRAGPRAREIFTSLEYGPVPESHACALAWLDTQDRCLGHYVNG... | [[tensor(0.0310), tensor(0.3633), tensor(0.744... | [tensor(-0.2420), tensor(-0.0727), tensor(0.10... |
4 | GMPR | Catalyzes the irreversible NADPH-dependent dea... | MPRIDADLKLDFKDVLLRPKRSSLKSRAEVDLERTFTFRNSKQTYS... | [[tensor(0.1503), tensor(-0.1371), tensor(0.33... | [tensor(0.0235), tensor(0.0571), tensor(0.0156... |
5 | INPP4B | Catalyzes the hydrolysis of the 4-position pho... | MEIKEEGASEEGQHFLPTAQANDPGDCQFTSIQKTPNEPQLEFILA... | [[tensor(0.1104), tensor(0.2020), tensor(0.234... | [tensor(-0.0336), tensor(-0.1216), tensor(0.02... |