Annotates variants in a MAF file with OncoKB™ annotations. Supports both Python 2 and Python 3.
MafAnnotator.py: A Mutation Annotation Format (MAF) file is a tab-delimited text file that lists mutations. MafAnnotator.py annotates the mutations in a MAF file according to the OncoKB™ Levels of Evidence rules.
MafAnnotator.py, FusionAnnotator.py and CnaAnnotator.py access OncoKB™ data through its web API, so you need an API token before running them. Please visit the OncoKB™ Data Access Page for information on how to register an account and get an OncoKB™ API token.
Once you have the token listed on the OncoKB™ Account Settings Page, pass it to the scripts with the -b option, as shown in the usage below.
Running python MafAnnotator.py -h in a terminal prints all command-line parameters, shown below. -i <input MAF file>, -o <output MAF file> and -b oncokb_api_bear_token are required.
MafAnnotator.py -i <input MAF file> -o <output MAF file> [-p previous results] [-c <input clinical file>] [-s sample list filter] [-t <default tumor type>] [-u oncokb-base-url] [-b oncokb_api_bear_token] [-a]
Essential MAF columns (case insensitive):
HUGO_SYMBOL: Hugo gene symbol
VARIANT_CLASSIFICATION: Translational effect of variant allele
TUMOR_SAMPLE_BARCODE: sample ID
AMINO_ACID_CHANGE: amino acid change
PROTEIN_START: protein start
PROTEIN_END: protein end
PROTEIN_POSITION: can be used instead of PROTEIN_START and PROTEIN_END (as in the output of vcf2maf)
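The column requirements above can be checked before sending anything to the API. The following is a minimal sketch, not part of the annotator scripts: the function name `missing_maf_columns` is hypothetical, and it assumes every listed column is needed, with PROTEIN_POSITION accepted in place of the PROTEIN_START and PROTEIN_END pair. The real script may be more lenient.

```python
# Column names are taken from the list above. Matching is case insensitive.
REQUIRED_COLUMNS = [
    "HUGO_SYMBOL",
    "VARIANT_CLASSIFICATION",
    "TUMOR_SAMPLE_BARCODE",
    "AMINO_ACID_CHANGE",
]


def missing_maf_columns(header_line):
    """Return the essential column names absent from a tab-delimited MAF header.

    Hypothetical helper for illustration only.
    """
    names = header_line.rstrip("\r\n").split("\t")
    present = {name.strip().upper() for name in names}

    missing = [name for name in REQUIRED_COLUMNS if name not in present]

    # PROTEIN_POSITION may stand in for the PROTEIN_START / PROTEIN_END pair.
    has_start_and_end = {"PROTEIN_START", "PROTEIN_END"} <= present
    if not has_start_and_end and "PROTEIN_POSITION" not in present:
        missing.append("PROTEIN_START/PROTEIN_END or PROTEIN_POSITION")
    return missing
```

An empty list means the header satisfies the requirements as listed here.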
Essential clinical columns:
SAMPLE_ID: sample ID
ONCOTREE_CODE: tumor type code from oncotree (oncotree.mskcc.org)
Cancer type will be assigned based on the following priority:
1) ONCOTREE_CODE in clinical data file
2) ONCOTREE_CODE column in the MAF file
3) default tumor type (-t)
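The priority order above amounts to a three-step fallback. The sketch below illustrates that logic only; `resolve_tumor_type` and its parameters are hypothetical names, and the OncoTree codes in the usage note are example values, not defaults of the script.

```python
def resolve_tumor_type(sample_id, clinical_codes, maf_code=None, default_code=None):
    """Pick a tumor type following the three-step priority described above.

    clinical_codes: dict mapping SAMPLE_ID to ONCOTREE_CODE (from the clinical file)
    maf_code:       ONCOTREE_CODE found on the MAF row, if that column exists
    default_code:   value passed with -t, if any
    """
    # 1) ONCOTREE_CODE in the clinical data file
    code = clinical_codes.get(sample_id)
    if code:
        return code
    # 2) ONCOTREE_CODE on the MAF row itself
    if maf_code:
        return maf_code
    # 3) default tumor type (-t); None when nothing was supplied
    return default_code
```

For example, `resolve_tumor_type("S1", {"S1": "LUAD"}, "BRCA", "MEL")` takes the clinical file value, while a sample missing from the clinical file falls back to the MAF value and then to the default.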
Default OncoKB™ base url is http://oncokb.org.
Use -a to annotate mutational hotspots
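When the annotator is called from another Python program, the options described in this section can be assembled into an argument list. This is a sketch under assumptions: `build_maf_annotator_command` is a hypothetical helper, the script is assumed to sit in the working directory, and only the flags shown in the usage line above are used.

```python
import sys


def build_maf_annotator_command(input_maf, output_maf, token,
                                clinical_file=None, default_tumor_type=None,
                                annotate_hotspots=False,
                                script="MafAnnotator.py"):
    """Build an argument list for MafAnnotator.py (hypothetical helper)."""
    # -i, -o and -b are the required parameters.
    cmd = [sys.executable, script, "-i", input_maf, "-o", output_maf, "-b", token]
    if clinical_file:
        cmd += ["-c", clinical_file]       # optional clinical data file
    if default_tumor_type:
        cmd += ["-t", default_tumor_type]  # optional default tumor type
    if annotate_hotspots:
        cmd.append("-a")                   # annotate mutational hotspots
    return cmd
```

The resulting list can be passed to `subprocess.check_call(cmd)`. Passing the token as a list element rather than through a shell string avoids quoting problems.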
Fusion Annotator
Running python FusionAnnotator.py -h in a terminal prints all command-line parameters, shown below. The required parameters are the same as for the MAF Annotator.
FusionAnnotator.py -i <input Fusion file> -o <output Fusion file> [-p previous results] [-c <input clinical file>] [-s sample list filter] [-t <default tumor type>] [-u oncokb-base-url] [-b oncokb_api_bear_token]
Essential Fusion columns (case insensitive):
HUGO_SYMBOL: Hugo gene symbol
VARIANT_CLASSIFICATION: Translational effect of variant allele
TUMOR_SAMPLE_BARCODE: sample ID
FUSION: fusion event, e.g. "TMPRSS2-ERG fusion"
Essential clinical columns:
SAMPLE_ID: sample ID
ONCOTREE_CODE: tumor type code from oncotree (oncotree.mskcc.org)
Cancer type will be assigned based on the following priority:
1) ONCOTREE_CODE in clinical data file
2) ONCOTREE_CODE column in the Fusion file
3) default tumor type (-t)
Default OncoKB™ base url is http://oncokb.org
CNA Annotator
Running python CnaAnnotator.py -h in a terminal prints all command-line parameters, shown below. The required parameters are the same as for the MAF Annotator.
CnaAnnotator.py -i <input CNA file> -o <output CNA file> [-p previous results] [-c <input clinical file>] [-s sample list filter] [-t <default tumor type>] [-u oncokb-base-url] [-b oncokb_api_bear_token]
The input CNA file should follow the GISTIC output format (https://cbioportal.readthedocs.io/en/latest/File-Formats.html#discrete-copy-number-data)
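To make the expected layout concrete, here is a minimal reader for a discrete copy number matrix. It assumes the cBioPortal layout: one row per gene, a Hugo_Symbol column, an optional Entrez_Gene_Id column, and one column per sample holding integer calls from -2 to 2. `iter_cna_calls` is a hypothetical helper; which calls CnaAnnotator.py itself acts on is not stated here, so the sketch simply yields every non-zero call.

```python
import csv


def iter_cna_calls(handle):
    """Yield (gene, sample_id, call) for each non-zero entry of a GISTIC-style matrix."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    upper = [name.strip().upper() for name in header]
    gene_index = upper.index("HUGO_SYMBOL")

    # Every column that is not a gene identifier is treated as a sample ID.
    id_columns = {"HUGO_SYMBOL", "ENTREZ_GENE_ID"}
    sample_columns = [(i, header[i].strip())
                      for i in range(len(header)) if upper[i] not in id_columns]

    for row in reader:
        if not row:
            continue  # skip blank lines
        gene = row[gene_index]
        for i, sample_id in sample_columns:
            value = row[i].strip() if i < len(row) else ""
            if value in ("", "NA"):
                continue  # missing call
            call = int(float(value))
            if call != 0:
                yield gene, sample_id, call
```

Wrapping an open file in this generator gives one record per gene and sample, which is closer to the per-alteration shape of the MAF and Fusion inputs.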
Essential clinical columns:
SAMPLE_ID: sample ID
Cancer type will be assigned based on the following priority:
1) ONCOTREE_CODE in clinical data file
2) ONCOTREE_CODE column in the MAF file
3) default tumor type (-t)
Default OncoKB™ base url is http://oncokb.org
Clinical Data Annotator
Running python ClinicalDataAnnotator.py -h in a terminal prints all command-line parameters, shown below. -i <input clinical file>, -o <output clinical file> and -a <annotated alteration files, separated by commas> are required.
ClinicalDataAnnotator.py -i <input clinical file> -o <output clinical file> -a <annotated alteration files, separate by ,> [-s sample list filter]
Essential clinical columns:
SAMPLE_ID: sample ID
OncoKB Plots
Running python OncoKBPlots.py -h in a terminal prints all command-line parameters, shown below. -i <annotated clinical file> and -o <output PDF file> are required.
OncoKBPlots.py -i <annotated clinical file> -o <output PDF file> [-c <categorization column, e.g. CANCER_TYPE>] [-s sample list filter] [-n threshold of # samples in a category] [-l comma separated levels to include]
Essential clinical columns:
SAMPLE_ID: sample ID
HIGHEST_LEVEL: highest OncoKB™ level
Supported levels (-l):
LEVEL_1,LEVEL_2,LEVEL_3A,LEVEL_3B,LEVEL_4,ONCOGENIC,VUS
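A value for -l can be validated against the supported levels before plotting. The sketch below is illustrative only: `parse_levels` is a hypothetical helper, and accepting lower-case input and surrounding spaces is an assumption rather than documented behavior of OncoKBPlots.py.

```python
SUPPORTED_LEVELS = ["LEVEL_1", "LEVEL_2", "LEVEL_3A", "LEVEL_3B",
                    "LEVEL_4", "ONCOGENIC", "VUS"]


def parse_levels(value):
    """Split a comma-separated -l value and check each entry (hypothetical helper)."""
    requested = [item.strip().upper() for item in value.split(",") if item.strip()]
    unknown = [item for item in requested if item not in SUPPORTED_LEVELS]
    if unknown:
        raise ValueError("Unsupported level(s): " + ", ".join(unknown))
    # Return the levels in the order listed above, without duplicates.
    return [level for level in SUPPORTED_LEVELS if level in requested]
```

Returning the levels in a fixed order keeps plot legends stable regardless of how the user typed the list.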