Package 'creedenzymatic'

Title: creedenzymatic
Description: Combine kinome results from KRSA, UKA, and other tools. A package for integrating upstream kinase analyses.
Authors: Ali Sajid Imami [aut, cre], Khaled Alganem [aut], Justin Creeden [aut], Abdul-Rizaq Hamoud [aut]
Maintainer: Ali Sajid Imami <[email protected]>
License: MIT + file LICENSE
Version: 6.1.0
Built: 2024-10-26 04:26:29 UTC
Source: https://github.com/CogDisResLab/creedenzymatic

Help Index


Align the rows and columns of two (or more) matrices

Description

Align the rows and columns of two (or more) matrices

Usage

.align_matrices(m1, m2, ..., L = NULL, na.pad = TRUE, as.3D = TRUE)

Arguments

m1

a matrix with unique row and column names

m2

a matrix with unique row and column names

...

additional matrices with unique row and column names

L

a list of matrix objects. If this is given, m1, m2, and ... are ignored

na.pad

boolean indicating whether to pad the combined matrix with NAs for rows/columns that are not shared by m1 and m2.

as.3D

boolean indicating whether to return the result as a 3D array. If FALSE, will return a list.

Value

an object containing the aligned matrices. Will either be a list or a 3D array
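A minimal base-R sketch of the idea follows. It assumes that na.pad = TRUE pads every matrix to the union of all row and column names before stacking. The helper name align_sketch is hypothetical, and this is not the package's own implementation.

```r
# Hypothetical helper: align named matrices on the union of their dimnames.
align_sketch <- function(L) {
  rows <- unique(unlist(lapply(L, rownames)))
  cols <- unique(unlist(lapply(L, colnames)))
  # Start from an all-NA 3D array and copy each matrix into its own slice.
  out <- array(NA_real_,
               dim = c(length(rows), length(cols), length(L)),
               dimnames = list(rows, cols, names(L)))
  for (i in seq_along(L)) {
    m <- L[[i]]
    out[rownames(m), colnames(m), i] <- m
  }
  out
}

m1 <- matrix(1:4, 2, 2, dimnames = list(c("a", "b"), c("x", "y")))
m2 <- matrix(5:8, 2, 2, dimnames = list(c("b", "c"), c("y", "z")))
aligned <- align_sketch(list(m1 = m1, m2 = m2))
dim(aligned)
```

Cells that a matrix does not cover (for example row "a" in the m2 slice) stay NA.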


Check whether test_names are columns in the data.frame df

Description

Check whether test_names are columns in the data.frame df

Usage

.check_colnames(test_names, df, throw_error = T)

Arguments

test_names

a vector of column names to test

df

the data.frame to test against

throw_error

boolean indicating whether to throw an error if any test_names are not found in df

Value

boolean indicating whether or not all test_names are columns of df
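The documented behaviour can be approximated in base R as below. The function name check_colnames_sketch and the wording of the error message are assumptions made for illustration only.

```r
# Hypothetical stand-in for the documented behaviour.
check_colnames_sketch <- function(test_names, df, throw_error = TRUE) {
  missing_names <- setdiff(test_names, colnames(df))
  if (length(missing_names) > 0 && throw_error) {
    stop("columns not found in df: ", paste(missing_names, collapse = ", "))
  }
  length(missing_names) == 0
}

df <- data.frame(Kinase = c("AKT1", "SRC"), Score = c(1.2, -0.4))
all_present <- check_colnames_sketch(c("Kinase", "Score"), df)
some_missing <- check_colnames_sketch(c("Kinase", "Z"), df, throw_error = FALSE)
```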


Check for duplicates in a vector

Description

Check for duplicates in a vector

Usage

.check_dups(x, name = "")

Arguments

x

the vector

name

the name of the object to print in an error message if duplicates are found


Extract elements from a GCT matrix

Description

Extract the elements from a GCT object where the values of row_field and col_field are the same. A concrete example is if g represents a matrix of signatures of genetic perturbations, and you want to extract all the values of the targeted genes.

Usage

.extract.gct(
  g,
  row_field,
  col_field,
  rdesc = NULL,
  cdesc = NULL,
  row_keyfield = "id",
  col_keyfield = "id"
)

Arguments

g

the GCT object

row_field

the column name in rdesc to search on

col_field

the column name in cdesc to search on

rdesc

a data.frame of row annotations

cdesc

a data.frame of column annotations

row_keyfield

the column name of rdesc to use for annotating the rows of g

col_keyfield

the column name of cdesc to use for annotating the columns of g

Value

a list of the following elements

mask

a logical matrix of the same dimensions as g@mat indicating which matrix elements have been extracted

idx

an array index into g@mat representing which elements have been extracted

vals

a vector of the extracted values
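The three returned elements can be illustrated on a plain matrix with base R. The annotation column names gene and target are hypothetical, and the sketch works on a bare matrix rather than a GCT object.

```r
# Toy matrix: rows are measured genes, columns are perturbation signatures.
mat <- matrix(1:6, nrow = 2,
              dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))
rdesc <- data.frame(id = c("r1", "r2"), gene = c("AKT1", "SRC"),
                    stringsAsFactors = FALSE)
cdesc <- data.frame(id = c("c1", "c2", "c3"),
                    target = c("SRC", "AKT1", "EGFR"),
                    stringsAsFactors = FALSE)

# TRUE wherever the row annotation equals the column annotation.
mask <- outer(rdesc$gene, cdesc$target, "==")
# Row/column index pairs of the matching cells.
idx <- which(mask, arr.ind = TRUE)
# The matrix values at those cells.
vals <- mat[mask]
```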


Check if x is a whole number

Description

Check if x is a whole number

Usage

.is.wholenumber(x, tol = .Machine$double.eps^0.5)

Arguments

x

number to test

tol

the allowed tolerance

Value

boolean indicating whether x is within tol of a whole number value
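A tolerance check of this kind is commonly written as below. This is a sketch with the same signature, not necessarily the package's exact code, and the name is_wholenumber_sketch is hypothetical.

```r
# Compare the distance to the nearest integer against the tolerance.
is_wholenumber_sketch <- function(x, tol = .Machine$double.eps^0.5) {
  abs(x - round(x)) < tol
}

# The third value carries floating-point noise but still counts as whole.
res <- is_wholenumber_sketch(c(3, 3.5, 0.1 + 0.2 - 0.3 + 1))
```

Using a tolerance rather than x == round(x) keeps values that differ from an integer only by rounding error from being rejected.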


Pad a matrix with additional rows/columns of NA values

Description

Pad a matrix with additional rows/columns of NA values

Usage

.na_pad_matrix(m, row_universe = NULL, col_universe = NULL)

Arguments

m

a matrix with unique row and column names

row_universe

a vector with the universe of possible row names

col_universe

a vector with the universe of possible column names

Value

a matrix
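As a rough base-R illustration, padding can be done by allocating an all-NA matrix over the universe of names and copying the original values in. The name na_pad_sketch and the ordering of the added rows and columns are assumptions.

```r
na_pad_sketch <- function(m, row_universe = NULL, col_universe = NULL) {
  # With no universe supplied, the matrix keeps its own names.
  rows <- union(rownames(m), row_universe)
  cols <- union(colnames(m), col_universe)
  out <- matrix(NA_real_, nrow = length(rows), ncol = length(cols),
                dimnames = list(rows, cols))
  out[rownames(m), colnames(m)] <- m
  out
}

m <- matrix(1:4, 2, 2, dimnames = list(c("a", "b"), c("x", "y")))
padded <- na_pad_sketch(m, row_universe = c("a", "b", "c"))
```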


Parse a GCTX file into the workspace as a GCT object

Description

Parse a GCTX file into the workspace as a GCT object

Usage

.parse.gctx(fname, rid = NULL, cid = NULL, matrix_only = FALSE)

Arguments

fname

path to the GCTX file on disk

rid

either a vector of character or integer row indices or a path to a grp file containing character row indices. Only these indices will be parsed from the file.

cid

either a vector of character or integer column indices or a path to a grp file containing character column indices. Only these indices will be parsed from the file.

matrix_only

boolean indicating whether to parse only the matrix (ignoring row and column annotations)

Details

parse.gctx also supports parsing of plain text GCT files, so this function can be used as a general GCT parser.

See Also

Other GCTX parsing functions: .append.dim(), .fix.datatypes(), .process_ids(), .read.gctx.ids(), .read.gctx.meta(), .write.gct(), .write.gctx(), .write.gctx.meta()


Read a GMT file and return a list

Description

Read a GMT file and return a list

Usage

.parse.gmt(fname)

Arguments

fname

the file path to be parsed

Details

parse.gmt returns a nested list object. The top level contains one list per row in fname. Each of these is itself a list with the following fields:
- head: the name of the data (row in fname)
- desc: description of the corresponding data
- len: the number of data items
- entry: a vector of the data items

Value

a list of the contents of fname. See details.
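The nested-list structure can be reproduced with a few lines of base R, assuming the usual GMT layout of one tab-delimited set per line (name, description, then the members). The function name parse_gmt_sketch and the toy file contents are hypothetical.

```r
# Write a two-line GMT file to a temporary path for the demonstration.
gmt_path <- tempfile(fileext = ".gmt")
writeLines(c("set1\tfirst set\tAKT1\tSRC\tEGFR",
             "set2\tsecond set\tMAPK1"), gmt_path)

parse_gmt_sketch <- function(fname) {
  lapply(strsplit(readLines(fname), "\t", fixed = TRUE), function(fields) {
    entry <- fields[-(1:2)]  # everything after name and description
    list(head = fields[1], desc = fields[2],
         len = length(entry), entry = entry)
  })
}

sets <- parse_gmt_sketch(gmt_path)
sets[[1]]$len
```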

See Also

http://clue.io/help for details on the GMT file format

Other CMap parsing functions: .parse.gmx(), .parse.grp(), .write.gmt(), .write.grp()


Read a GMX file and return a list

Description

Read a GMX file and return a list

Usage

.parse.gmx(fname)

Arguments

fname

the file path to be parsed

Details

parse.gmx returns a nested list object. The top level contains one list per column in fname. Each of these is itself a list with the following fields:
- head: the name of the data (column in fname)
- desc: description of the corresponding data
- len: the number of data items
- entry: a vector of the data items

Value

a list of the contents of fname. See details.

See Also

http://clue.io/help for details on the GMX file format

Other CMap parsing functions: .parse.gmt(), .parse.grp(), .write.gmt(), .write.grp()


Read a GRP file and return a vector of its contents

Description

Read a GRP file and return a vector of its contents

Usage

.parse.grp(fname)

Arguments

fname

the file path to be parsed

Value

a vector of the contents of fname

See Also

http://clue.io/help for details on the GRP file format

Other CMap parsing functions: .parse.gmt(), .parse.gmx(), .write.gmt(), .write.grp()


Read GCTX row or column ids

Description

Read GCTX row or column ids

Usage

.read.gctx.ids(gctx_path, dimension = "row")

Arguments

gctx_path

path to the GCTX file

dimension

which ids to read (row or column)

Value

a character vector of row or column ids from the provided file

See Also

Other GCTX parsing functions: .append.dim(), .fix.datatypes(), .parse.gctx(), .process_ids(), .read.gctx.meta(), .write.gct(), .write.gctx(), .write.gctx.meta()


Parse row or column metadata from GCTX files

Description

Parse row or column metadata from GCTX files

Usage

.read.gctx.meta(gctx_path, dimension = "row", ids = NULL)

Arguments

gctx_path

the path to the GCTX file

dimension

which metadata to read (row or column)

ids

a character vector of a subset of row/column ids for which to read the metadata

Value

a data.frame of metadata

See Also

Other GCTX parsing functions: .append.dim(), .fix.datatypes(), .parse.gctx(), .process_ids(), .read.gctx.ids(), .write.gct(), .write.gctx(), .write.gctx.meta()


Update the matrix of an existing GCTX file

Description

Update the matrix of an existing GCTX file

Usage

.update.gctx(x, ofile, rid = NULL, cid = NULL)

Arguments

x

an array of data

ofile

the filename of the GCTX to update

rid

integer indices or character ids of the rows to update

cid

integer indices or character ids of the columns to update

Details

Overwrite the rows and columns of ofile as indicated by rid and cid respectively. rid and cid can either be integer indices or character ids corresponding to the row and column ids in ofile.


Write a GCT object to disk in GCT format

Description

Write a GCT object to disk in GCT format

Usage

.write.gct(ds, ofile, precision = 4, appenddim = T, ver = 3)

Arguments

ds

the GCT object

ofile

the desired output filename

precision

the numeric precision at which to save the matrix. See details.

appenddim

boolean indicating whether to append matrix dimensions to filename

ver

the GCT version to write. See details.

Details

Since GCT is a text format, the higher the precision you choose, the larger the file. ver defaults to 3, i.e. GCT version 1.3, which supports embedded row and column metadata in the GCT file. Any other value passed to ver will result in a GCT version 1.2 file, which contains only the matrix data and no annotations.

See Also

Other GCTX parsing functions: .append.dim(), .fix.datatypes(), .parse.gctx(), .process_ids(), .read.gctx.ids(), .read.gctx.meta(), .write.gctx(), .write.gctx.meta()


Write a GCT object to disk in GCTX format

Description

Write a GCT object to disk in GCTX format

Usage

.write.gctx(
  ds,
  ofile,
  appenddim = T,
  compression_level = 0,
  matrix_only = F,
  max_chunk_kb = 1024
)

Arguments

ds

a GCT object

ofile

the desired file path for writing

appenddim

boolean indicating whether the resulting filename will have dimensions appended (e.g. my_file_n384x978.gctx)

compression_level

integer between 0-9 indicating how much to compress data before writing (the default, 0, applies no compression). Higher values result in smaller files but slower read times.

matrix_only

boolean indicating whether to write only the matrix data (and skip row, column annotations)

max_chunk_kb

for chunking, the maximum number of KB a given chunk will occupy

See Also

Other GCTX parsing functions: .append.dim(), .fix.datatypes(), .parse.gctx(), .process_ids(), .read.gctx.ids(), .read.gctx.meta(), .write.gct(), .write.gctx.meta()


Write a nested list to a GMT file

Description

Write a nested list to a GMT file

Usage

.write.gmt(lst, fname)

Arguments

lst

the nested list to write. See details.

fname

the desired file name

Details

lst needs to be a nested list where each sub-list is itself a list with the following fields:
- head: the name of the data
- desc: description of the corresponding data
- len: the number of data items
- entry: a vector of the data items
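For illustration, a list of this shape can be serialised by hand in base R, assuming the usual GMT layout of one tab-delimited line per set. The set names and members here are made up.

```r
lst <- list(
  list(head = "set1", desc = "first set", len = 3,
       entry = c("AKT1", "SRC", "EGFR")),
  list(head = "set2", desc = "second set", len = 1, entry = "MAPK1")
)

# One tab-delimited line per sub-list: head, desc, then the entries.
gmt_lines <- vapply(lst, function(s) {
  paste(c(s$head, s$desc, s$entry), collapse = "\t")
}, character(1))

out_path <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, out_path)
```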

See Also

http://clue.io/help for details on the GMT file format

Other CMap parsing functions: .parse.gmt(), .parse.gmx(), .parse.grp(), .write.grp()


Write a vector to a GRP file

Description

Write a vector to a GRP file

Usage

.write.grp(vals, fname)

Arguments

vals

the vector of values to be written

fname

the desired file name

See Also

http://clue.io/help for details on the GRP file format

Other CMap parsing functions: .parse.gmt(), .parse.gmx(), .parse.grp(), .write.gmt()


Write a data.frame to a tab-delimited text file

Description

Write a data.frame to a tab-delimited text file

Usage

.write.tbl(tbl, ofile, ...)

Arguments

tbl

the data.frame to be written

ofile

the desired file name

...

additional arguments passed on to write.table

Details

This method simply calls write.table with some preset arguments that generate an unquoted, tab-delimited file without row names.
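A wrapper with those presets might look like the following sketch. The name write_tbl_sketch is hypothetical and the exact preset arguments used by the package are an assumption based on the description above.

```r
write_tbl_sketch <- function(tbl, ofile, ...) {
  # Tab-delimited, no quoting, no row names; extra arguments pass through.
  write.table(tbl, file = ofile, sep = "\t", quote = FALSE,
              row.names = FALSE, ...)
}

tbl <- data.frame(Kinase = c("AKT1", "SRC"), Score = c(1.5, -0.25))
out_path <- tempfile(fileext = ".txt")
write_tbl_sketch(tbl, out_path)
written <- readLines(out_path)
```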


Add annotations to a GCT object

Description

Given a GCT object and either a data.frame or a path to an annotation table, apply the annotations to the gct using the given keyfield.

Usage

annotate.gct(g, annot, dimension = "row", keyfield = "id")

Arguments

g

a GCT object

annot

a data.frame or path to text table of annotations

dimension

either 'row' or 'column' indicating which dimension of g to annotate

keyfield

the character name of the column in annot that matches the row or column identifiers in g

Value

a GCT object with annotations applied to the specified dimension
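The core of keyfield-based annotation is a lookup that reorders the annotation table to match the object's ids. A base-R sketch on plain vectors, with a hypothetical gene column, could be:

```r
# Row ids as they appear in the (hypothetical) GCT object.
ids <- c("r2", "r1", "r3")
annot <- data.frame(id = c("r1", "r2", "r3"),
                    gene = c("AKT1", "SRC", "EGFR"),
                    stringsAsFactors = FALSE)

# Reorder the annotations so that row i describes ids[i].
rdesc <- annot[match(ids, annot$id), , drop = FALSE]
```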

See Also

Other GCT utilities: melt.gct(), merge.gct(), rank.gct(), subset.gct()


Combine data for quartile figure

Description

Reads ranked tables from the different tools (KRSA, UKA, etc.)

Usage

combine_tools(
  KRSA_df = NULL,
  UKA_df = NULL,
  KEA3_df = NULL,
  PTM_SEA_df = NULL,
  mapping_df = kinome_mp_file
)

Arguments

KRSA_df

dataframe, KRSA table output (requires at least Kinase and Score columns)

UKA_df

dataframe, UKA table output (requires at least Kinase and Score columns)

KEA3_df

dataframe, KEA3 table output (requires at least Kinase and Score columns)

PTM_SEA_df

dataframe, PTM_SEA table output (requires at least Kinase and Score columns)

mapping_df

kinome mapping df (default is kinome_mp_file)

Details

This function takes in ranked tables from the different tools (KRSA, UKA, etc.), maps them to the kinome mapping file, and returns a dataframe ready for the quartile figure

Value

dataframe, ready for quartile figure


Runs Creedenzymatic

Description

Reads KRSA, UKA, and LFC tables and runs creedenzymatic

Usage

creedenzymatic(
  KRSA_table,
  UKA_table,
  LFC_table,
  avg_krsa = T,
  avg_lfc = T,
  prefix = "Comp1",
  ...
)

Arguments

KRSA_table

dataframe, KRSA table output

UKA_table

dataframe, UKA table output

LFC_table

dataframe, LFC table

...

arguments passed to other functions

Details

This function takes in the tables, then ranks and quartiles kinases based on the absolute Score values

Value

dataframe, Ranked and quartiled table


Extract Top Kinases

Description

reads combined dataframe (ranked and quartiled) and extracts top kinases based on adjustable criteria

Usage

extract_top_kinases(combined_df, min_qrt, min_counts)

Arguments

combined_df

dataframe, Ranked and quartiled dataframe

min_qrt

integer, minimum quartile to count

min_counts

integer, number of minimum hits

Details

This function takes in the combined dataframe (ranked and quartiled) and extracts top kinases based on adjustable criteria

Value

vector, top kinases
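The selection logic can be sketched in base R as follows. The column names Kinase, Method and Qrt, and the assumption that higher quartile numbers indicate stronger hits, are illustrative guesses rather than the package's documented schema.

```r
combined_df <- data.frame(
  Kinase = c("AKT1", "AKT1", "AKT1", "SRC", "SRC", "EGFR"),
  Method = c("KRSA", "UKA", "KEA3", "KRSA", "UKA", "KRSA"),
  Qrt    = c(4, 4, 3, 4, 2, 1),
  stringsAsFactors = FALSE
)

extract_top_sketch <- function(combined_df, min_qrt, min_counts) {
  # Keep rows at or above the minimum quartile, then count hits per kinase.
  hits <- combined_df[combined_df$Qrt >= min_qrt, , drop = FALSE]
  counts <- table(hits$Kinase)
  names(counts)[as.vector(counts) >= min_counts]
}

top <- extract_top_sketch(combined_df, min_qrt = 4, min_counts = 2)
```

Here only AKT1 reaches quartile 4 in at least two tools, so it is the only kinase returned.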


An S4 class to represent a GCT object

Description

The GCT class serves to represent annotated matrices. The mat slot contains said data and the rdesc and cdesc slots contain data frames with annotations about the rows and columns, respectively

Slots

mat

a numeric matrix

rid

a character vector of row ids

cid

a character vector of column ids

rdesc

a data.frame of row descriptors

cdesc

a data.frame of column descriptors

src

a character indicating the source (usually file path) of the data

See Also

parse.gctx, write.gctx, read.gctx.meta, read.gctx.ids

http://clue.io/help for more information on the GCT format


CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA)

Description

A data frame of CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA) (Latest Version)

Usage

kinome_mp_file

Format

A data frame with 527 rows and 26 variables:


CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA)

Description

A data frame of CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA) (Version 1)

Usage

kinome_mp_file_v1

Format

A data frame with 503 rows and 14 variables:


CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA)

Description

A data frame of CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA) (Version 2)

Usage

kinome_mp_file_v2

Format

A data frame with 514 rows and 12 variables:


CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA)

Description

A data frame of CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA) (Version 3)

Usage

kinome_mp_file_v3

Format

A data frame with 530 rows and 26 variables:


CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA)

Description

A data frame of CDRL Complete mapping file (UKA+KRSA+KEA3+PTM-SEA) (Version 4)

Usage

kinome_mp_file_v4

Format

A data frame with 527 rows and 26 variables:


Transform a GCT object into a long form data.table (aka 'melt')

Description

Utilizes the data.table::melt function to transform the matrix into long form. Optionally can include the row and column annotations in the transformed data.table.

Usage

melt.gct(
  g,
  suffixes = NULL,
  remove_symmetries = F,
  keep_rdesc = T,
  keep_cdesc = T,
  ...
)

Arguments

g

the GCT object

suffixes

the character suffixes to be applied if there are collisions between the names of the row and column descriptors

remove_symmetries

boolean indicating whether to remove the lower triangle of the matrix (only applies if g@mat is symmetric)

keep_rdesc

boolean indicating whether to keep the row descriptors in the final result

keep_cdesc

boolean indicating whether to keep the column descriptors in the final result

...

further arguments passed along to data.table::merge

Value

a data.table object with the row and column ids and the matrix values and (optionally) the row and column descriptors
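The long form can be pictured with a base-R equivalent that uses a plain data.frame instead of a data.table. The output column names id.x, id.y and value are assumptions for this sketch.

```r
mat <- matrix(1:4, nrow = 2,
              dimnames = list(c("r1", "r2"), c("c1", "c2")))

# One output row per matrix cell, in column-major order.
long <- data.frame(
  id.x  = rep(rownames(mat), times = ncol(mat)),
  id.y  = rep(colnames(mat), each = nrow(mat)),
  value = as.vector(mat),
  stringsAsFactors = FALSE
)
```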

See Also

Other GCT utilities: annotate.gct(), merge.gct(), rank.gct(), subset.gct()


Merge two GCT objects together

Description

Merge two GCT objects together

Usage

merge.gct(g1, g2, dimension = "row", matrix_only = F)

Arguments

g1

the first GCT object

g2

the second GCT object

dimension

the dimension on which to merge (row or column)

matrix_only

boolean indicating whether to keep only the data matrices from g1 and g2 and ignore their row and column metadata

See Also

Other GCT utilities: annotate.gct(), melt.gct(), rank.gct(), subset.gct()


CDRL Complete mapping of peptides - used for ptm-sea (PTK PamChip 86402)

Description

A data frame of CDRL Complete mapping of peptides - used for ptm-sea (PTK PamChip 86402)

Usage

ptk_pamchip_86402_array_layout_ptmsea

Format

A data frame with x rows and x variables:


CDRL Complete mapping of peptides to HGNC symbols (PTK PamChip 86402)

Description

A data frame of CDRL Complete mapping of peptides to HGNC symbols (PTK PamChip 86402)

Usage

ptk_pamchip_86402_mapping

Format

A data frame with 193 rows and 2 variables:


Plot quartile Figure

Description

Takes the combined ranked dataframe (KRSA, UKA, etc.) and generates a quartile figure

Usage

quartile_figure(df, grouping = "KinaseFamily")

Arguments

df

dataframe, combined mapped tables

grouping

character to choose grouping (KinaseFamily, subfamily, or group). Default is KinaseFamily

Value

ggplot figure


Rank Kinases based on a score

Description

This function scales the scores onto percentile and quartile scales

Usage

rank_kinases(
  df,
  trns = c("raw", "abs"),
  sort = c("desc", "asc"),
  tool = c("KRSA", "UKA")
)

Arguments

df

dataframe with 2 columns: Kinase, Score

trns

for transformation of the score, the values accepted for this argument are abs and raw (abs: use absolute values of scores, raw: no transformation)

sort

accepts either asc or desc (ascending and descending)

tool

specifying the name of the tool
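One way to picture the percentile and quartile scaling is the base-R sketch below. The output column names Perc and Qrt, the interpretation of sort = "desc" as "largest score gets the highest percentile", and the quartile breakpoints are all assumptions, not the package's confirmed behaviour.

```r
rank_sketch <- function(df, trns = c("raw", "abs"), sort = c("desc", "asc")) {
  trns <- match.arg(trns)
  sort <- match.arg(sort)
  score <- if (trns == "abs") abs(df$Score) else df$Score
  # Assumed meaning: with "desc" the largest score gets the top percentile.
  r <- if (sort == "desc") rank(score) else rank(-score)
  df$Perc <- r / nrow(df)
  # Bin percentiles into four equal-width quartiles (1 = lowest).
  df$Qrt <- as.integer(cut(df$Perc, breaks = c(0, 0.25, 0.5, 0.75, 1)))
  df
}

df <- data.frame(Kinase = c("AKT1", "SRC", "EGFR", "MAPK1"),
                 Score = c(2.0, -3.0, 0.5, 1.0))
ranked <- rank_sketch(df, trns = "abs", sort = "desc")
```

With trns = "abs", SRC (score -3.0) has the largest magnitude and lands in the top quartile.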


Convert a GCT object's matrix to ranks

Description

Convert a GCT object's matrix to ranks

Usage

rank.gct(g, dim = "col", decreasing = T)

Arguments

g

the GCT object to rank

dim

the dimension along which to rank (row or column)

decreasing

boolean indicating whether higher values should get lower ranks

Value

a modified version of g, with the values in the matrix converted to ranks
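On a bare matrix, column-wise ranking with decreasing = TRUE can be sketched in base R as below. This assumes that ranking along columns means ranking the values within each column, and it operates on a plain matrix rather than a GCT object.

```r
mat <- matrix(c(0.2, 1.5, -0.7, 3.0, 0.1, 2.2), nrow = 3,
              dimnames = list(c("r1", "r2", "r3"), c("c1", "c2")))

# Negating the values gives rank 1 to the highest value in each column.
ranked <- apply(mat, 2, function(v) rank(-v))
```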

See Also

Other GCT utilities: annotate.gct(), melt.gct(), merge.gct(), subset.gct()


Reads a dataframe of Peptide IDs and their Scores and runs KEA3

Description

Reads a dataframe of Peptide IDs and their Scores (LFC, p-value, etc.) and runs KEA3 on a subset of these peptides or all of them

Usage

read_kea(
  df,
  filter = T,
  cutoff = 0.2,
  cutoff_abs = T,
  direction = "higher",
  rm_duplicates = T,
  method = "MeanRank",
  lib = c("kinase-substrate"),
  ...
)

Arguments

df

dataframe, must have at least Peptide and Score columns

filter

boolean to subset peptides or not

cutoff

numeric to act as the cutoff to filter out peptides

cutoff_abs

boolean (use absolute value or not) default is TRUE

direction

("lower", "higher") filter based on lower than or higher than the cutoff values (defaults to "higher")

rm_duplicates

boolean (TRUE or FALSE) remove duplicate genes

method

"MeanRank" takes the mean rank across all libraries or "MeanFDR" takes the mean of FDR across all libraries (default is "MeanRank")

lib

searched kea libraries "kinase-substrate" or "all" (default is "kinase-substrate" which will return only kinase libraries like ChengKSIN, PTMsigDB, PhosDAll)

...

arguments passed to rank_kinases function

Details

This function takes a dataframe of Peptide IDs and their Scores (LFC, p-value, etc.) and runs KEA3 on a subset of these peptides or all of them

Value

dataframe, Ranked and quartiled table
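The peptide filtering step controlled by cutoff, cutoff_abs and direction can be sketched as follows. The helper name filter_sketch and the use of strict inequalities are assumptions, and the sketch covers only the filtering, not the KEA3 query.

```r
filter_sketch <- function(df, cutoff = 0.2, cutoff_abs = TRUE,
                          direction = "higher") {
  # Optionally compare magnitudes instead of signed scores.
  score <- if (cutoff_abs) abs(df$Score) else df$Score
  keep <- if (direction == "higher") score > cutoff else score < cutoff
  df[keep, , drop = FALSE]
}

df <- data.frame(Peptide = c("p1", "p2", "p3"),
                 Score = c(0.5, -0.4, 0.1),
                 stringsAsFactors = FALSE)
kept_abs <- filter_sketch(df)$Peptide
kept_raw <- filter_sketch(df, cutoff_abs = FALSE)$Peptide
```

With cutoff_abs = TRUE the strongly negative peptide p2 is retained; with signed scores it is dropped.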


Reads and Ranks KRSA table

Description

reads KRSA table and checks for correct format

Usage

read_krsa(df, ...)

Arguments

df

dataframe, table output (requires at least Kinase and Score columns)

...

arguments passed to rank_kinases function

Details

This function takes in the KRSA table, then ranks and quartiles kinases based on the absolute Score values

Value

dataframe, Ranked and quartiled table


Reads a dataframe of Peptide IDs and their Scores and runs PTM-SEA

Description

Reads a dataframe of Peptide IDs and their Scores (LFC, p-value, etc.) and runs PTM-SEA

Usage

read_ptmsea(df, ...)

Arguments

df

dataframe, must have at least Peptide and Score columns

...

arguments passed to the run_ptmsea function

lib

searched PTM-SEA libraries "kinase-substrate" or "all" (default is "kinase-substrate" which will return only kinase libraries like ChengKSIN, PTMsigDB, PhosDAll)

Details

This function takes a dataframe of Peptide IDs and their Scores (LFC, p-value, etc.) and runs PTM-SEA

Value

dataframe, Ranked and quartiled table


Reads and Ranks UKA table

Description

reads UKA table and checks for correct format

Usage

read_uka(df, ...)

Arguments

df

dataframe, UKA table output (requires at least Kinase and Z columns)

...

arguments passed to rank_kinases function

Details

This function takes in the UKA table, then ranks and quartiles kinases based on the absolute Score values

Value

dataframe, Ranked and quartiled UKA table


Run KEA3 API based on a set of gene symbols

Description

This function takes in HGNC gene symbols, connects to the KEA3 API, and returns the results

Usage

run_kea(gene_set, lib = "kinase-substrate")

Arguments

gene_set

vector, HGNC gene symbols based on the differentially phosphorylated peptides

lib

searched kea libraries "kinase-substrate" or "all" (default is "kinase-substrate" which will return only kinase libraries like ChengKSIN, PTMsigDB, PhosDAll)

Value

list, tables from each KEA3 library


Run PTM-SEA API using a gct file as input

Description

This function takes in a gct file (created by the read_ptmsea function), runs the PTM-SEA API, and returns the results

Usage

run_ptmsea(gct_object, lib = "iptmnet", nperm = 1000, min.overlap = 1, ...)

Arguments

lib

searched PTM-SEA libraries "iptmnet" or "ptm-sea" or "all" (default is "iptmnet" which uses the iptmnet mapping)

nperm

number of permutations

min.overlap

minimum overlap of target peptides with reference peptide sets

...

additional arguments passed to the ssGSEA_ce function

gene_set

vector, HGNC gene symbols based on the differentially phosphorylated peptides

Value

list


CDRL Complete mapping of peptides - used for ptm-sea (STK PamChip 87102)

Description

A data frame of CDRL Complete mapping of peptides - used for ptm-sea (STK PamChip 87102)

Usage

stk_pamchip_87102_array_layout_ptmsea

Format

A data frame with x rows and x variables:


CDRL Complete mapping of peptides to HGNC symbols (STK PamChip 87102)

Description

A data frame of CDRL Complete mapping of peptides to HGNC symbols (STK PamChip 87102)

Usage

stk_pamchip_87102_mapping

Format

A data frame with 141 rows and 2 variables:


Subset a gct object using the provided row and column ids

Description

Subset a gct object using the provided row and column ids

Usage

subset.gct(g, rid = NULL, cid = NULL)

Arguments

g

a gct object

rid

a vector of character ids or integer indices for ROWS

cid

a vector of character ids or integer indices for COLUMNS

See Also

Other GCT utilities: annotate.gct(), melt.gct(), merge.gct(), rank.gct()


Transpose a GCT object

Description

Transpose a GCT object

Usage

transpose.gct(g)

Arguments

g

the GCT object

Value

a modified version of the input GCT object where the matrix has been transposed and the row and column ids and annotations have been swapped.
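The swap can be illustrated with a plain list standing in for the S4 GCT object. The list layout and the name transpose_sketch are hypothetical; the real function operates on the slots of a GCT object.

```r
# A plain list mimicking the GCT slots (mat, rid, cid, rdesc, cdesc).
g <- list(
  mat = matrix(1:6, nrow = 2,
               dimnames = list(c("r1", "r2"), c("c1", "c2", "c3"))),
  rid = c("r1", "r2"),
  cid = c("c1", "c2", "c3"),
  rdesc = data.frame(id = c("r1", "r2")),
  cdesc = data.frame(id = c("c1", "c2", "c3"))
)

transpose_sketch <- function(g) {
  # Transpose the matrix and swap every row-side field with its column twin.
  list(mat = t(g$mat), rid = g$cid, cid = g$rid,
       rdesc = g$cdesc, cdesc = g$rdesc)
}

gt <- transpose_sketch(g)
```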


UKA Complete DB mapping File (STK + PTK)

Description

A data frame of the UKA Complete DB mapping File (STK + PTK)

Usage

uka_db_full

Format

A data frame with 11385 rows and 17 variables: