sim_data_func
Description
This function generates example data for the functions func_ge and grid_func_ge. Users can customize the output using the parameters listed in the table below.
See also func_ge and grid_func_ge.
Usage
sim_data_func(n, m, ytype, input_type = 'SNP', seed = 0)
Parameters
The table below lists the meaning and data type of each parameter. Use it to customize the simulated data.
| Parameter | Description |
|---|---|
| n | numeric, sample size. |
| m | numeric, the sequence length of each sample. |
| ytype | character, “Survival”, “Binary” or “Continuous” type of the output y. If not specified, the default is “Continuous”. |
| input_type | character, “SNP” or “func” type of the input gene variables. If not specified, the default is “SNP”. |
| seed | numeric, the random seed used when generating the data. The default is 0. |
Value
The function sim_data_func returns a dictionary containing the response variable y, the sequence (genotypes) data X, the sampling locations location and the scalar covariates Z.
y: An array representing the response variable. When ytype = “Survival”, y is an n × 2 array consisting of:
The minimum of the survival time and censoring time.
The event indicator.
X: A matrix or a list of fd objects.
When input_type = “SNP”, a matrix representing the sequence data, with the number of rows equal to the number of samples.
When input_type = “func”, a list containing functional objects, denoted fd, with the number of rows equal to the number of samples.
location: A list defining the sampling sites of the sequence (genotypes) data.
Z: A matrix representing the scalar covariates, with the number of rows equal to the number of samples.
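The two columns of a survival response described above follow the usual right-censoring layout. The sketch below illustrates that layout in plain Python. It is not GENetLib's implementation: the function name simulate_survival_response, the exponential distributions and the rate parameters are assumptions made for illustration, as is the convention that the indicator is 1 when the event is observed.

```python
import random


def simulate_survival_response(n, seed=0, event_rate=1.0, censor_rate=0.5):
    """Return n rows of [observed_time, event_indicator].

    Hypothetical helper for illustration only; the distributions and
    rates are assumptions, not what sim_data_func uses internally.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        event_time = rng.expovariate(event_rate)    # latent survival time
        censor_time = rng.expovariate(censor_rate)  # latent censoring time
        # First column: the minimum of survival time and censoring time.
        observed = min(event_time, censor_time)
        # Second column: 1 if the event was observed, 0 if censored (assumed convention).
        indicator = 1 if event_time <= censor_time else 0
        rows.append([observed, indicator])
    return rows


y = simulate_survival_response(n=5, seed=1)
for observed, indicator in y:
    print(f"{observed:.3f}  {indicator}")
```

Each row pairs an observed time with its indicator, so a downstream model can tell an exact event time from a time that is only a lower bound.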
Examples
Here is a quick example of using this function:
from GENetLib.sim_data import sim_data_func
func_continuous = sim_data_func(n = 1000, m = 100, ytype = 'Continuous', seed = 1)
X = func_continuous['X']
y = func_continuous['y']
Z = func_continuous['Z']
location = func_continuous['location']
To generate fd objects instead, set input_type = 'func':
from GENetLib.sim_data import sim_data_func
func_continuous = sim_data_func(n = 1000, m = 100, input_type = 'func', ytype = 'Continuous', seed = 1)
X = func_continuous['X']
y = func_continuous['y']
Z = func_continuous['Z']
location = func_continuous['location']
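After unpacking a result as in the examples above, it can be useful to confirm that the dictionary has the documented keys and that the per-sample entries line up with n. The sketch below is one way to do that. check_sim_output is a hypothetical helper, not part of GENetLib, and the toy dictionary is hand-written stand-in data rather than simulator output. It assumes SNP-type input and a continuous y with one entry per sample, so that len() reports the number of samples.

```python
def check_sim_output(data, n):
    """Check that a result dictionary has the documented keys and n samples.

    Hypothetical helper; assumes y, X and Z each support len() and
    hold one entry (row) per sample.
    """
    expected_keys = ("y", "X", "Z", "location")
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise KeyError(f"missing keys: {missing}")
    for key in ("y", "X", "Z"):
        if len(data[key]) != n:
            raise ValueError(f"{key} has {len(data[key])} rows, expected {n}")
    return True


# Stand-in data with the documented layout (not real simulator output).
toy = {
    "y": [0.3, 1.2, -0.7],
    "X": [[0, 1, 2, 1], [1, 1, 0, 2], [2, 0, 0, 1]],
    "Z": [[0.5, 1.0], [0.1, 0.0], [0.9, 1.0]],
    "location": [0.0, 0.33, 0.67, 1.0],
}
print(check_sim_output(toy, n=3))
```

With real output, the same call would be check_sim_output(func_continuous, n=1000) for the first example above.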