API
column_names(data, *args, **kwargs)
Returns the names of the columns in the data. Useful to investigate the dataset before running the actual algorithm.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
client | | v6 client provided by the algorithm wrapper | required |
data | DataFrame | dataframe containing the data, provided by algorithm wrapper | required |
Returns: a list of column names
Source code in python/verticox/vantage6.py
cross_validate(client, data, feature_columns, event_times_column, event_happened_column, include_value=True, datanode_ids=None, central_node_id=None, convergence_precision=DEFAULT_PRECISION, rho=DEFAULT_RHO, n_splits=DEFAULT_KFOLD_SPLITS, *_args, **_kwargs)
Fit a Cox proportional hazards model with the Verticox+ algorithm using k-fold cross-validation.
Works similarly to the fit method, but trains multiple times on smaller subsets of the data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
client | AlgorithmClient | v6 client provided by the algorithm wrapper | required |
data | DataFrame | dataframe containing the data, provided by algorithm wrapper | required |
feature_columns | List[str] | The columns to be used as features | required |
event_times_column | str | The name of the column that contains the event times | required |
event_happened_column | str | The name of the column that contains whether an event has happened, | required |
include_value | | The value in the event_happened_column that means the record is NOT right-censored | True |
datanode_ids | List[int] | List of organization ids of the nodes that will be used as feature nodes | None |
central_node_id | int | Organization id of the node that will be used as the central node. This | None |
convergence_precision | float | Precision for the Cox model. The algorithm will stop when the difference | DEFAULT_PRECISION |
rho | float | Penalty parameter | DEFAULT_RHO |
n_splits | int | Number of splits for crossvalidation | DEFAULT_KFOLD_SPLITS |
*_args | | | () |
**_kwargs | | | {} |
Returns: A tuple containing 3 lists: c_indices, coefs, baseline_hazards
Source code in python/verticox/vantage6.py
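To make the role of n_splits concrete, here is a minimal, standard-library sketch of how k-fold cross-validation partitions record indices into train and test sets. It is not the splitting code used by Verticox+; the helper name kfold_indices and the seed parameter are hypothetical and exist only for this illustration.

```python
import random
from typing import List, Tuple


def kfold_indices(
    n_samples: int, n_splits: int, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Return one (train, test) pair of index lists per fold."""
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")

    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Fold i takes every n_splits-th shuffled index starting at position i,
    # so fold sizes differ by at most one record.
    folds = [indices[i::n_splits] for i in range(n_splits)]

    splits = []
    for i, test in enumerate(folds):
        # Every record outside the current fold is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((sorted(train), sorted(test)))
    return splits


splits = kfold_indices(n_samples=10, n_splits=5)
for train, test in splits:
    print(len(train), len(test))
```

Each record appears in exactly one test fold, so a model is trained n_splits times, each time on a smaller subset of the data, and evaluated on the records that were held out.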
fit(client, data, feature_columns, event_times_column, event_happened_column, include_value=True, datanode_ids=None, central_node_id=None, precision=DEFAULT_PRECISION, rho=DEFAULT_RHO, database=None, *_args, **_kwargs)
Fit a Cox proportional hazards model using the Verticox+ algorithm.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
client | AlgorithmClient | v6 client provided by the algorithm wrapper | required |
data | DataFrame | dataframe containing the data, provided by algorithm wrapper | required |
feature_columns | List[str] | The columns to be used as features | required |
event_times_column | str | The name of the column that contains the event times | required |
event_happened_column | str | The name of the column that contains whether an event has happened, | required |
include_value | any | The value in the event_happened_column that means the record is NOT right-censored | True |
datanode_ids | List[int] | List of organization ids of the nodes that will be used as feature nodes | None |
central_node_id | int | Organization id of the node that will be used as the central node. This | None |
precision | float | Precision for the Cox model. The algorithm will stop when the difference | DEFAULT_PRECISION |
rho | float | Penalty parameter | DEFAULT_RHO |
database | str | None | Name of the database to be used (default is "default") | None |
*_args | | | () |
**_kwargs | | | {} |
Returns: A dictionary containing the coefficients of the model ("coefs") and the baseline hazard function of the model ("baseline_hazard_x" and "baseline_hazard_y").
Source code in python/verticox/vantage6.py
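The returned dictionary can be consumed along the following lines. This is a sketch under stated assumptions: it assumes "baseline_hazard_x" holds time points and "baseline_hazard_y" holds the matching hazard values, and that "coefs" maps feature names to coefficients. None of these shapes is confirmed above, the helper baseline_hazard_points is hypothetical, and the numbers are made up for illustration.

```python
from typing import Any, Dict, List, Tuple

# Made-up values for illustration only; this is not output from a real run.
example_result: Dict[str, Any] = {
    "coefs": {"age": 0.03, "bmi": -0.01},  # assumed shape: feature -> coefficient
    "baseline_hazard_x": [1.0, 2.0, 5.0],  # assumed: time points
    "baseline_hazard_y": [0.10, 0.05, 0.20],  # assumed: hazard at each time point
}


def baseline_hazard_points(result: Dict[str, Any]) -> List[Tuple[float, float]]:
    """Pair the x and y arrays into (time, hazard) points sorted by time."""
    x = result["baseline_hazard_x"]
    y = result["baseline_hazard_y"]
    if len(x) != len(y):
        raise ValueError("baseline_hazard_x and baseline_hazard_y differ in length")
    return sorted(zip(x, y))


for time, hazard in baseline_hazard_points(example_result):
    print(time, hazard)
```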
no_op(*args, **kwargs)
A function that does nothing for a while. It is used as a partial algorithm within the Verticox+ algorithm and should not be called by itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
*args | | | () |
**kwargs | | | {} |
Returns:
Source code in python/verticox/vantage6.py
run_datanode(data, *args, selected_columns=(), event_time_column=None, include_column=None, include_value=None, external_commodity_address=None, address=None, **kwargs)
Starts the datanode (feature node) as a gRPC server. This function is a partial function called by the main verticox algorithm. It is not meant to be called by itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | DataFrame | the entire dataset, provided by the algorithm wrapper | required |
include_value | bool | None | This value in the data means the record is NOT right-censored | None |
selected_columns | List[str] | the names of the columns that will be treated as features (covariates) in | () |
event_time_column | str | None | the name of the column that indicates event time | None |
include_column | str | None | the name of the column that indicates whether an event has taken place or whether the sample is right-censored. If the value is False, the sample is right-censored. | None |
external_commodity_address | str | None | Address of the n-party product protocol commodity server | None |
address | | The address where this server will be running. | None |
Returns: None
Source code in python/verticox/vantage6.py
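The relationship between include_column and include_value can be sketched as follows. This is an illustration of the documented meaning (a record whose value equals include_value is NOT right-censored), not the code the datanode runs; the helper name event_indicator is hypothetical.

```python
from typing import Any, List, Sequence


def event_indicator(values: Sequence[Any], include_value: Any = True) -> List[bool]:
    """Return True where the event was observed, False where right-censored."""
    return [value == include_value for value in values]


# Boolean column: True means the event happened.
observed = event_indicator([True, False, True])

# Coded column: a custom include_value marks the uncensored records.
coded = event_indicator(["dead", "alive", "dead"], include_value="dead")

print(observed)
print(coded)
```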
run_java_server(_data, *_args, features=None, event_times_column=None, event_happened_column=None, **kwargs)
Partial function that starts the Java server. This function is called by the main verticox+ algorithm (fit or cross_validate) and should not be called by itself.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
_data | | data provided by the vantage6 algorithm wrapper | required |
*_args | | | () |
features | | list of column names that will be used as features | None |
event_times_column | | Name of the column that contains the event times | None |
event_happened_column | | Name of the column that contains whether an event has happened, or whether the sample is right-censored | None |
**kwargs | | | {} |
Source code in python/verticox/vantage6.py
test_sum_local_features(data, features, mask, *args, **kwargs)
Obsolete
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | DataFrame | | required |
features | List[str] | | required |
mask | | | required |
*args | | | () |
**kwargs | | | {} |
|
Returns:
Source code in python/verticox/vantage6.py
CrossValResult
dataclass
CrossValResult contains the result of a cross-validation task. It contains the c-indices, coefficients and baseline hazard functions for each fold.
Source code in python/verticox/client.py
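As a rough picture of how per-fold results might be held and summarized, the sketch below defines a stand-in dataclass with the three per-fold lists named in the cross_validate return value (c_indices, coefs, baseline_hazards). The class CrossValSummary, its methods, and the numbers are hypothetical; the actual fields of CrossValResult are not listed here and may differ.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Any, List


@dataclass
class CrossValSummary:
    """Hypothetical stand-in; not the real verticox CrossValResult."""

    c_indices: List[float]  # one concordance index per fold
    coefs: List[Any]  # one set of coefficients per fold
    baseline_hazards: List[Any]  # one baseline hazard function per fold

    def mean_c_index(self) -> float:
        """Average concordance index across folds."""
        return mean(self.c_indices)

    def best_fold(self) -> int:
        """Index of the fold with the highest concordance index."""
        return max(range(len(self.c_indices)), key=self.c_indices.__getitem__)


# Made-up values for illustration only.
summary = CrossValSummary(
    c_indices=[0.70, 0.74, 0.72],
    coefs=[{"age": 0.031}, {"age": 0.029}, {"age": 0.030}],
    baseline_hazards=[None, None, None],
)
print(round(summary.mean_c_index(), 2), summary.best_fold())
```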
FitResult
dataclass
FitResult contains the result of a fit task. It contains the coefficients and the baseline hazard function.
Source code in python/verticox/client.py
Task
Task is a wrapper around the vantage6 task object.
Source code in python/verticox/client.py
get_results()
Get the results of the task. This will block until the task is finished.
Returns:
Source code in python/verticox/client.py
VerticoxClient
Client for running verticox. This client is a wrapper around the vantage6 client to simplify use.
Source code in python/verticox/client.py
cross_validate(feature_columns, outcome_time_column, right_censor_column, feature_nodes, outcome_node, precision=_DEFAULT_PRECISION, n_splits=10, database='default')
Run Cox proportional hazards analysis on the entire dataset using cross-validation. Uses 10 folds by default.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_columns | | a list of column names that you want to use as features | required |
outcome_time_column | | the column name of the outcome time | required |
right_censor_column | | the column name of the binary value that indicates if an event | required |
feature_nodes | | A list of node ids from the datasources that contain the feature columns | required |
outcome_node | | The node id of the datasource that contains the outcome | required |
precision | | precision of the verticox algorithm. The smaller the number, the more | _DEFAULT_PRECISION |
n_splits | | The number of folds to use for cross-validation. Default is 10. | 10 |
database | | If the nodes have multiple datasources, indicate the label of the datasource | 'default' |
Returns: a Task object containing info about the task.
Source code in python/verticox/client.py
fit(feature_columns, outcome_time_column, right_censor_column, feature_nodes, outcome_node, precision=_DEFAULT_PRECISION, database='default')
Run Cox proportional hazards analysis on the entire dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
feature_columns | | a list of column names that you want to use as features | required |
outcome_time_column | | the column name of the outcome time | required |
right_censor_column | | the column name of the binary value that indicates if an event | required |
feature_nodes | | A list of node ids from the datasources that contain the feature columns | required |
outcome_node | | The node id of the datasource that contains the outcome | required |
precision | | precision of the verticox algorithm. The smaller the number, the more | _DEFAULT_PRECISION |
database | | If the nodes have multiple datasources, indicate the label of the datasource | 'default' |
Returns: a Task object containing info about the task.
Source code in python/verticox/client.py
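A typical call sequence might look like the sketch below: start a fit task with the documented keyword arguments, then block on Task.get_results(). The helper fit_and_wait, the column names, and the node ids are hypothetical. So that the sketch runs without a vantage6 server, StubClient and StubTask stand in for VerticoxClient and Task; they are not part of verticox, and the value they return is a placeholder rather than real model output.

```python
from typing import Any, Dict, List


def fit_and_wait(
    client: Any,
    feature_columns: List[str],
    feature_nodes: List[int],
    outcome_node: int,
) -> Any:
    """Start a fit task and block until its results are available."""
    task = client.fit(
        feature_columns=feature_columns,
        outcome_time_column="event_time",  # hypothetical column name
        right_censor_column="event_happened",  # hypothetical column name
        feature_nodes=feature_nodes,
        outcome_node=outcome_node,
    )
    # get_results() is documented to block until the task is finished.
    return task.get_results()


class StubTask:
    """Hypothetical stand-in for Task so the sketch runs offline."""

    def __init__(self, results: Dict[str, Any]) -> None:
        self._results = results

    def get_results(self) -> Dict[str, Any]:
        return self._results


class StubClient:
    """Hypothetical stand-in for VerticoxClient; records the call it receives."""

    def __init__(self) -> None:
        self.last_call: Dict[str, Any] = {}

    def fit(self, **kwargs: Any) -> StubTask:
        self.last_call = kwargs
        return StubTask({"status": "stub result, not real model output"})


stub = StubClient()
result = fit_and_wait(stub, ["age", "bmi"], feature_nodes=[2, 3], outcome_node=4)
print(result)
print(sorted(stub.last_call))
```

With a real VerticoxClient, the node ids would come from the collaboration itself, for example from get_active_node_organizations().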
get_active_node_organizations()
Get the organization ids of the active nodes in the collaboration.
Returns: a list of organization ids
Source code in python/verticox/client.py
get_column_names(**kwargs)
Get the column names of the dataset at all active nodes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
**kwargs | | | {} |
Returns: