KGE.models.base_model.BaseModel

Classes

KGEModel

A base module for Knowledge Graph Embedding Model.

TypedStrategy

An implementation of typed negative sampling strategy.

UniformStrategy

An implementation of uniform negative sampling strategy.

tqdm

Decorate an iterable object, returning an iterator which acts exactly like the original iterable, but prints a dynamically updating progressbar every time a value is requested.

Class Inheritance Diagram

Inheritance diagram of KGE.models.base_model.BaseModel.KGEModel

base module for Knowledge Graph Embedding Model

class KGE.models.base_model.BaseModel.KGEModel[source]

Bases: object

A base module for Knowledge Graph Embedding Model.

Subclasses of KGEModel can have their own translation and interaction models.

Methods

evaluate(eval_X, corrupt_side[, positive_X])

Evaluate triplets.

get_rank(x, positive_X, corrupt_side)

Get the rank of one specific triplet.

restore_model_weights(model_weights)

Restore the model weights.

score_hrt(h, r, t)

Score the triplets.

train(train_X, val_X, metadata, epochs, ...)

Train the Knowledge Graph Embedding Model.

__init__(embedding_params, negative_ratio, corrupt_side, loss_fn, ns_strategy, n_workers)[source]

Initialize KGEModel.

Parameters
  • embedding_params (dict) – embedding dimension parameters

  • negative_ratio (int) – number of negative samples per positive triplet

  • corrupt_side (str) – which side to corrupt during training, can be 'h', 't', or 'h+t'

  • loss_fn (class) – loss function class KGE.loss.Loss

  • ns_strategy (function) – negative sampling strategy

  • n_workers (int) – number of workers for negative sampling
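
To make negative_ratio and corrupt_side concrete, here is a minimal NumPy sketch of uniform corruption. It is not the library's UniformStrategy: the helper name uniform_corrupt, its signature, and the reading of 'h+t' as "pick head or tail at random for each negative" are all assumptions made for illustration.

```python
import numpy as np


def uniform_corrupt(triplets, n_entities, negative_ratio, corrupt_side, rng):
    """Hypothetical helper: return negative_ratio corrupted copies of each triplet.

    triplets is an (n, 3) integer array of (h, r, t) indices.
    """
    if corrupt_side not in ("h", "t", "h+t"):
        raise ValueError("corrupt_side must be 'h', 't', or 'h+t'")

    # np.repeat returns a new array, so the positives are left untouched.
    negatives = np.repeat(np.asarray(triplets), negative_ratio, axis=0)
    n = negatives.shape[0]
    random_entities = rng.integers(0, n_entities, size=n)

    if corrupt_side == "h":
        replace_head = np.ones(n, dtype=bool)
    elif corrupt_side == "t":
        replace_head = np.zeros(n, dtype=bool)
    else:
        # Assumption: for 'h+t', choose the corrupted side per negative sample.
        replace_head = rng.random(n) < 0.5

    negatives[replace_head, 0] = random_entities[replace_head]
    negatives[~replace_head, 2] = random_entities[~replace_head]
    return negatives


rng = np.random.default_rng(0)
positives = np.array([[0, 0, 1], [2, 1, 3]])
negatives = uniform_corrupt(positives, n_entities=5, negative_ratio=3,
                            corrupt_side="t", rng=rng)
print(negatives.shape)  # one row per (positive, negative) pair
```

This sketch may draw the original entity again or produce a triplet that is actually true; a real sampler would usually guard against both.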

evaluate(eval_X, corrupt_side, positive_X=None)[source]

Evaluate triplets.

Parameters
  • eval_X (tf.Tensor or np.array) – triplets to be evaluated

  • corrupt_side (str) – which side of the triplets to corrupt, can be 'h' or 't'

  • positive_X (tf.Tensor or np.array, optional) – positive triplets that should be filtered while generating corrupted triplets, by default None (no filter applied)

Returns

evaluation result

Return type

dict
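
The keys of the returned dict are not listed here. As a rough illustration of what a rank-based evaluation result could contain, the sketch below aggregates a list of ranks into common link-prediction metrics. The function name summarize_ranks and the key names are hypothetical, not taken from the library.

```python
import numpy as np


def summarize_ranks(ranks, ks=(1, 3, 10)):
    """Hypothetical helper: aggregate per-triplet ranks into summary metrics."""
    ranks = np.asarray(ranks, dtype=float)
    result = {
        "mean_rank": float(ranks.mean()),
        "mean_reciprocal_rank": float((1.0 / ranks).mean()),
    }
    for k in ks:
        # Fraction of triplets whose true entity is ranked in the top k.
        result[f"hits@{k}"] = float((ranks <= k).mean())
    return result


print(summarize_ranks([1, 2, 4]))
```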

get_rank(x, positive_X, corrupt_side)[source]

Get the rank of one specific triplet.

Parameters
  • x (tf.Tensor or np.array) – rank this triplet

  • positive_X (tf.Tensor or np.array, optional) – positive triplets that should be filtered while generating corrupted triplets; if None, no filter is applied

  • corrupt_side (str) – which side of the triplet to corrupt, can be 'h' or 't'

Returns

ranking result

Return type

int
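
The role of positive_X is easiest to see in a small filtered-ranking sketch. The code below is an illustration only: filtered_rank, its signature, and the convention that a higher score means a more plausible triplet are assumptions, and the library's own ranking may differ in tie handling and other details.

```python
import numpy as np


def filtered_rank(x, score_fn, n_entities, corrupt_side, positive_X=None):
    """Hypothetical helper: rank triplet x among its corruptions on one side.

    score_fn(h, r, t) takes index arrays and returns one score per triplet
    (assumed: higher is better).
    """
    if corrupt_side not in ("h", "t"):
        raise ValueError("corrupt_side must be 'h' or 't'")
    x = np.asarray(x)
    col = 0 if corrupt_side == "h" else 2

    # Replace the chosen side with every entity to get all candidates.
    candidates = np.tile(x, (n_entities, 1))
    candidates[:, col] = np.arange(n_entities)
    scores = score_fn(candidates[:, 0], candidates[:, 1], candidates[:, 2])
    true_score = scores[x[col]]

    keep = np.ones(n_entities, dtype=bool)
    if positive_X is not None:
        positive_X = np.asarray(positive_X)
        other = [c for c in (0, 1, 2) if c != col]
        # Known positives sharing the uncorrupted part are not counted as errors.
        same_context = (positive_X[:, other] == x[other]).all(axis=1)
        keep[positive_X[same_context, col]] = False
    keep[x[col]] = False  # never compare the triplet against itself

    return int(1 + np.sum(scores[keep] > true_score))


def toy_score(h, r, t):
    # Toy scoring function on raw indices, for demonstration only.
    return -np.abs(h + r - t).astype(float)


known = np.array([[0, 1, 1], [0, 1, 2], [0, 1, 3]])
print(filtered_rank((0, 1, 3), toy_score, 5, "t"))         # unfiltered rank
print(filtered_rank((0, 1, 3), toy_score, 5, "t", known))  # filtered rank
```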

restore_model_weights(model_weights)[source]

Restore the model weights.

Parameters

model_weights (dict) – dictionary of model weights to be restored

score_hrt(h, r, t)[source]

Score the triplets.

Each subclass should implement this method with its own scoring function.

Raises

NotImplementedError – the subclass does not implement score_hrt().
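
The override pattern can be sketched without the library itself. In the example below, BaseScorer is a stand-in for the base class and TransEScorer uses a TransE-style score, the negative L2 norm of h + r - t. Both class names, the constructor arguments, and the NumPy embedding tables are assumptions for illustration; a real subclass would derive from KGEModel and work on its own embeddings.

```python
import numpy as np


class BaseScorer:
    """Stand-in base class (hypothetical): score_hrt must be overridden."""

    def score_hrt(self, h, r, t):
        raise NotImplementedError("subclass does not implement score_hrt()")


class TransEScorer(BaseScorer):
    """TransE-style scorer over randomly initialised NumPy embeddings."""

    def __init__(self, n_entities, n_relations, dim, seed=None):
        rng = np.random.default_rng(seed)
        self.ent_emb = rng.normal(size=(n_entities, dim))
        self.rel_emb = rng.normal(size=(n_relations, dim))

    def score_hrt(self, h, r, t):
        # A triplet is more plausible the closer h + r lies to t.
        diff = self.ent_emb[h] + self.rel_emb[r] - self.ent_emb[t]
        return -np.linalg.norm(diff, axis=-1)


model = TransEScorer(n_entities=4, n_relations=2, dim=8, seed=0)
scores = model.score_hrt(np.array([0, 1]), np.array([0, 1]), np.array([2, 3]))
print(scores.shape)  # one score per triplet
```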

train(train_X, val_X, metadata, epochs, batch_size, early_stopping_rounds=None, model_weights_initial=None, restore_best_weight=True, optimizer='Adam', seed=None, log_path='./logs', log_projector=False)[source]

Train the Knowledge Graph Embedding Model.

Parameters
  • train_X (np.ndarray or str) –

    training triplets.

    If np.ndarray, shape should be (n,3) for \((h,r,t)\) respectively.

    If str, training triplets should be saved under this folder path in CSV format; every CSV file should have 3 columns without a header, for \((h,r,t)\) respectively.

  • val_X (np.ndarray or str) –

    validation triplets.

    If np.ndarray, shape should be (n,3) for \((h,r,t)\) respectively.

    If str, validation triplets should be saved under this folder path in CSV format; every CSV file should have 3 columns without a header, for \((h,r,t)\) respectively.

  • metadata (dict) –

    metadata for the KG data. It should have the following keys:

    'ent2ind': dict, dictionary that maps entity to index.

    'ind2ent': list, list that maps index to entity.

    'rel2ind': dict, dictionary that maps relation to index.

    'ind2rel': list, list that maps index to relation.

    KGE.data_utils.index_kg can be used to index the data and get the metadata.

  • epochs (int) – number of epochs

  • batch_size (int) – batch size

  • early_stopping_rounds (int, optional) – number of rounds that trigger early stopping, by default None (no early stopping)

  • model_weights_initial (dict, optional) – initial model weights with specific values, by default None

  • restore_best_weight (bool, optional) – restore the weights from the best iteration if early_stopping_rounds is not None, by default True

  • optimizer (str or tensorflow.keras.optimizers, optional) – optimizer applied in training, by default 'Adam', which uses the default settings of tf.keras.optimizers.Adam

  • seed (int, optional) – random seed for shuffling data and embedding initialization, by default None

  • log_path (str, optional) – path for TensorBoard logging, by default './logs'

  • log_projector (bool, optional) – project the embeddings in the TensorBoard projector tab; setting this to True writes the metadata and embedding tsv files to log_path and projects this data in the TensorBoard projector tab, by default False
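
The four metadata keys described for train can be illustrated with a plain-Python sketch. The helper build_metadata below is hypothetical and is not KGE.data_utils.index_kg; in particular, the order in which indices are assigned (first appearance) is an assumption.

```python
def build_metadata(triplets):
    """Hypothetical helper: build the four metadata mappings from raw triplets.

    triplets is an iterable of (head, relation, tail) labels.
    """
    ent2ind, rel2ind = {}, {}
    ind2ent, ind2rel = [], []
    for h, r, t in triplets:
        for entity in (h, t):
            if entity not in ent2ind:
                # Assumption: indices follow order of first appearance.
                ent2ind[entity] = len(ind2ent)
                ind2ent.append(entity)
        if r not in rel2ind:
            rel2ind[r] = len(ind2rel)
            ind2rel.append(r)
    return {
        "ent2ind": ent2ind,
        "ind2ent": ind2ent,
        "rel2ind": rel2ind,
        "ind2rel": ind2rel,
    }


raw = [("alice", "knows", "bob"), ("bob", "likes", "carol")]
metadata = build_metadata(raw)
# Convert the labelled triplets into (h, r, t) index rows.
indexed = [
    (metadata["ent2ind"][h], metadata["rel2ind"][r], metadata["ent2ind"][t])
    for h, r, t in raw
]
print(indexed)
```

Each list is the inverse of its dictionary, so metadata['ind2ent'][metadata['ent2ind'][e]] returns e for every entity e.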