Some parameters can only be used for learning (L), some only for prediction (P), and some for both (L+P). A bold X means that you have to set this parameter.
| Parameter | L | P | Default | Description |
|---|---|---|---|---|
| PATH_TRAINING | X | X | | Filepath to training data |
| PATH_VALID | | X | | Filepath to validation data (used by AnyBURL to filter rankings); point to an empty file if there is no validation set |
| PATH_TEST | | X | | Filepath to test data (used by AnyBURL to filter rankings); point to an empty file if there is no test set |
| PATH_RULES | | X | | Filepath to previously learned rules |
| PATH_OUTPUT | X | X | | Filepath where the rules / predictions are stored; the snapshot second is appended. |
| SATURATION | X | | 0.99 | The higher the number, the more rules are found before the algorithm moves on to the next higher path length. |
| SAMPLE_SIZE | X | | 500 | Size of the sample used to compute the approximated confidence value |
| BATCH_TIME | X | | 1000 | Batch time in milliseconds; should not be set to a lower value. |
| THRESHOLD_CONFIDENCE | X | X | 0.01 | Only rules above this threshold AND above the following one are stored / used for prediction. In the prediction phase you may have to use THRESHOLD_APPLIED_CONFIDENCE instead of THRESHOLD_CONFIDENCE; if in doubt, specify both parameters with the same value. |
| THRESHOLD_CORRECT_PREDICTIONS | X | X | 2 | Only rules with at least n correct predictions are stored / used for prediction. |
| MAX_LENGTH_CYCLIC | X | X | 3 | The maximal number of body atoms in cyclic rules (inclusive). Once this number is exceeded, only the other type of rule is searched for. |
| MAX_LENGTH_ACYCLIC | X | X | 2 | The maximal number of body atoms in acyclic rules (inclusive). Once this number is exceeded, only the other type of rule is searched for. |
| SNAPSHOTS_AT | X | X | 10,100 | By default, the rules learned after 10 and 100 seconds are stored. AnyBURL terminates after the last snapshot. Change as needed. |
| WORKER_THREADS | X | X | 3 | Setting this to n-1 is recommended if you have n cores. |
| UNSEEN_NEGATIVE_EXAMPLES | | X | 5 | Number of negative examples added to the denominator of the confidence computation, as a pessimistic variant of Laplace smoothing. This number affects prediction (Apply) only! |
| AGGREGATION_TYPE | | X | maxplus | Choose between noisyor and maxplus; we strongly recommend maxplus. |
| TOP_K_OUTPUT | | X | 10 | The top k candidates are written as output; choose at least 10 if you want to compute hits@10 ... |
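To make the effect of UNSEEN_NEGATIVE_EXAMPLES and the two thresholds more concrete, here is a minimal Python sketch. It is not AnyBURL's code: the function names are hypothetical, and it assumes that "added in the denominator" means the confidence is computed as correct predictions divided by all predictions plus the extra negatives. Which confidence value (raw or smoothed) AnyBURL compares against THRESHOLD_CONFIDENCE is left open here, so the filter simply takes a confidence as input.

```python
def smoothed_confidence(correct, predicted, unseen_negative_examples=5):
    """Confidence with extra negative examples added to the denominator.

    Assumption: confidence = correct / (predicted + unseen_negative_examples),
    where `predicted` counts all predictions the rule makes.
    """
    return correct / (predicted + unseen_negative_examples)


def keep_rule(confidence, correct,
              threshold_confidence=0.01,
              threshold_correct_predictions=2):
    """Apply both thresholds from the table (hypothetical helper).

    A rule is kept only if its confidence is above THRESHOLD_CONFIDENCE
    AND it has at least THRESHOLD_CORRECT_PREDICTIONS correct predictions.
    """
    return (confidence > threshold_confidence
            and correct >= threshold_correct_predictions)
```

Under this reading, a rule with 8 correct out of 10 predictions drops from 0.8 to 8/15 with the default of 5, so rules backed by few predictions are penalised most.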
Examples of minimal configuration files can be found here and here.
This is another example of learning only high-quality rules on a machine with many cores.
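As a rough illustration of what such a configuration contains, the following Python sketch assembles a small learning configuration from the parameters in the table. Several things are assumptions: the `KEY = VALUE` line layout, the helper names, the example paths, and the choice to treat PATH_TRAINING and PATH_OUTPUT as mandatory because the table lists no default for them. WORKER_THREADS follows the n-1 recommendation.

```python
import os


def render_config(params):
    """Render parameters as one KEY = VALUE line each (assumed file layout)."""
    return "\n".join(f"{key} = {value}" for key, value in params.items()) + "\n"


def learning_config(path_training, path_output, cores=None):
    """Build a small learning configuration (hypothetical helper)."""
    # Assumption: both paths must be set, since the table gives no default.
    if not path_training or not path_output:
        raise ValueError("PATH_TRAINING and PATH_OUTPUT must be set")
    # Fall back to the detected core count, or 2 if it cannot be determined.
    cores = cores or os.cpu_count() or 2
    return {
        "PATH_TRAINING": path_training,
        "PATH_OUTPUT": path_output,
        "SNAPSHOTS_AT": "10,100",
        # Table recommendation: n-1 threads on a machine with n cores.
        "WORKER_THREADS": max(1, cores - 1),
    }
```

Calling `render_config(learning_config("data/train.txt", "rules/alpha", cores=8))` yields four lines, the last being `WORKER_THREADS = 7`; parameters that are not listed keep the defaults from the table.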
In both learning and prediction, the results (rules or predictions) are stored with the snapshot second as a suffix. If learning terminates before the final snapshot is reached (because saturation has been reached at the maximum rule length for both rule types), the rules are stored with the suffix 0.
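The naming scheme can be sketched in a few lines of Python, for instance to check which files a run should have produced. The helper names are hypothetical, and the separator between PATH_OUTPUT and the snapshot second is an assumption (a hyphen is used here); adjust it to whatever your AnyBURL version actually writes.

```python
def snapshot_outputs(path_output, snapshots_at, separator="-"):
    """List the output paths expected for each snapshot second.

    Assumption: the path is PATH_OUTPUT + separator + second.
    """
    seconds = [int(s) for s in snapshots_at.split(",") if s.strip()]
    return [f"{path_output}{separator}{sec}" for sec in seconds]


def early_termination_output(path_output, separator="-"):
    """Path used when learning stops before the final snapshot (suffix 0)."""
    return f"{path_output}{separator}0"
```

With `PATH_OUTPUT = rules/alpha` and the default `SNAPSHOTS_AT = 10,100`, this gives `rules/alpha-10` and `rules/alpha-100`, plus `rules/alpha-0` for a run that saturates early.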