Synopsis

cleora [OPTIONS] [inputs]...

Configuration Options

Input Files

File Type

Info

In a TSV file, columns should be separated by TABs, and values within one column's row should be separated by spaces. In the JSON format, multiple values are provided as an array.
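As a rough illustration of the layout described above, the sketch below splits one TSV row into columns and then into values. The helper name `parse_tsv_row` is hypothetical, and the JSON record at the end uses assumed key names (`user`, `product`); only the "array for multiple values" convention comes from the description above.

```python
import json
from typing import List


def parse_tsv_row(line: str) -> List[List[str]]:
    """Split a row into columns on TAB, then each column into values on spaces."""
    columns = line.rstrip("\n").split("\t")
    return [column.split() for column in columns]


# One row with a single-value column and a three-value column.
row = parse_tsv_row("u1\tp1 p2 p3\n")
print(row)

# Assumed JSON counterpart of the same row: multiple values become an array.
# The key names here are illustrative, not taken from the Cleora documentation.
record = json.loads('{"user": "u1", "product": ["p1", "p2", "p3"]}')
print(record["product"])
```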

Dimension

Number of Iterations

Columns

Example of column configuration:
The picture below presents an example of data with column configurations and the resulting graphs. The input file for this example should consist of TAB-separated columns, with space-separated values within a single column's row:

u1 <\t> p1 p2 p3
u2 <\t> p2 p4

[Figure: example use case of column modifiers]

Output Directory

Relation Name

Example:
The file with embeddings, generated based on two columns user and product, is by default called emb__user__product.out. When we set --relation-name to purchase, the file will be called purchase__user__product.out.
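The naming pattern in this example can be sketched as a small helper. This is an inference from the two file names above, not a documented rule: the function name `embedding_file_name` is hypothetical, and the `.out` extension is assumed to stay the same regardless of other options.

```python
from typing import Sequence


def embedding_file_name(columns: Sequence[str], relation_name: str = "emb") -> str:
    """Join the relation name and column names with double underscores."""
    # Assumption: the pattern is <relation>__<col1>__<col2>.out, as in the example.
    return "__".join([relation_name, *columns]) + ".out"


print(embedding_file_name(["user", "product"]))
print(embedding_file_name(["user", "product"], relation_name="purchase"))
```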

Prepend Field Name

Example:
Consider an example of embeddings generated for the product column. In each row of the output file, we have the product identifier, the number of nodes with this product (indicating in how many rows of our data the product occurred), and the generated embedding:

1388 120 0.03775605 -0.00534315 -0.07677672 -0.033221997 -0.11690934 0.07979556 -0.047545113 0.04019881 0.11354096 0.09381865 0.0139150405 -0.041348357 
where the first number is the product ID, the second is the number of nodes, and the subsequent numbers comprise the embedding vector.

If we set --prepend-field-name to 1, we will get:

product__1388 120 0.03775605 -0.00534315 -0.07677672 -0.033221997 -0.11690934 0.07979556 -0.047545113 0.04019881 0.11354096 0.09381865 0.0139150405 -0.041348357 
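When reading such a file back, the field name may need to be separated from the identifier again. A minimal sketch follows, assuming the separator is always a double underscore as in the row above and that the field name itself contains no double underscore; `split_prefixed_id` is a hypothetical helper name.

```python
from typing import Tuple


def split_prefixed_id(identifier: str) -> Tuple[str, str]:
    """Split 'field__entity' into (field, entity) at the first double underscore."""
    field, separator, entity = identifier.partition("__")
    if not separator:
        # No prefix present: the file was probably written with --prepend-field-name 0.
        raise ValueError(f"no field name prefix in {identifier!r}")
    return field, entity


print(split_prefixed_id("product__1388"))
```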

Log Every N

In-Memory Embedding Calculation

Output Format

Seed

Version

Examples of Cleora Configuration

For input file1.tsv:

user1    product7 product2 product10
user2    product11
user3    product1 product2 product11 product13
run:
chmod +x cleora
./cleora --type tsv \
         --columns="user complex::reflexive::product" \
         --dimension 128 \
         --number-of-iterations 5 \
         --relation-name=test_relation_name \
         --prepend-field-name 0 \
         file1.tsv 

Note

Before the first run, ensure that the Cleora binary file has execute permissions (chmod +x).
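If Cleora is launched from a script, the same invocation can be assembled as an argument list. The sketch below only uses the flags shown in the command above; the function name `build_cleora_command` and its parameters are hypothetical, and the binary path `./cleora` is an assumption about where the executable lives.

```python
import shlex
from typing import List


def build_cleora_command(input_file: str, columns: str, dimension: int,
                         iterations: int, relation_name: str,
                         prepend_field_name: int = 0,
                         binary: str = "./cleora") -> List[str]:
    """Build the argument list for the example invocation shown above."""
    return [
        binary,
        "--type", "tsv",
        "--columns=" + columns,
        "--dimension", str(dimension),
        "--number-of-iterations", str(iterations),
        "--relation-name=" + relation_name,
        "--prepend-field-name", str(prepend_field_name),
        input_file,
    ]


cmd = build_cleora_command("file1.tsv", "user complex::reflexive::product",
                           dimension=128, iterations=5,
                           relation_name="test_relation_name")
# Print a shell-quoted form for inspection; nothing is executed here.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; passing a list instead of a single string avoids shell quoting problems with the space-separated `--columns` value.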

Output Format

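Output files are saved in the current location or under --output-directory. Each row in the file consists of the entity identifier, the number of times the entity occurred in the input data, and the embedding vector.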

Example:
We use Cleora to create embeddings for the product column. We get an output file in which each row stores the product ID, the number of times the product occurred in the input data (which translates to the number of nodes), and the embedding of this particular product.

1388 120 0.03775605 -0.00534315 -0.07677672 -0.033221997 -0.11690934 0.07979556 -0.047545113 0.04019881 0.11354096 0.09381865 0.0139150405 -0.041348357 
Cleora produces one file for each relation configured from the columns in the data and the provided column modifiers. For details, see Cleora Algorithm Overview.
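A file in the row layout shown above can be read into a dictionary with a few lines of Python. This is a minimal sketch that assumes a whitespace-separated text output; `load_embeddings` is a hypothetical helper, and skipping lines with fewer than three fields is a defensive assumption, not documented behaviour.

```python
import io
from typing import Dict, List, TextIO, Tuple


def load_embeddings(handle: TextIO) -> Dict[str, Tuple[int, List[float]]]:
    """Map each entity ID to (occurrence count, embedding vector)."""
    embeddings = {}
    for line in handle:
        parts = line.split()
        if len(parts) < 3:
            # Defensive: skip blank or otherwise incomplete lines.
            continue
        entity_id, count, values = parts[0], int(parts[1]), parts[2:]
        embeddings[entity_id] = (count, [float(value) for value in values])
    return embeddings


# In-memory stand-in for an output file, using a shortened row from the example.
sample = io.StringIO("1388 120 0.03775605 -0.00534315 -0.07677672\n")
embeddings = load_embeddings(sample)
count, vector = embeddings["1388"]
print(count, len(vector))
```

For a real file, an open file object can be passed in place of the `io.StringIO` stand-in.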