From 7158470970f2289ac822f6c33a5f2c342fe451f7 Mon Sep 17 00:00:00 2001
From: Daniel Obraczka <daniel@obraczka.de>
Date: Thu, 18 Jul 2024 15:29:06 +0200
Subject: [PATCH 1/2] Add README to run scripts

---
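Note: the entry ranges documented in the README added by this patch can be
sketched as follows. This is only an illustration of the documented ranges;
`describe_entry` and its `embedding_based` parameter are hypothetical names
and are not part of klinker or the run scripts.

```python
# Hypothetical helper, not part of klinker: it only restates the entry
# ranges described in run_scripts/README.md.
def describe_entry(entry: int, embedding_based: bool = True) -> str:
    """Return which embedding setup a job-array entry corresponds to."""
    # Methods without embeddings only define entries 0-15.
    upper = 31 if embedding_based else 15
    if not 0 <= entry <= upper:
        raise ValueError(f"entry must be in 0-{upper}, got {entry}")
    if not embedding_based:
        return "no embeddings"
    if entry <= 15:
        return "sentence transformer"
    if entry <= 23:
        return "SIF fasttext"
    # Entries 24-31 additionally need the reduced embeddings file.
    return "SIF fasttext, dimensionality-reduced embeddings required"


if __name__ == "__main__":
    for entry in (0, 16, 24):
        print(entry, describe_entry(entry))
```

For example, entry 16 (used in the README's sample invocation) falls into the
SIF fasttext range.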
 run_scripts/README.md | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100644 run_scripts/README.md

diff --git a/run_scripts/README.md b/run_scripts/README.md
new file mode 100644
index 0000000..446ae0d
--- /dev/null
+++ b/run_scripts/README.md
@@ -0,0 +1,38 @@
+# Installation
+
+To reproduce our results, clone the repository and check out the `paper` tag to get the state at which the experiments were run:
+
+```
+git clone https://github.com/dobraczka/klinker.git
+cd klinker
+git checkout paper
+```
+
+Create a virtual environment with micromamba and install the dependencies:
+
+```
+micromamba env create -n klinker-conda --file=klinker-conda.yaml
+micromamba activate klinker-conda
+pip install -e ".[all]"
+```
+
+# Running the experiments
+We originally ran our experiments as SLURM job arrays. We adapted our code so it can be run without SLURM, but kept the arrays.
+For each embedding-based method, entries 0-15 use sentence transformer embeddings and entries 16-31 rely on SIF-aggregated fasttext embeddings.
+Entries 24-31 expect the dimensionality-reduced fasttext embeddings in `~/.data/klinker/word_embeddings/100wiki.en.bin`.
+For methods without embeddings (`non_relational/run_token.sh` and `relational/run_relational_token.sh`), only entries 0-15 exist.
+
+You can reduce the dimensionality of the fasttext embeddings like this:
+```
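+import os
+import fasttext
+import fasttext.util
+ft = fasttext.load_model('wiki.en.bin')
+fasttext.util.reduce_model(ft, 100)  # reduces the loaded model in place to 100 dimensions
+ft.save_model(os.path.expanduser("~/.data/klinker/word_embeddings/100wiki.en.bin"))  # fasttext does not expand "~" itself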
+```
+
+The experiments can then be run individually by supplying the desired entry as the first argument, e.g.:
+```
+bash run_scripts/relational/run_token_attribute.sh 16
+```

From fd0d7f5a5712a624c6e93e8190dbfa536270c36b Mon Sep 17 00:00:00 2001
From: Daniel Obraczka <daniel@obraczka.de>
Date: Thu, 18 Jul 2024 22:53:50 +0200
Subject: [PATCH 2/2] Added zenodo

---
 README.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/README.md b/README.md
index e5d7675..9a6edec 100644
--- a/README.md
+++ b/README.md
@@ -88,5 +88,4 @@ micromamba run -n klinker-conda python experiment.py movie-graph-benchmark-datas
 ```
 This would be similar to the steps described in the above usage section.
 
-In order to precisely reproduce the results from the paper we provide (adapted) run scripts from our SLURM batch scripts in the `run_scripts` folder.
-We recommend to `git checkout paper` to checkout out the tagged commit on which the experiments were run since future development does not aim to be backwards compatible with this state.
+To precisely reproduce the results from the paper, we provide run scripts (adapted from our SLURM batch scripts) in the `run_scripts` folder. Please consult `run_scripts/README.md` for further information. The experiment artifacts and the source code are archived on [Zenodo](https://zenodo.org/records/12774407).