SplitterPytorch

A PyTorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).

Splitter: Learning Node Representations that Capture Multiple Social Contexts. Alessandro Epasto and Bryan Perozzi. WWW, 2019. [Paper]

This code is based on https://github.com/benedekrozemberczki/Splitter for the splitter codes, and node2vec https://github.com/eliorc/node2vec. I reorganize the codes and fix some bugs for my research. So,this codes works pretty well on emperical dataset, and very easy to use. The original Tensorflow implementation is available [here]. But there is only codes for generating persona, not embedding codes...

Requirements

The codebase is implemented in Python 3.5.2. package versions used for development are just below.

networkx          1.11
tqdm              4.28.1
numpy             1.15.4
pandas            0.23.4
texttable         1.5.0
scipy             1.1.0
argparse          1.1.0
torch             0.4.1
gensim            3.6.0

Datasets - inputs

The code takes the edge list of the graph in a csv file. You can easily make the edgelist file with networkx function nx.write_edgelist

Outputs

There are 4 outputs on Splitter

Persona graph Persona graph is a result graph of ego-splitting. File format is edgelist, which is same to format of inputs
Persona mapping Persona mapping is a dict that connnect orginal node and splitted persona nodes. Bascially, relation between two entities is 1 to M relations. File format is json.
Base embedding Result embedding of Node2vec on orignal graph, it used as base embedding of splitter. File format is word2vec fomrat of gensim. You can read a result with gensim.models.Keyedvectors.load_word2vec_format()
Persona embedding Result embedding of Spitter on persona graph. This embedding is final results of this resposiotry. File format is pickle(.pkl).

Options

The training of a Splitter embedding is handled by the src/main.py script which provides the following command line arguments.

Input and output options

  --input [INPUT]       Input graph path
  --persona-graph [PERSONA_GRAPH]
                        Persona network path.
  --persona-mapping [PERSONA_MAPPING]
                        Persona mapping path.
  --emb_base [EMB_BASE]
                        Base Embeddings path
  --emb_persona [EMB_PERSONA]
                        Persona Embeddings path

Model options

    --dimensions DIMENSIONS
                        Number of dimensions. Default is 128.
  --walk-length WALK_LENGTH
                        Length of walk per source. Default is 80.
  --num-walks NUM_WALKS
                        Number of walks per source. Default is 10.
  --window-size WINDOW_SIZE
                        Context size for optimization. Default is 10.
  --base_iter BASE_ITER
                        Number of epochs in base embedding
  --p P                 Return hyperparameter for random-walker. Default is 1.
  --q Q                 Inout hyperparameter for random-walker. Default is 1.
  --learning-rate LEARNING_RATE
                        Learning rate. Default is 0.01.
  --lambd LAMBD         Regularization parameter. Default is 0.1.
  --negative-samples NEGATIVE_SAMPLES
                        Negative sample number. Default is 5.
  --seed SEED           Random seed for PyTorch. Default is 42.
  --workers WORKERS     Number of parallel workers. Default is 8.
  --weighted            Boolean specifying (un)weighted. Default is
                        unweighted.
  --unweighted
  --directed            Graph is (un)directed. Default is undirected.
  --undirected

Examples

The following commands learn an embedding and save it with the persona map. Training a model on the default dataset.

python src/main.py

Training a Splitter model with 32 dimensions.

python src/main.py --dimensions 32

Increasing the number of walks and the walk length.

python src/main.py --number-of-walks 20 --walk-length 80

Name		Name	Last commit message	Last commit date
Latest commit History 122 Commits
emb		emb
graph		graph
mapping		mapping
src		src
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

SplitterPytorch

Requirements

Datasets - inputs

Outputs

Options

Input and output options

Model options

Examples

About

Uh oh!

Releases

Packages

Contributors 2

Uh oh!

Languages

License

jisungyoon/SplitterPytorch

Folders and files

Latest commit

History

Repository files navigation

SplitterPytorch

Requirements

Datasets - inputs

Outputs

Options

Input and output options

Model options

Examples

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Uh oh!

Languages

Packages