hodge.html

<html><head><title>niplav</title>
<link href="./favicon.png" rel="shortcut icon" type="image/png"/>
<link href="main.css" rel="stylesheet" type="text/css"/>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
<!DOCTYPE HTML>

<style type="text/css">
code.has-jax {font: inherit; font-size: 100%; background: inherit; border: inherit;}
</style>
<script async="" src="./mathjax/latest.js?config=TeX-MML-AM_CHTML" type="text/javascript">
</script>
<script type="text/x-mathjax-config">
	MathJax.Hub.Config({
	extensions: ["tex2jax.js"],
	jax: ["input/TeX", "output/HTML-CSS"],
	tex2jax: {
		inlineMath: [ ['$','$'], ["\\(","\\)"] ],
		displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
		processEscapes: true,
		skipTags: ['script', 'noscript', 'style', 'textarea', 'pre']
	},
	"HTML-CSS": { availableFonts: ["TeX"] }
	});
</script>
<script>
document.addEventListener('DOMContentLoaded', function () {
	// Change the title to the h1 header
	var title = document.querySelector('h1')
	if(title) {
		var title_elem = document.querySelector('title')
		title_elem.textContent=title.textContent + " – niplav"
	}
});
</script>
</head><body><h2 id="home"><a href="./index.html">home</a></h2>
<p><em>author: niplav, created: 2024-01-29, modified: 2024-01-29, language: english, status: in progress, importance: 4, confidence: possible</em></p>
<blockquote>
<p><strong>What perfect sum of cycles and potential!</strong></p>
</blockquote>
<h1 id="Nothing_to_See_Here_Just_An_Implementation_of_HodgeRank"><a class="hanchor" href="#Nothing_to_See_Here_Just_An_Implementation_of_HodgeRank">Nothing to See Here, Just An Implementation of HodgeRank</a></h1>
<!--TODO: Schizo theories about how Hodge Theory is the solution to
everything?-->
<p><a href="./doc/preference/statistical_ranking_and_combinatorial_hodge_theory_jiang_et_al_2011.pdf" title="Statistical ranking and combinatorial Hodge theory">Jiang et al.
2011</a>
propose a ranking algorithm for incomplete and cyclic data, but neglect
to give code or even pseudo-code for this algorithm. The algorithm turns out
to be easy to implement in Python using networkx and numpy, with some help from <a href="https://medium.com/@zj444/hodgerank-generating-movie-ranking-from-imdb-movie-ratings-part-1-2a88ec148f10" title="HodgeRank: Generating Movie Ranking From IMDb Movie Ratings (Part 1: Problem Formulation)">this
guide</a>.
Since other people might find the code valuable as well, I provide it here
(mostly without comment or explanation).</p>
<pre><code>    import numpy as np
    import networkx as nx
    import itertools as it

    def positive_edges(prefs):
            edges=[]
            for e in it.combinations(prefs.keys(), 2):
                    if np.all(np.isnan(prefs[e[0]]-prefs[e[1]])):
                            weight=np.nan
                    else:
                            weight=np.nanmean(prefs[e[0]]-prefs[e[1]])

                    n=np.sum(~np.isnan(prefs[e[0]]-prefs[e[1]]))
                    if np.isnan(weight):
                            continue
                    elif weight&gt;=0:
                            edges.append((e[0], e[1], {'weight': weight, 'n': n}))
                    else:
                            edges.append((e[1], e[0], {'weight': -weight, 'n': n}))
            return edges

    def prefgraph(prefs):
            g=nx.DiGraph()
            g.add_nodes_from(list(prefs.keys()))
            edges=positive_edges(prefs)
            g.add_edges_from(edges)

    def decompose(g):
            f=np.array([g[e[0]][e[1]]['weight'] for e in g.edges])
            W=np.diag([g[e[0]][e[1]]['n'] for e in g.edges])

            origins=np.zeros((len(g.edges), len(g.nodes)))

            idx=dict()
            nodes=list(g.nodes)
            for i in range(0, len(nodes)):
                    idx[nodes[i]]=i

            origins=np.zeros((len(g.edges), len(g.nodes)))
            c=0
            for e in g.edges:
                    sign=np.sign(g[e[0]][e[1]]['weight'])
                    if np.isnan(sign):
                            sign=0
                    origins[c][e[0]]=sign*-1
                    origins[c][e[1]]=sign
                    c=c+1

            try:
                    s=-np.linalg.pinv(origins.T@W@origins)@origins.T@W@f
            except LinAlgError:
                    s=np.zeros(len(list(g.nodes)))

            values=dict()

            for option in idx.keys():
                    values[option]=s[idx[option]]

            return values

    def hodgerank(prefs):
            g=prefgraph(prefs)
            return decompose(g)
</code></pre>
<p>If <code>decompose</code> can't find a solution to the given preferences, it returns
a vector of 0s as a default result.</p>
<p>The input for <code>hodgerank</code> is a dictionary of preferences, where the
keys are the options and the preferences are arrays of the same size,
empty answers being designated as <code>np.nan</code>. The output is a dictionary
with the options as keys and the corresponding HodgeRank values.</p>
<p>Examples:</p>
<pre><code>    &gt;&gt;&gt; prefs1={0:np.array([5,3]),1:np.array([4,5]),2:np.array([3, np.nan]),3:np.array([5,2])}
    &gt;&gt;&gt; hodgerank(prefs1)
    {0: 0.4642857142857137, 1: 0.7499999999999991, 2: -1.2499999999999987, 3: 0.0357142857142852}

    &gt;&gt;&gt; prefs2={0:np.array([5,3]),1:np.array([4,5]),2:np.array([3, 1]),3:np.array([5,2])}
    &gt;&gt;&gt; hodgerank(prefs2)
    {0: 0.5000000000000002, 1: 0.9999999999999998, 2: -1.4999999999999993, 3: 4.163336342344337e-16}
</code></pre>
<p><code>prefs2</code> is an interesting case: We know from
<a href=".doc/preference/statistical_ranking_and_combinatorial_hodge_theory_jiang_et_al_2011.pdf" title="Statistical ranking and combinatorial Hodge theory">Jiang et al.
2011</a>
that if the preferences are all complete, then HodgeRank is equivalent
to the sum of values under some linear transformation. That linear
transformation here is <code>$(x \cdot 2)+7$</code>, and yields the values
<code>{0: 8, 1: 9, 2: 4, 3: 7}</code>.</p>
<pre><code>    &gt;&gt;&gt; prefs3={0:np.array([4,np.nan,3]),1:np.array([3,4,np.nan]),2:np.array([np.nan, 3,4])}
    &gt;&gt;&gt; hodgerank(prefs3)
    {0: 0.0, 1: 1.3877787807814457e-17, 2: 0.0}

    &gt;&gt;&gt; prefs4={0:np.array([5,np.nan,3]),1:np.array([3,4,np.nan]),2:np.array([np.nan, 3,4])}
    &gt;&gt;&gt; hodgerank(prefs4)
    {0: 0.3333333333333334, 1: -0.3333333333333334, 2: -1.1102230246251565e-16}
</code></pre>
<p>Going from <code>prefs3</code> to <code>prefs4</code> shows that HodgeRank can be manipulated
by the first voter: they simply increase their score of the first option
(which they like most anyway) and thereby make it the highest ranking
option in the global ranking.</p>
<pre><code>    &gt;&gt;&gt; prefs5={0:np.array([5,np.nan,3]),1:np.array([3,4,np.nan]),2:np.array([1, 3,4])}
    &gt;&gt;&gt; hodgerank(prefs5)
    {0: 1.0000000000000002, 1: 5.551115123125783e-17, 2: -1.0000000000000004}

    &gt;&gt;&gt; prefs6={0:np.array([1,1]),1:np.array([1,1]),2:np.array([1,1])}
    &gt;&gt;&gt; hodgerank(prefs6)
    {0: 0.0, 1: 0.0, 2: 0.0}

    &gt;&gt;&gt; prefs7={0:np.array([5,3]),1:np.array([4,5]),2:np.array([3, np.nan]),3:np.array([5,2]),4:np.array([np.nan,2])}
    &gt;&gt;&gt; hodgerank(prefs7)
    {0: 0.6357142857142856, 1: 1.1357142857142855, 2: -0.971428571428572, 3: 0.3142857142857142, 4: -1.114285714285714}

    &gt;&gt;&gt; prefs8={'a':np.array([5,3]),'b':np.array([4,5]),'c':np.array([3, np.nan]),'d':np.array([5,2])}
    &gt;&gt;&gt; hodgerank(prefs8)
    {'a': 0.4642857142857137, 'b': 0.7499999999999991, 'c': -1.2499999999999987, 'd': 0.0357142857142852}
</code></pre>
</body></html>