# Mathematician wanted

Hello,

We're comparing several thousand texts to calculate their similarity, based on Jaccard distance. The number of texts we compare can go up to 100,000. Each text has a reference number.
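For reference, here is a minimal sketch of the kind of comparison meant here. It assumes each text is reduced to a set of lowercase, whitespace-separated words; the actual tokenization we use may differ, and the function names are only illustrative.

```python
def jaccard_similarity(text_a: str, text_b: str) -> float:
    """Jaccard similarity of two texts: |A ∩ B| / |A ∪ B| over their word sets."""
    # Assumption: words are lowercase, whitespace-separated tokens.
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a and not words_b:
        return 1.0  # convention chosen here: two empty texts count as identical
    return len(words_a & words_b) / len(words_a | words_b)


def jaccard_distance(text_a: str, text_b: str) -> float:
    """Jaccard distance is the complement of the similarity."""
    return 1.0 - jaccard_similarity(text_a, text_b)
```

With this convention, a similarity of 1.0 means the two word sets are identical and 0.0 means they share no word.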

At the end of this comparison process, we get (n*(n-1))/2 values, where n is the number of texts we compared.
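To give an idea of the scale, the short sketch below evaluates that formula for a few sizes. The helper name `pair_count` is only illustrative.

```python
from math import comb


def pair_count(n: int) -> int:
    """Number of unordered pairs among n texts: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# The formula is the binomial coefficient "n choose 2".
assert pair_count(1_000) == comb(1_000, 2)

for n in (1_000, 10_000, 100_000):
    print(n, pair_count(n))
# 1000 499500
# 10000 49995000
# 100000 4999950000
```

At the upper end (100,000 texts) this is close to 5 billion values, so any method that needs all of them in memory at once has to be planned carefully.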

We now want to extract the least similar texts after this comparison work. We want to have 2 options: extract the k least similar texts, or extract all the texts with a maximum similarity of p %.
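The exact definition of "least similar" is part of what we need help with. As a starting point for discussion only, the sketch below shows one possible reading of each option. It assumes the pairwise results are available as a dictionary mapping pairs of reference numbers to a similarity between 0 and 1; the function names and the data layout are hypothetical. For the threshold option it uses a simple greedy heuristic, which returns a valid set (no two selected texts exceed the threshold) but not necessarily the largest possible one.

```python
from collections import defaultdict


def select_below_threshold(refs, similarity, max_similarity):
    """Greedily pick texts so that no two picked texts exceed max_similarity.

    refs: list of reference numbers.
    similarity: dict mapping (ref_a, ref_b) to a similarity in [0, 1].
    """
    # Two texts "conflict" when their similarity is above the threshold.
    conflicts = defaultdict(set)
    for (a, b), s in similarity.items():
        if s > max_similarity:
            conflicts[a].add(b)
            conflicts[b].add(a)

    selected, chosen = [], set()
    # Heuristic: consider the texts with the fewest conflicts first.
    for ref in sorted(refs, key=lambda r: (len(conflicts[r]), r)):
        if conflicts[ref].isdisjoint(chosen):
            selected.append(ref)
            chosen.add(ref)
    return selected


def k_least_similar(refs, similarity, k):
    """Rank each text by its highest similarity to any other text; keep the k lowest."""
    peak = {ref: 0.0 for ref in refs}
    for (a, b), s in similarity.items():
        peak[a] = max(peak[a], s)
        peak[b] = max(peak[b], s)
    return sorted(refs, key=lambda r: (peak[r], r))[:k]


refs = [1, 2, 3, 4]
similarity = {
    (1, 2): 0.9, (1, 3): 0.2, (1, 4): 0.1,
    (2, 3): 0.8, (2, 4): 0.3, (3, 4): 0.05,
}
print(select_below_threshold(refs, similarity, 0.5))  # [4, 1, 3]
print(k_least_similar(refs, similarity, 2))           # [4, 3]
```

Finding the largest set of texts that are all pairwise below a threshold is, in general, a maximum independent set problem, which is why we are looking for a mathematician rather than relying on a heuristic like this one.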

We also want to generate a table where we'd get the following information: the number of texts we can extract with a maximum similarity ratio of x %, with x going from 0 to 100 by increments of 1.
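As an illustration of the expected output format, here is a rough sketch that builds such a table. It reuses the same assumptions as before (a dictionary of pairwise similarities between 0 and 1, hypothetical function names) and the same greedy heuristic, so each count is only a lower bound on the true maximum for that threshold. It also recomputes everything for each of the 101 thresholds, which would be far too slow at 100,000 texts.

```python
from collections import defaultdict


def count_below_threshold(refs, similarity, max_similarity):
    """Size of a greedily built set in which no pair exceeds max_similarity."""
    conflicts = defaultdict(set)
    for (a, b), s in similarity.items():
        if s > max_similarity:
            conflicts[a].add(b)
            conflicts[b].add(a)
    chosen = set()
    # Texts with the fewest conflicts are considered first.
    for ref in sorted(refs, key=lambda r: (len(conflicts[r]), r)):
        if conflicts[ref].isdisjoint(chosen):
            chosen.add(ref)
    return len(chosen)


def similarity_table(refs, similarity):
    """Return (x, count) rows for x = 0, 1, ..., 100 (x is a percentage)."""
    return [
        (x, count_below_threshold(refs, similarity, x / 100))
        for x in range(101)
    ]


refs = [1, 2, 3, 4]
similarity = {
    (1, 2): 0.9, (1, 3): 0.2, (1, 4): 0.1,
    (2, 3): 0.8, (2, 4): 0.3, (3, 4): 0.05,
}
table = similarity_table(refs, similarity)
for x in (0, 50, 100):
    print(table[x])
# (0, 1)
# (50, 3)
# (100, 4)
```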

To address this part, we need to hire a mathematician. The calculation/algorithm will then be implemented by our developer.

Best regards,

Fab.

About the employer:
(34 reviews) PARIS, France

Project ID: #20296600

