Completed

Perl short-text classifier to guess a person's ethnicity from their name

Your job is to create a simple short-string classifier in Perl. The input to the system is a person's name, that is, a UTF-8 encoded string usually 10 to 40 characters long, and the system determines which of the predefined classes the string belongs to. The classes are ethnicity groups.

For example, if the input to the system is "John Smith", the system would output the class "English", and if the input is "hiromi akiyama", the system would output "Japanese". There are 18 different classes (ethnicity groups).

The system has two parts: 1) A training script called [url removed, login to view], which trains the system on the given training data (a list of known 'name = ethnicity' pairs) and saves the "trained state" of the system to disk. The script is called with "perl [url removed, login to view] [url removed, login to view]".
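The training step described above could be sketched roughly as follows. This is only one possible approach, not something the brief prescribes: it assumes character trigrams as features and the core Storable module for the on-disk "trained state". The names ngrams, train and save_model, and the layout of the model hash, are hypothetical.

```perl
use strict;
use warnings;
use Storable qw(nstore);

# Split a name into overlapping character n-grams.
# '^' and '$' are padding markers for the start and end of the name (an assumption).
sub ngrams {
    my ($name, $n) = @_;
    my $padded = '^' . lc($name) . '$';
    return map { substr($padded, $_, $n) } 0 .. length($padded) - $n;
}

# Build per-class n-gram counts from a list of [name, class] pairs.
sub train {
    my ($pairs, $n) = @_;
    my %model = (n => $n, docs => {}, grams => {}, total => {}, vocab => {});
    for my $pair (@$pairs) {
        my ($name, $class) = @$pair;
        $model{docs}{$class}++;                 # training names seen per class
        for my $g (ngrams($name, $n)) {
            $model{grams}{$class}{$g}++;        # n-gram count within the class
            $model{total}{$class}++;            # all n-grams seen in the class
            $model{vocab}{$g} = 1;              # set of distinct n-grams
        }
    }
    return \%model;
}

# Write the trained state to disk in a portable byte order.
sub save_model {
    my ($model, $path) = @_;
    nstore($model, $path);
    return;
}
```

An analyzer could later read the same structure back with Storable's retrieve function.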

2) An analyzer script, [url removed, login to view], which loads the "trained state" from disk (generated earlier by the training script) and uses it to determine which class a given string belongs to. The script is called either as "perl [url removed, login to view] [url removed, login to view]", in which case it loads the given test file, or as "perl [url removed, login to view] "john smith"", in which case it simply analyzes (classifies) the string given on the command line ("john smith" in this case).
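The analyzer accepts either a test file or a literal name as its single argument, so it has to tell the two apart. A minimal sketch follows, assuming that an argument naming an existing file is treated as the test file and anything else as a name to classify. The brief does not say how the distinction should be made, and input_mode and name_from_arg are hypothetical helper names.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Decide how to treat the analyzer's single argument:
#   'file'   - path to an existing file of test data
#   'string' - a literal name to classify
#   'usage'  - nothing usable was given
sub input_mode {
    my ($arg) = @_;
    return 'usage' unless defined $arg && length $arg;
    return (-f $arg) ? 'file' : 'string';
}

# Command-line arguments arrive as raw bytes. Decoding them (assuming a
# UTF-8 terminal) lets character-based features see characters, not bytes.
sub name_from_arg {
    my ($arg) = @_;
    return decode('UTF-8', $arg);
}
```

A name that happens to match a file in the current directory would be misread as a path under this rule, which is one reason to treat it as a sketch only.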

Attached is data.zip. It contains [url removed, login to view] and testing_data.txt. The data is in the format "name:class", where the name is base64 encoded.
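Reading one line of the "name:class" format could look like the sketch below. It assumes standard base64 (which never contains a colon, so the first colon separates the two fields) and that the decoded bytes are UTF-8, as the brief states for names. parse_line is a hypothetical name.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);
use Encode qw(decode);

# Parse one "name:class" line where the name part is base64-encoded UTF-8.
# Returns (name, class), or an empty list for lines that do not fit the format.
sub parse_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;                  # strip trailing newline or CR/LF
    my $sep = index($line, ':');         # first colon splits name from class
    return if $sep < 1;
    my $name  = decode('UTF-8', decode_base64(substr($line, 0, $sep)));
    my $class = substr($line, $sep + 1);
    return ($name, $class);
}
```

The regular expression here only trims whitespace from the input line; it plays no part in classification.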

Your system must be trainable using the given [url removed, login to view] so that it classifies [url removed, login to view] with 90% or better accuracy.
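The 90% requirement can be checked by comparing predicted and labelled classes over the test pairs. A small sketch, assuming accuracy means the fraction of exact class matches; the brief does not define it further. The accuracy function and the classifier code reference passed to it are hypothetical.

```perl
use strict;
use warnings;

# Fraction of [name, class] pairs whose predicted class equals the label.
# $classify is a code reference mapping a name to a class.
sub accuracy {
    my ($pairs, $classify) = @_;
    return 0 unless @$pairs;
    my $correct = grep { $classify->($_->[0]) eq $_->[1] } @$pairs;
    return $correct / @$pairs;
}
```

A result of 0.9 or higher on the test data would meet the stated target.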

Notice: The solution must be some kind of training-based solution, for example a Bayesian classifier, an n-gram analyzer, or artificial intelligence or machine learning of some sort. The solution must not be based on regular expressions or a fixed (human-written) set of detection rules.
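As one illustration of a training-based approach, the sketch below scores a list of features (for instance character n-grams of a name) with a multinomial naive Bayes model using add-one smoothing. It is an assumption about how such a classifier might look, not a required design. The function name classify_features and the model layout (docs, grams, total and vocab hashes holding learned counts) are hypothetical.

```perl
use strict;
use warnings;

# Assumed model layout (all counts learned from training data):
#   docs  => { class => number of training names }
#   grams => { class => { feature => count } }
#   total => { class => total feature count in that class }
#   vocab => { feature => 1 }   distinct features over all classes
sub classify_features {
    my ($model, @features) = @_;
    my $vocab_size = keys(%{ $model->{vocab} }) || 1;
    my $docs_total = 0;
    $docs_total += $_ for values %{ $model->{docs} };

    my ($best_class, $best_score);
    for my $class (sort keys %{ $model->{docs} }) {
        # Log prior: how common the class is in the training data.
        my $score = log($model->{docs}{$class} / $docs_total);
        my $denom = ($model->{total}{$class} // 0) + $vocab_size;
        for my $f (@features) {
            my $count = $model->{grams}{$class}{$f} // 0;
            $score += log(($count + 1) / $denom);   # add-one smoothing
        }
        ($best_class, $best_score) = ($class, $score)
            if !defined $best_score || $score > $best_score;
    }
    return $best_class;
}
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing keeps an unseen feature from zeroing out a class.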

You are free to use any existing free Perl code, libraries and modules, such as AI or data classifier libraries.

Skills: Perl


About the employer:
( 601 reviews ) Turku, Thailand

Project ID: #16538057

Awarded to:

kchwistek

Hi, I am quite an experienced programmer who knows several programming languages. Your project is interesting. I studied Computer Science in the past, and AI is a topic I like to think about. Unfortunately t...

$165 USD in 10 days
(2 reviews)
3.5

5 freelancers are bidding an average of $156 for this job

freelance4hire80

Hi, I've checked the project spec. I can write a Perl script for you, for example using Algorithm::NaiveBayes, to predict a person's ethnicity from the training data sets.

$155 USD in 3 days
(49 reviews)
6.6
$155 USD in 3 days
(0 reviews)
0.0
$150 USD in 3 days
(0 reviews)
0.0
balu0priya1

I would like to work on this project as I have enough experience in Perl. If interested, please let me know.

$155 USD in 3 days
(0 reviews)
0.0