The input to the project is free-form financial narrative from companies that seek to repurchase stock.
The output is a well-formed row of data that cannot normally be extracted using ad-hoc string
manipulations. Someone with a natural language processing/AI background and good Java experience is needed
for this.
The output looks like this:
symbol,min,max,TD,oddLotsHonored,tdp
vr,30.50,33.50,6/5/12,yes, yes
Symbol is the stock symbol: VR
Min is the minimum price
Max is the maximum price
TD is the termination date
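OddLotsHonored: yes if odd lots are honored
TDP: yes if three-day protect is in effect, i.e. the offer mentions delivery within three business (or trading) days of the Notice of Guaranteed Delivery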
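As a rough illustration of the row described above, here is a minimal Java sketch that holds the six fields and prints them in the header's column order. The class name TenderOfferRow and its members are hypothetical. It assumes the date layout is month/day/two-digit year (inferred from the sample value 6/5/12), that the symbol is written in lowercase as in the sample row, and that a false flag is written as "no" (the posting only shows "yes").

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical holder for one output row; names are illustrative, not from the posting. */
final class TenderOfferRow {
    static final String HEADER = "symbol,min,max,TD,oddLotsHonored,tdp";

    // Assumed layout, inferred from the sample value 6/5/12.
    private static final DateTimeFormatter TD_FORMAT =
            DateTimeFormatter.ofPattern("M/d/yy", Locale.US);

    private final String symbol;
    private final BigDecimal min;
    private final BigDecimal max;
    private final LocalDate terminationDate;
    private final boolean oddLotsHonored;
    private final boolean tdp;

    TenderOfferRow(String symbol, BigDecimal min, BigDecimal max,
                   LocalDate terminationDate, boolean oddLotsHonored, boolean tdp) {
        this.symbol = symbol;
        this.min = min;
        this.max = max;
        this.terminationDate = terminationDate;
        this.oddLotsHonored = oddLotsHonored;
        this.tdp = tdp;
    }

    /** Formats the row in the same column order as HEADER. */
    String toCsv() {
        return String.join(",",
                symbol.toLowerCase(Locale.ROOT),
                min.setScale(2, RoundingMode.HALF_UP).toPlainString(),
                max.setScale(2, RoundingMode.HALF_UP).toPlainString(),
                terminationDate.format(TD_FORMAT),
                flag(oddLotsHonored),
                flag(tdp));
    }

    // Assumption: "no" is the value for a false flag.
    private static String flag(boolean value) {
        return value ? "yes" : "no";
    }
}
```

Prices are kept as BigDecimal rather than double so that values such as 30.50 print with both decimal places intact.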
The input might be:
Validus Holdings, Ltd.
Offer to Purchase for Not More Than $200,000,000 in Cash
its Common Shares
at a Purchase Price Not Greater Than $33.50
Nor Less Than $30.50 Per Share
THE OFFER, PRORATION PERIOD AND WITHDRAWAL RIGHTS WILL EXPIRE AT 5:00 P.M., NEW YORK CITY TIME, ON JUNE 5, 2012, UNLESS THE OFFER IS EXTENDED OR WITHDRAWN (SUCH DATE, AS IT MAY BE EXTENDED, THE “EXPIRATION DATE”).
...
and any required signature guarantees and other documents required by the Letter of Transmittal, are received by the Depositary within three business days after the date of receipt by the Depositary of the Notice of Guaranteed Delivery.
...
There may be significant variation from the above. For example:
Offer to Purchase for Cash
by
The Home Depot, Inc.
of
Up to 250 million Shares of its Common Stock at a
Purchase Price not greater than $44.00 nor less than
$39.00 per Share
THE OFFER, PRORATION PERIOD AND WITHDRAWAL RIGHTS WILL EXPIRE AT 5:00 P.M., NEW YORK CITY TIME, ON AUGUST 16, 2007, UNLESS THE OFFER IS EXTENDED (THE “EXPIRATION TIME”).
...
the Depositary receives, at one of its addresses set forth on the back cover of this Offer to Purchase and within the period of three trading days after the date of execution of that Notice of Guaranteed Delivery, either: (i) the certificates representing the shares being tendered, in the proper form for transfer, together with (1) a Letter of Transmittal relating thereto, which has been validly completed and duly executed and includes all signature guarantees required thereon and (2) all other required documents; or (ii) confirmation of book-entry transfer of the shares into the Depositary’s account at the book-entry transfer facility, together with
...
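To make the two excerpts above concrete, here is a minimal baseline sketch in Java that pulls the price range and the expiration date out of text shaped like them. It is deliberately a pattern-matching baseline, not the NLP solution this job asks for: it only covers the "not greater than ... nor less than" and "will expire at ... on <date>" wordings seen in these two excerpts, and could serve as a sanity check for a real extractor. The class name OfferBaseline and its method names are hypothetical.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical baseline; handles only the wordings shown in the two excerpts. */
final class OfferBaseline {
    // "not greater than $MAX nor less than $MIN", tolerant of case and line breaks.
    private static final Pattern PRICE_RANGE = Pattern.compile(
            "not\\s+greater\\s+than\\s+\\$(\\d[\\d,]*(?:\\.\\d+)?)"
            + "\\s+nor\\s+less\\s+than\\s+\\$(\\d[\\d,]*(?:\\.\\d+)?)",
            Pattern.CASE_INSENSITIVE);

    // "... EXPIRE AT <time> ... ON JUNE 5, 2012"
    private static final Pattern EXPIRY = Pattern.compile(
            "expire\\s+at\\s+.*?\\bon\\s+([A-Za-z]+\\s+\\d{1,2},\\s+\\d{4})",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Case-insensitive so that "JUNE" parses as well as "June".
    private static final DateTimeFormatter LONG_DATE = new DateTimeFormatterBuilder()
            .parseCaseInsensitive()
            .appendPattern("MMMM d, uuuu")
            .toFormatter(Locale.US);

    /** Returns {min, max} if the price-range wording is present. */
    static Optional<BigDecimal[]> priceRange(String text) {
        Matcher m = PRICE_RANGE.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        BigDecimal max = new BigDecimal(m.group(1).replace(",", ""));
        BigDecimal min = new BigDecimal(m.group(2).replace(",", ""));
        return Optional.of(new BigDecimal[] {min, max});
    }

    /** Returns the expiration date if the "expire at ... on <date>" wording is present. */
    static Optional<LocalDate> expirationDate(String text) {
        Matcher m = EXPIRY.matcher(text);
        if (!m.find()) {
            return Optional.empty();
        }
        // Collapse line breaks inside the captured date before parsing.
        String date = m.group(1).replaceAll("\\s+", " ");
        return Optional.of(LocalDate.parse(date, LONG_DATE));
    }
}
```

Both methods return an empty Optional when the wording is absent, so a caller can tell "not found" apart from a parsed value and fall back to a more robust extractor.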
[login to view URL]
has an example where the three day protect is not in effect (since they don't mention it).
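Since three-day protect is signalled only by whether the guaranteed-delivery wording appears, a first-pass check can be sketched as below. This is a keyword heuristic under stated assumptions, not a definitive detector: it assumes the clause names "three business days" or "three trading days" and then "Notice of Guaranteed Delivery" within the same sentence, as in the two excerpts quoted earlier. The class name ThreeDayProtect and the method mentioned are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical heuristic; covers only the clause wordings quoted in the posting. */
final class ThreeDayProtect {
    // "within ... three business|trading days ... Notice of Guaranteed Delivery",
    // all inside one sentence (no period in between), ignoring case and line breaks.
    private static final Pattern CLAUSE = Pattern.compile(
            "\\bwithin\\b[^.]{0,80}?\\bthree\\s+(?:business|trading)\\s+days\\b"
            + "[^.]{0,200}?\\bnotice\\s+of\\s+guaranteed\\s+delivery\\b",
            Pattern.CASE_INSENSITIVE);

    /** True if the text contains a guaranteed-delivery clause with a three-day window. */
    static boolean mentioned(String text) {
        return CLAUSE.matcher(text).find();
    }
}
```

A document that never mentions the clause yields false, which matches the rule that the protect is not in effect when it is not mentioned.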
Here are more test cases:
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
[login to view URL]
If you can make these work, you are done. Ad-hoc regex approaches are generally not good enough
for this work; you need NLP.
The work MUST be done in Java. I only know Java. The work can be a desktop application (i.e.,
it takes the URLs as input, scans the data, then outputs the CSV text).
No GUI needed.
Please do not bid more than the maximum budget allocated.
Please see my demo at: [login to view URL]
It works with all the test cases you gave; however, no solution is 100% accurate, and the algorithm may occasionally fail if it encounters significantly different free-form text. If you find a case for which the demo fails, or you can provide more test cases, let me know. This demo is written in server-side Java, but the final product will be a desktop Java app (as you requested).
$200 USD in 2 days
5.0 (14 reviews)
5 freelancers are bidding an average of $165 USD on this job
I have a Master's degree in NLP
5+ years of hands-on experience in text data mining, pattern matching, and regular expressions
Solid professional experience writing rules to extract semantic named entities such as company/employer name, job title, skills, experience, salary, and job URL/address details from job ads.
Well versed in applying NLP techniques to recognize named entities using POS (part-of-speech) tagging, parsing, etc.
Good knowledge of computational linguistics, such as lexical and semantic parsing