Experienced C# programmer for a library that parses as fast as possible
$10-30 USD
Closed
Posted over 6 years ago
Paid on delivery
Good morning! I'm looking for an experienced C# programmer to build a library targeting the .NET Framework 4.0 Client Profile (no other version will work).
I have already written this, but I'm sure my version is very inefficient, and I would like someone more experienced than me to rewrite it. The library should:
- Receive a list of files
- Remove all those that match a list of directories, file names and extensions
- Of the remaining ones, match based on:
* Directory
* Extension
* File name
* Content, based on simple string match
* Content, based on regex
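The exclusion step above could look roughly like the following. This is a minimal sketch, not the existing code: the class name `ExclusionFilter` and its members are hypothetical, and it assumes Windows-style case-insensitive paths and extensions written with a leading dot (as returned by `Path.GetExtension`). It uses only C# 4 syntax so it should compile against .NET 4.0.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: drops files that match any exclusion rule.
public sealed class ExclusionFilter
{
    // Hash sets give constant-time lookups for file names and extensions.
    private readonly HashSet<string> _fileNames;
    private readonly HashSet<string> _extensions;
    private readonly List<string> _directories = new List<string>();

    public ExclusionFilter(IEnumerable<string> directories,
                           IEnumerable<string> fileNames,
                           IEnumerable<string> extensions)
    {
        // Assumption: comparisons are case-insensitive, as on Windows file systems.
        _fileNames = new HashSet<string>(fileNames, StringComparer.OrdinalIgnoreCase);
        _extensions = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);
        foreach (string dir in directories)
        {
            // A trailing separator stops "C:/temp" from matching "C:/temporary".
            _directories.Add(Normalize(dir).TrimEnd('/') + "/");
        }
    }

    // Use one separator style so prefix checks behave the same for "\" and "/".
    private static string Normalize(string path)
    {
        return path.Replace('\\', '/');
    }

    public bool IsExcluded(string path)
    {
        string normalized = Normalize(path);
        if (_fileNames.Contains(Path.GetFileName(normalized))) return true;
        if (_extensions.Contains(Path.GetExtension(normalized))) return true;
        foreach (string dir in _directories)
        {
            if (normalized.StartsWith(dir, StringComparison.OrdinalIgnoreCase)) return true;
        }
        return false;
    }

    // Lazily yields only the files that survive the exclusion rules.
    public IEnumerable<string> Filter(IEnumerable<string> paths)
    {
        foreach (string path in paths)
        {
            if (!IsExcluded(path)) yield return path;
        }
    }
}
```

The same structure (hash sets for names and extensions, a prefix check for directories) would probably also serve for the directory, extension, and file name matching of the remaining files, since those checks never need to open the file.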
Whenever a match is found, an event should be raised reporting the file path, the type of match, and the string used to match. Processing should then continue with the next file.
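To make the event requirement concrete, here is one possible shape for it. Every name here (`MatchKind`, `FileMatchEventArgs`, `MatchNotifier`, `MatchFound`) is a hypothetical illustration, not part of the existing code; the sketch follows the standard .NET event pattern and avoids any syntax newer than C# 4.

```csharp
using System;

// Hypothetical enumeration of the match types listed in the requirements.
public enum MatchKind { Directory, Extension, FileName, ContentString, ContentRegex }

// Carries the three pieces of information the event must report.
public sealed class FileMatchEventArgs : EventArgs
{
    public string Path { get; private set; }
    public MatchKind Kind { get; private set; }
    public string Pattern { get; private set; }

    public FileMatchEventArgs(string path, MatchKind kind, string pattern)
    {
        Path = path;
        Kind = kind;
        Pattern = pattern;
    }
}

public class MatchNotifier
{
    public event EventHandler<FileMatchEventArgs> MatchFound;

    protected virtual void OnMatchFound(FileMatchEventArgs e)
    {
        // Copy the delegate first so an unsubscribe on another thread
        // cannot null it between the check and the call.
        EventHandler<FileMatchEventArgs> handler = MatchFound;
        if (handler != null) handler(this, e);
    }

    // Called by the scanner on the first match in a file; the caller
    // then moves on to the next file.
    public void Report(string path, MatchKind kind, string pattern)
    {
        OnMatchFound(new FileMatchEventArgs(path, kind, pattern));
    }
}
```

A subscriber would attach with something like `notifier.MatchFound += (sender, e) => Console.WriteLine(e.Path);`.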
The idea is to find the most efficient way to parse through these files.
The library will implement a file size limit for each category, because running a complex regex over a series of 32 MB files is impractical and not needed here.
For parsing the content (the most complicated problem), it should take into account things like:
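The per-category size limit could be kept in a small lookup like the one below. This is only a sketch: the class name `SizeLimits`, the string category keys, and the choice to treat an unconfigured category as unlimited are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-category size limits, keyed by a category name such as "regex".
public sealed class SizeLimits
{
    private readonly Dictionary<string, long> _maxBytes =
        new Dictionary<string, long>(StringComparer.OrdinalIgnoreCase);

    public void Set(string category, long maxBytes)
    {
        _maxBytes[category] = maxBytes;
    }

    // Assumption: a category without a configured limit is treated as unlimited.
    public bool Allows(string category, long fileLength)
    {
        long limit;
        return !_maxBytes.TryGetValue(category, out limit) || fileLength <= limit;
    }
}
```

The file length can be read once per file with `new FileInfo(path).Length` and checked against each category before the file is opened, so oversized files never reach the content matchers.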
- How to read the file? Right now I'm using a stream reader, to read line by line. Is it the most efficient way?
- How to parse the lines? Character by character? Leave it to the .NET Framework and use [login to view URL]()?
- How to order the regexes for matching? They will vary in processing time, so that should be taken into account.
- How to skip lines that obviously will not match a regex? For example, if it's matching hexadecimal strings, does it make sense to remove everything except digits and the letters A to F first, or is it better to leave that to the regex object?
- Can some of the regexes be converted to a state machine, or to a hybrid that does simple string matching first and then runs a regex only over what matched that first step? That might need to be decided regex by regex.
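The hybrid idea from the last question can be sketched as follows: a cheap ordinal substring check rejects most lines, and the regex only runs on lines that contain a literal the pattern requires. The class name `PrefilteredRegex` and the idea of supplying the required literal by hand are assumptions for illustration; whether this beats the regex engine alone would need to be measured pattern by pattern.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical two-stage matcher: literal prefilter first, regex second.
public sealed class PrefilteredRegex
{
    // A substring every match must contain; null when the pattern has none.
    private readonly string _requiredLiteral;
    private readonly Regex _regex;

    public PrefilteredRegex(string requiredLiteral, string pattern)
    {
        _requiredLiteral = requiredLiteral;
        // Compiled regexes cost more to construct but are usually faster when reused.
        _regex = new Regex(pattern, RegexOptions.Compiled | RegexOptions.CultureInvariant);
    }

    public bool IsMatch(string line)
    {
        // Stage 1: skip the regex entirely when the required literal is absent.
        if (_requiredLiteral != null &&
            line.IndexOf(_requiredLiteral, StringComparison.Ordinal) < 0)
        {
            return false;
        }
        // Stage 2: only candidate lines reach the regex engine.
        return _regex.IsMatch(line);
    }

    // Reads line by line and stops at the first match, so the rest of
    // the file is never read once a match has been reported.
    public bool MatchesAny(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (IsMatch(line)) return true;
        }
        return false;
    }
}
```

Because `MatchesAny` takes a `TextReader`, it works with a `StreamReader` over a file as well as a `StringReader` in tests, for example `new PrefilteredRegex("key=", @"key=[0-9A-Fa-f]{8}\b")` for a hypothetical hexadecimal key pattern.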
If we work well together and I like your coding style, I have several other projects that I need help with.
All projects will be done by milestones, with all code produced so far delivered at each one.