We have a series of text files which contain lists of log file information in this format ->
The fields have no delimiters other than the colon ":" separating the two values and a newline at the end of the second value. However, the files contain some errors: some lines don't follow the format above, and we need those lines removed, otherwise we won't be able to use LOAD DATA when we come to import the files into MySQL.
These files are stored in multiple nested directories, some of which are 5 or 6 levels deep and each directory can contain between 50 and 200 text files.
We are looking for someone to write a simple Bash script to run on Ubuntu 16.04 in the root directory of all these files. The script will work its way down through every directory and file and cleanse the data. It will need to open each file, read it one line at a time and check that each line conforms to the above format. If a line is in the correct format, the script adds it, in the same (value1:value2) format, to a new "cleansed data" text file in the directory currently being checked. When it has finished reading all the files in that directory, it will save the cleansed data text file and move on to the next directory.
We have 98 directories of this data and hundreds of files within them. The output of the script will be 98 "cleansed" text files, one in each directory containing all the correctly formatted lines from all the files in that directory.
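The behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not a finished deliverable: the output file name `cleansed_data.txt`, the function names `cleanse_dir` and `cleanse_tree`, and the validity rule (exactly one colon with a non-empty value on each side) are all hypothetical, since the posting does not pin them down. It also assumes directory names contain no newline characters.

```shell
#!/usr/bin/env bash
# Sketch only: the output file name and the validity rule are assumptions.

OUT_NAME="cleansed_data.txt"   # hypothetical name for the per-directory output

# Assumed rule: exactly one ":" with a non-empty value on each side.
VALID_LINE='^[^:]+:[^:]+$'

# Collect every valid line from the files directly inside "$1"
# and write them to "$1/$OUT_NAME".
cleanse_dir() {
    local dir=$1
    local out="$dir/$OUT_NAME"
    local tmp file
    tmp=$(mktemp) || return 1
    for file in "$dir"/*; do
        # Skip subdirectories and any output file left by a previous run.
        [ -f "$file" ] || continue
        [ "$file" = "$out" ] && continue
        # grep tests each line of the file against the pattern.
        grep -E "$VALID_LINE" "$file" >> "$tmp"
    done
    # Only create an output file where at least one valid line was found.
    if [ -s "$tmp" ]; then
        cat "$tmp" > "$out"
    fi
    rm -f "$tmp"
}

# Visit the root directory and every directory below it.
# Assumes directory names contain no newline characters.
cleanse_tree() {
    local root=${1:-.}
    find "$root" -type d | while IFS= read -r dir; do
        cleanse_dir "$dir"
    done
}

# Run only when a root directory is given, e.g.: ./cleanse.sh .
if [ "$#" -gt 0 ]; then
    cleanse_tree "$1"
fi
```

Saved under a hypothetical name such as `cleanse.sh`, it would be run from the root directory as `./cleanse.sh .`. Using `grep` instead of a hand-written `while read` loop is a design choice: it still evaluates the files line by line but is much faster across hundreds of files. If the real format is stricter than "one colon, two non-empty values", only the `VALID_LINE` pattern should need to change.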
We'll then use each new cleansed data file with the LOAD DATA facility to upload the new value1:value2 checked pairs.
11 freelancers are bidding an average of £57 for this job
I have worked as a server admin for big names like [login to view URL] and servercentre.net. I have over 8 years of experience working with cPanel, DirectAdmin and Plesk servers. I can help you with this. Let me know.
Hi, I have written many such nifty content-aware scripts. This is doable with a shell script, Ruby, Perl or almost any other language. I will choose the one that gets the job done best. I hope you choose me and are amazed!