I would start by processing half of the emails manually. Trends in the titles seen during that pass would yield a handful of commonly used ones, such as director of marketing or director of relations. I would then run a search through the bodies of the remaining emails and have it kick back any email that does not contain one of the previously gathered titles. This separates the repeat data points from the unique hits, so only the unique ones need manual attention and the mining becomes more efficient.
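To make the search step concrete, here is a minimal sketch in Python. It assumes the email bodies are already available as plain-text strings; the function name and the sample title list are hypothetical, and the real implementation would depend on the format the emails are in.

```python
import re

# Hypothetical titles gathered during the manual pass.
KNOWN_TITLES = ["director of marketing", "director of relations"]


def split_by_known_titles(bodies, known_titles=KNOWN_TITLES):
    """Return (matched, unmatched) lists of email bodies.

    matched   -- bodies containing at least one known title (repeat data points)
    unmatched -- bodies with no known title (kicked back for manual review)
    """
    if not known_titles:
        # Nothing to search for yet, so everything needs manual review.
        return [], list(bodies)

    # Build one case-insensitive pattern. Words inside a title may be
    # separated by any run of whitespace, including a line break.
    parts = [
        r"\s+".join(re.escape(word) for word in title.split())
        for title in known_titles
    ]
    pattern = re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)

    matched, unmatched = [], []
    for body in bodies:
        if pattern.search(body):
            matched.append(body)
        else:
            unmatched.append(body)
    return matched, unmatched
```

Calling split_by_known_titles(bodies) returns two lists; the second one holds the emails that still need a manual look.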
The manner of implementation, of course, largely depends on the format the emails are in. I am versed in Visual Basic (specifically within Microsoft applications), Java, and a handful of scripting languages.
This method naturally gets faster as more hits are collected, since each new title widens the automatic search. It also lends itself to batch processing of the additional emails you mention need work.
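As a rough illustration of that ongoing loop, the sketch below processes emails in batches and grows the title list as it goes. Everything here is an assumption for illustration: the function name, the `review` callback (a stand-in for the manual step), and the use of simple case-insensitive substring matching.

```python
def mine_in_batches(batches, known_titles, review):
    """Process batches in order, growing the title set as review finds new ones.

    `review` is a hypothetical callback standing in for the manual step: it
    takes one unmatched email body and returns the title found in it, or None.
    Returns the final title set and the number of manual reviews per batch.
    """
    titles = {t.lower() for t in known_titles}
    manual_counts = []  # how many emails needed manual review in each batch
    for batch in batches:
        needs_review = 0
        for body in batch:
            text = body.lower()
            if any(title in text for title in titles):
                continue  # repeat data point, no manual work needed
            needs_review += 1
            new_title = review(body)
            if new_title:
                # Later emails with this title are now caught automatically.
                titles.add(new_title.lower())
        manual_counts.append(needs_review)
    return titles, manual_counts
```

If the approach works as intended, the per-batch manual counts should shrink over time as the title set fills in.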