I have 5,000 individual news articles, each about one page long, that currently contain XML, HTML, and inline tags mixed in with the text. To ingest the documents into a text analysis engine, I need all of these tags removed so that only the plain article text remains. The task is essentially a cleanup pass over the corpus. I don't think it is hard, so I would like a bid.
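To make the scope concrete, here is a rough sketch of the kind of cleanup meant. It assumes Python and its standard-library `html.parser` module are acceptable; the names `TagStripper`, `strip_tags`, and `BLOCK_TAGS` are hypothetical, and the real articles may need extra handling (for example, CDATA sections or malformed markup) that this does not cover.

```python
from html.parser import HTMLParser

# Assumed set of tags that should act as word separators when removed.
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3"}


class TagStripper(HTMLParser):
    """Collects text content and discards tags, comments, and declarations."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self.parts = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_tags(markup):
    """Return the text of `markup` with tags removed and whitespace collapsed."""
    parser = TagStripper()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    print(strip_tags("<p>Hello <b>world</b> &amp; all</p>"))
```

Unknown XML-style tags are dropped the same way as HTML tags, and text inside them is kept. Collapsing all whitespace to single spaces is a design choice of this sketch; paragraph breaks could be preserved instead if the analysis engine needs them.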