I need a script that retrieves RSS news feeds from the internet and saves their content in a database. The script will run twice per day to check for new items (update). The crawler must only fetch the feeds I provide, which I can easily enter in a database.
Once the feeds are in the database, the script must analyze them for subject and keywords, so I can easily search through the database and find the articles I need.
The script must be compatible with Joomla, where I will install a component to select the news I want to display by keyword.
The feeds I need are in Dutch and English.
Please send me a private message if you want more details.
The function of the spider / crawler is to fetch feeds from the internet. The spider may only visit the websites I specify. These websites must be listed in a database (name of website, URL, and a list of news feeds).
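To make the website list concrete, here is a minimal sketch of how it could be stored, assuming SQLite and Python purely for illustration. The table names (`websites`, `feeds`), the column names, and the helpers `add_website` and `feeds_to_crawl` are hypothetical and not a fixed requirement.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS websites (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    url  TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS feeds (
    id         INTEGER PRIMARY KEY,
    website_id INTEGER NOT NULL REFERENCES websites(id),
    feed_url   TEXT NOT NULL UNIQUE
);
"""


def add_website(conn, name, url, feed_urls):
    """Register one allowed website together with its list of news feeds."""
    cur = conn.execute(
        "INSERT INTO websites (name, url) VALUES (?, ?)", (name, url)
    )
    website_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO feeds (website_id, feed_url) VALUES (?, ?)",
        [(website_id, feed_url) for feed_url in feed_urls],
    )
    conn.commit()
    return website_id


def feeds_to_crawl(conn):
    """Return (website name, feed URL) pairs; the spider fetches nothing else."""
    return conn.execute(
        "SELECT w.name, f.feed_url FROM feeds f "
        "JOIN websites w ON w.id = f.website_id ORDER BY f.id"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # in-memory database for the example
conn.executescript(SCHEMA)
add_website(
    conn,
    "Example News",
    "https://example.com",
    ["https://example.com/rss.xml", "https://example.com/sport.xml"],
)
for name, feed_url in feeds_to_crawl(conn):
    print(name, feed_url)
```

Because the spider only reads from `feeds_to_crawl`, adding or removing a source is just a matter of editing rows in these two tables.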
When the spider finds a new article, I want it stored in my database. Of course I need the source, title, URL, description and so on. Once the article is in my database, it needs to be indexed (keywords, possibly a subject and related items) to make searching and selecting easy.
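As a rough sketch of this step, the example below parses an RSS 2.0 document, stores only articles whose URL is not yet known, and indexes each new article with its most frequent words. Everything here is an assumption for illustration: the `articles` and `keywords` tables, the helper names, the tiny Dutch/English stop-word list, and the plain word-frequency approach to keywords. A real implementation would probably need a better keyword method and support for Atom feeds as well.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET
from collections import Counter

# Tiny illustrative stop-word list (English and Dutch); far from complete.
STOPWORDS = {"the", "and", "for", "are", "was", "has",
             "het", "een", "van", "voor", "met", "zijn"}

SAMPLE_RSS = """<rss version="2.0"><channel><title>Example</title>
<item><title>Storm hits the coast</title><link>https://example.com/a1</link>
<description>A heavy storm reached the coast today.</description></item>
<item><title>Verkiezingen in Nederland</title><link>https://example.com/a2</link>
<description>De verkiezingen zijn begonnen.</description></item>
</channel></rss>"""


def extract_keywords(text, limit=5):
    """Return the most frequent words of 3+ letters that are not stop words."""
    words = re.findall(r"[^\W\d_]{3,}", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(limit)]


def store_feed(conn, source, xml_text):
    """Store new <item> entries from an RSS 2.0 document; return how many were new."""
    new_count = 0
    for item in ET.fromstring(xml_text).iter("item"):
        title = (item.findtext("title") or "").strip()
        url = (item.findtext("link") or "").strip()
        description = (item.findtext("description") or "").strip()
        if not url:
            continue
        cur = conn.execute(
            "INSERT OR IGNORE INTO articles (source, title, url, description) "
            "VALUES (?, ?, ?, ?)",
            (source, title, url, description),
        )
        if cur.rowcount == 0:
            continue  # URL already in the database, so not a new article
        article_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO keywords (article_id, keyword) VALUES (?, ?)",
            [(article_id, kw) for kw in extract_keywords(title + " " + description)],
        )
        new_count += 1
    conn.commit()
    return new_count


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, source TEXT, title TEXT,
                       url TEXT UNIQUE, description TEXT);
CREATE TABLE keywords (article_id INTEGER, keyword TEXT,
                       PRIMARY KEY (article_id, keyword));
""")
first_run = store_feed(conn, "Example News", SAMPLE_RSS)
second_run = store_feed(conn, "Example News", SAMPLE_RSS)  # same feed again
print(first_run, second_run)
```

The `UNIQUE` constraint on `url` is what keeps the twice-daily run from storing the same article again.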
I will need a Joomla! component so I can display the news on a website.
In the back end I need to configure the component. First I need to enter the database settings (because the spider runs on a different database). Then I want the component to create pages (as many as I want), each with a search query I define to select the news I want.
The search query must be able to select by keyword and also by source website.
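The kind of selection I mean could look roughly like the sketch below, again assuming SQLite and Python for illustration only. The `articles` and `keywords` tables and the `search_articles` helper are hypothetical. Both filters are optional, and an article matches when it has at least one of the given keywords and comes from one of the given sources.

```python
import sqlite3


def search_articles(conn, keywords=(), sources=()):
    """Select articles by keyword, by source website, or by both."""
    sql = "SELECT DISTINCT a.id, a.source, a.title, a.url FROM articles a"
    where, params = [], []
    if keywords:
        sql += " JOIN keywords k ON k.article_id = a.id"
        where.append("k.keyword IN (%s)" % ",".join("?" * len(keywords)))
        params += [kw.lower() for kw in keywords]
    if sources:
        where.append("a.source IN (%s)" % ",".join("?" * len(sources)))
        params += list(sources)
    if where:
        sql += " WHERE " + " AND ".join(where)
    sql += " ORDER BY a.id"
    return conn.execute(sql, params).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, source TEXT, title TEXT, url TEXT);
CREATE TABLE keywords (article_id INTEGER, keyword TEXT);
INSERT INTO articles VALUES
    (1, 'Site A', 'Storm hits coast', 'https://a.example/1'),
    (2, 'Site B', 'Storm warning',    'https://b.example/2'),
    (3, 'Site A', 'Election results', 'https://a.example/3');
INSERT INTO keywords VALUES
    (1, 'storm'), (1, 'coast'), (2, 'storm'), (3, 'election');
""")

by_keyword = search_articles(conn, keywords=["storm"])
by_both = search_articles(conn, keywords=["storm"], sources=["Site A"])
by_source = search_articles(conn, sources=["Site A"])
for row in by_both:
    print(row)
```

Only placeholders are built into the SQL string; the keyword and source values themselves are passed as parameters, so text typed into the back end cannot alter the query.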
The front end is very simple: display the news title plus a short intro, with a link to the original website. The site also needs to search the database for related items and link to them.
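One possible way to find related items, shown here only as a sketch, is to rank other articles by how many keywords they share with the current one. This is my assumption of a simple approach, not a required method. The `articles` and `keywords` tables and the `related_articles` helper are hypothetical, and the sketch assumes each keyword is stored at most once per article.

```python
import sqlite3


def related_articles(conn, article_id, limit=3):
    """Return (id, title, url, shared keyword count) for the most similar articles."""
    return conn.execute(
        """
        SELECT a.id, a.title, a.url, COUNT(*) AS shared
        FROM keywords mine
        JOIN keywords other ON other.keyword = mine.keyword
                           AND other.article_id != mine.article_id
        JOIN articles a ON a.id = other.article_id
        WHERE mine.article_id = ?
        GROUP BY a.id
        ORDER BY shared DESC, a.id
        LIMIT ?
        """,
        (article_id, limit),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, url TEXT);
CREATE TABLE keywords (article_id INTEGER, keyword TEXT,
                       PRIMARY KEY (article_id, keyword));
INSERT INTO articles VALUES
    (1, 'Storm hits coast',      'https://a.example/1'),
    (2, 'Coast storm warning',   'https://b.example/2'),
    (3, 'Storm season begins',   'https://a.example/3'),
    (4, 'Election results',      'https://a.example/4');
INSERT INTO keywords VALUES
    (1, 'storm'), (1, 'coast'),
    (2, 'storm'), (2, 'coast'), (2, 'warning'),
    (3, 'storm'),
    (4, 'election');
""")

related = related_articles(conn, 1)
for article_id, title, url, shared in related:
    print(title, url, shared)
```

The front end would then show each returned title as a link under the article it relates to.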