Custom Web Scraping

The problem:

Our customer is a used-motorcycle dealer who wanted to collect information on motorcycles from websites where they are bought and sold. Specifically, they wanted to check several websites every few minutes and find the best deals in as little time as possible. The problem was the sheer number of adverts, which made it difficult to keep track of changes such as price updates or new ads. It was also difficult to keep a history of their searches and to share information between the sales representatives.

The solution:

The solution is a console application installed on 6 Linux servers located in different geographical locations.
The application runs every 10 minutes, each time from a different server, so that the websites do not block it or flag it as a bot or spam. It accepts a URL containing the search conditions that the customer has provided and then scans the adverts for the information of interest to our client. There is no limit to the number of ads. The information from each ad is written to a central database, and the application then sends emails to the company's sales representatives with the details of the newly found advertisements.
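The rotation between servers described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the code of the actual application: the function names `server_for_slot` and `should_run` are hypothetical, and the idea of deriving the turn from the clock (rather than from a central coordinator) is an assumption. Only the figures of 6 servers and a 10-minute interval come from the description.

```python
from datetime import datetime, timezone

SERVER_COUNT = 6        # from the description: six servers
INTERVAL_MINUTES = 10   # from the description: one run every 10 minutes


def server_for_slot(now: datetime) -> int:
    """Return the index (0 to 5) of the server whose turn it is at `now`.

    Assumption: every server has a synchronized clock and `now` is a
    timezone-aware datetime, so all servers compute the same slot.
    """
    slot = int(now.timestamp()) // (INTERVAL_MINUTES * 60)
    return slot % SERVER_COUNT


def should_run(my_index: int, now: datetime) -> bool:
    """True if the server with the hypothetical index `my_index` should scrape now."""
    return server_for_slot(now) == my_index


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print("Server on duty for this 10-minute slot:", server_for_slot(now))
```

With a scheme like this, each server can start the application every 10 minutes and exit immediately when it is not its turn, so any single server contacts the websites only once per hour.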

How it works:

Here you can see the first 6 ads that the application found based on the search criteria that the client provided.

Here are the entries made in the database with the information of the advertisements shown above.

The image below shows the first ad and how its data was written to the database (the same was done for the other ads on the site). The search criteria for the advertisements were chosen by our customer according to their needs.

And here is the email that the sales representatives received for this particular ad.

The process you’ve seen is repeated for every ad that matches the search conditions. Concentrating the data in a single database made access to the information easier and faster. Most importantly, the business is notified within a few minutes of any changes on the site.
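To give an idea of how new ads and price changes can be told apart when each ad is written to the central database, here is a minimal sketch using SQLite. The table layout, the use of the ad URL as the unique key, and the names `open_db` and `record_ad` are assumptions made for illustration; the real application's schema and database engine are not described here.

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create a hypothetical `ads` table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ads ("
        " url TEXT PRIMARY KEY,"   # assumption: the ad URL identifies the ad
        " title TEXT,"
        " price REAL)"
    )
    return conn


def record_ad(conn: sqlite3.Connection, url: str, title: str, price: float) -> str:
    """Store one scraped ad and report 'new', 'price_changed' or 'unchanged'."""
    row = conn.execute("SELECT price FROM ads WHERE url = ?", (url,)).fetchone()
    if row is None:
        conn.execute(
            "INSERT INTO ads (url, title, price) VALUES (?, ?, ?)",
            (url, title, price),
        )
        status = "new"
    elif row[0] != price:
        conn.execute(
            "UPDATE ads SET title = ?, price = ? WHERE url = ?",
            (title, price, url),
        )
        status = "price_changed"
    else:
        status = "unchanged"
    conn.commit()
    return status


if __name__ == "__main__":
    conn = open_db()
    # Made-up example values, not real scraped data.
    print(record_ad(conn, "https://example.com/ad/1", "Example bike", 4500.0))
    print(record_ad(conn, "https://example.com/ad/1", "Example bike", 4200.0))
```

In a setup like this, only the ads reported as `new` or `price_changed` would need to be included in the email to the sales representatives, which keeps the notifications short.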
