Collecting Public Procurement Data with Microservices

A SaaS product must keep its data on every public procurement published in a country up to date.
The web service uses this data to forecast the likely winner of a public procurement and the probability of winning, and to show each supplier's record of wins and participations.

Public procurements are published continuously on multiple public internet resources. New procurements and updates appear at varying frequencies and depend on customers' activity. Sometimes there is no new information at all, while at other times a large number of changes arrive at once.

The web service's IT architecture should smooth out load peaks and reduce the use of cloud resources during idle periods. We applied a microservice architecture to manage the load according to the volume of the incoming information stream.

We developed a pipeline that delivers analytical information to users on time:

  1. Find pages with procurement data using data collection services, aka "spiders"
  2. Download the HTML of the pages found
  3. Extract information and save it in different datastores
  4. Analyze the data
  5. Show the analyzed data on the web service
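
The five stages above can be sketched as a chain of small workers connected by queues. This is only an illustration under stated assumptions, not the production code: the in-memory `FAKE_WEB` dictionary stands in for the public sources, the regular expressions stand in for real HTML parsing, and all function names are hypothetical.

```python
import re
from queue import Queue
from typing import Callable

# Hypothetical stand-in for the public sources: URL -> HTML.
FAKE_WEB = {
    "https://example.org/tender/1": "<h1>Road repair</h1><span>500000</span>",
    "https://example.org/tender/2": "<h1>School meals</h1><span>120000</span>",
}


def spider() -> list[str]:
    """Stage 1: discover pages that describe procurements."""
    return sorted(FAKE_WEB)


def download(url: str) -> tuple[str, str]:
    """Stage 2: fetch the HTML of a discovered page."""
    return url, FAKE_WEB[url]


def extract(page: tuple[str, str]) -> dict:
    """Stage 3: pull structured fields out of the HTML."""
    url, html = page
    title = re.search(r"<h1>(.*?)</h1>", html).group(1)
    amount = int(re.search(r"<span>(\d+)</span>", html).group(1))
    return {"url": url, "title": title, "amount": amount}


def analyze(record: dict) -> dict:
    """Stage 4: a toy analysis that flags large procurements."""
    return {**record, "large": record["amount"] >= 200_000}


def run_stage(worker: Callable, inbox: Queue, outbox: Queue) -> None:
    """Drain the inbox, apply the worker, and pass results downstream."""
    while not inbox.empty():
        outbox.put(worker(inbox.get()))


def run_pipeline() -> list[dict]:
    found, pages, records, analyzed = Queue(), Queue(), Queue(), Queue()
    for url in spider():
        found.put(url)
    run_stage(download, found, pages)
    run_stage(extract, pages, records)
    run_stage(analyze, records, analyzed)
    # Stage 5: the web service would read these analyzed records.
    return [analyzed.get() for _ in range(analyzed.qsize())]
```

Because each stage only reads from one queue and writes to the next, a stage that falls behind can be given more workers without touching the others.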

Byndyusoft built the infrastructure on Amazon Web Services so that cloud resources scale out automatically.

It is worth noting the thoughtfulness and the desire to understand the business problem that every member of the Byndyusoft team has. As a bonus, we received regular consultations on process and architecture from Alexander Byndyu. I warmly recommend this company for building IT products.
Vladimir Ivichev, Director at Закупки360

Information in the analytical system is almost identical to that in the public data sources. New information appears in the system less than 30 minutes after it is published.

The microservices continuously collect the following data:

  1. Data on new public procurement tenders and auctions
  2. Changes in current tenders and auctions
  3. Changes in details of suppliers and customers
  4. Data on signed agreements
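
One common way to tell a new record from a changed or unchanged one is to compare content fingerprints between visits. The sketch below assumes that approach. The source does not say how the services detect changes, and the names `fingerprint`, `classify`, and the `seen` store are hypothetical.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Hash a record in a canonical form so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def classify(record_id: str, record: dict, seen: dict[str, str]) -> str:
    """Return "new", "changed", or "unchanged" and remember the latest hash.

    `seen` maps record ids to the fingerprint stored on the previous visit;
    a real service would keep it in a datastore rather than in memory.
    """
    new_fp = fingerprint(record)
    old_fp = seen.get(record_id)
    seen[record_id] = new_fp
    if old_fp is None:
        return "new"
    return "unchanged" if old_fp == new_fp else "changed"
```

Only records classified as "new" or "changed" need to move on to extraction and analysis, which keeps idle periods cheap.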

If the volume of the information stream suddenly increases, the system automatically scales out bottlenecks by creating more instances of the overloaded service.
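
A scaling decision of this kind can be expressed as a small function of the backlog. The following is a minimal sketch, assuming each service reads from a queue whose length is known. The parameter names, the drain-time target, and the instance limits are illustrative assumptions, not the actual Amazon Web Services configuration used in the project.

```python
import math


def desired_instances(
    backlog: int,
    per_instance_rate: int,
    target_seconds: int,
    min_instances: int = 1,
    max_instances: int = 20,
) -> int:
    """Number of instances needed to drain the backlog within the target time.

    backlog: items waiting in the service's queue.
    per_instance_rate: items one instance handles per second (assumed known).
    target_seconds: how quickly the backlog should be cleared.
    """
    if per_instance_rate <= 0 or target_seconds <= 0:
        raise ValueError("rate and target must be positive")
    needed = math.ceil(backlog / (per_instance_rate * target_seconds))
    # Clamp so idle periods keep a minimum and peaks cannot grow unbounded.
    return max(min_instances, min(max_instances, needed))
```

With an empty queue the function falls back to the minimum, which corresponds to reducing cloud resource use during idle periods.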

120,000 procurement changes processed every day
99.9% system availability
1.5 TB database size with all procurements
Microservices · Cloud Infrastructure · Data Analysis · SaaS