Monitoring and Analysis of Public Procurement

TASK
The system must collect information on public procurement, analyze suppliers and customers, and analyze attached documents. Users get access to procurement analytics and document search, and can subscribe to email and SMS notifications about changes in the tenders and auctions they are interested in.

Data analysis makes it possible to forecast the participants and prices for new tenders and auctions. Analytics is used to study the activity of customers and suppliers. Users see statistics on wins and participations, the procurement categories in which a supplier participated, and who its major customers are.

The system came to Byndyusoft after an initial development attempt by a team from one of the largest outsourcers. As handed over, it did not serve its intended purpose.

Procurement information occasionally disappeared, data arrived with long delays, and the search for auctions and tenders worked incorrectly.

  1. The core of the whole system was an MSSQL database (Shared DB integration style).
  2. Full-text search over auctions and tenders was based on Sphinx; other search used MSSQL Full-Text Search.
  3. The subsystems for downloading and processing data used the DBMS to store state and to exchange data with each other.
  4. Data denormalization was implemented with triggers, computed columns, and views, which were recalculated on a schedule.

Overall, the right set of technologies was used, but the architecture turned out not to scale. The DBMS server, the pivotal link, required constant hardware upgrades. Eventually, further increasing the performance of a single server became economically unviable. The result was interruptions in data delivery, a "hanging" DB, and a slow web interface.




The system could not be sold in this state. The client was losing money and investor confidence. It was necessary to turn the tide and release a stable version that could be sold to users.

SOLUTION

Byndyusoft decided to stabilize the inflow of public procurement data, gradually rewrite the existing code, and cover the code with tests.


Transition to horizontal scalability

The solution architecture was changed in two stages and moved to a microservice architecture. In the first stage, message queues replaced Shared DB style integration.




Document data was moved out of the DB into AWS S3 cloud storage. This significantly reduced the load on the DB, since documents were now served directly from the cloud.
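The idea of keeping document bodies outside the relational database can be sketched as follows. This is a minimal illustration, not the project's code: `InMemoryObjectStore` is a hypothetical stand-in for an object store such as S3, SQLite stands in for the relational DB, and the table layout and key format are assumptions.

```python
import hashlib
import sqlite3


class InMemoryObjectStore:
    """Hypothetical stand-in for an object store such as S3 (not a real client)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, body):
        self._objects[key] = body

    def get(self, key):
        return self._objects[key]


def save_document(db, store, tender_id, filename, body):
    """Store the document body in the object store and only its key in the DB."""
    digest = hashlib.sha256(body).hexdigest()
    key = f"tenders/{tender_id}/{digest}/{filename}"  # assumed key layout
    store.put(key, body)
    db.execute(
        "INSERT INTO documents (tender_id, filename, storage_key, size_bytes) "
        "VALUES (?, ?, ?, ?)",
        (tender_id, filename, key, len(body)),
    )
    return key


def load_document(db, store, tender_id, filename):
    """Look up the storage key in the DB, then read the body from the store."""
    row = db.execute(
        "SELECT storage_key FROM documents WHERE tender_id = ? AND filename = ?",
        (tender_id, filename),
    ).fetchone()
    return store.get(row[0])


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE documents ("
    "tender_id TEXT, filename TEXT, storage_key TEXT, size_bytes INTEGER)"
)
store = InMemoryObjectStore()
save_document(db, store, "T-1", "spec.txt", b"statement of work")
```

With this split the database row stays small regardless of document size, and the body is read from the store only when it is actually requested.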

Sphinx was reconfigured according to the best practices for this engine to get maximum performance out of it. MSSQL Full-Text Search was replaced with search through Sphinx, which further reduced the load on the DB.

Data flows in the data collection and analysis subsystems began to pass through a queue, which significantly reduced the load on the database server. The IronMQ cloud queue, which runs on AWS infrastructure, was used.

All of this made it possible to scale every subsystem horizontally by adding the cheapest micro servers. In practice, scaling took just a couple of mouse clicks.
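Queue-based integration scales horizontally because several identical workers can pull from the same queue without coordinating through the database. A minimal sketch of this "competing consumers" pattern is below. The standard-library `queue.Queue` is only an in-process stand-in for a cloud queue, threads stand in for separate servers, and `run_workers` is a hypothetical helper name.

```python
import queue
import threading


def run_workers(messages, handler, worker_count):
    """Process messages with several competing consumers on one shared queue."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = q.get()
            if item is None:  # sentinel: no more work for this worker
                q.task_done()
                return
            out = handler(item)
            with lock:  # results list is shared between threads
                results.append(out)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for m in messages:
        q.put(m)
    for _ in threads:  # one sentinel per worker
        q.put(None)
    q.join()
    for t in threads:
        t.join()
    return results
```

Raising `worker_count` is the in-process analogue of adding another cheap server: the producer side does not change, and the order of results is not guaranteed.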


Current load

Project services continuously collect the following data from the official site of public procurement:

  • Data on new public procurement tenders and auctions
  • Changes in current tenders and auctions
  • Changes in details of suppliers and customers
  • Data on signed agreements

About 100 thousand changes are processed every day. All data is analyzed with a delay of no more than 10 minutes. Data processing includes:

  1. Analysis of the document text and the name of the tender/auction
  2. Clustering and tagging of texts, locating the statement of work in the documents
  3. Searching for similar tenders/auctions of the customer and updating the data on similar tenders/auctions in the DB for display to the user
  4. Recalculation of the analysis behind the predictions of participants and prices
  5. Sending notifications to users
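The processing steps above can be pictured as a chain of small stages applied to each incoming change. The sketch below is purely illustrative: the `Change` type, the keyword-based tagging, and the tag-overlap similarity are simplifying assumptions rather than the project's actual algorithms, and step 4 (recalculating predictions) is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    tender_id: str
    title: str
    text: str
    tags: list = field(default_factory=list)
    similar: list = field(default_factory=list)
    notified: list = field(default_factory=list)


def tag_text(change, keywords):
    """Steps 1-2: naive keyword tagging over the title and document text."""
    haystack = f"{change.title} {change.text}".lower()
    change.tags = sorted(k for k in keywords if k in haystack)
    return change


def find_similar(change, history):
    """Step 3: earlier tenders that share at least one tag with this one."""
    change.similar = sorted(
        tid for tid, tags in history.items()
        if tid != change.tender_id and set(tags) & set(change.tags)
    )
    return change


def notify(change, subscriptions):
    """Step 5: users subscribed to any of the tags (each user listed once)."""
    change.notified = sorted({u for u, tag in subscriptions if tag in change.tags})
    return change


def process(change, keywords, history, subscriptions):
    """Run the stages in order for a single change."""
    change = tag_text(change, keywords)
    change = find_similar(change, history)
    return notify(change, subscriptions)
```

Because each stage takes a change and returns it, any stage could run as its own queue consumer without altering the others.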

Changes after release

After the system was released, the load on the DB had to be reduced: its volume had grown to 500 GB, and query optimization was becoming a problem.

In addition, the Sphinx full-text search service outgrew a single server because of its large index, about 900 GB.




It was decided to move some of the reference data, lists, and other information that did not affect the analytics to NoSQL storage.

Sphinx was cloned across several servers, which made it possible to find data in less than 0.5 seconds.
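The text does not say whether the Sphinx servers held full copies or separate parts of the index. Assuming the index was partitioned, a query can be fanned out to every part in parallel and the partial results merged, roughly as sketched below. The shard contents, the term-count scoring, and the function names are hypothetical, and no real Sphinx API is used.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, query, limit):
    """Hypothetical shard search: score = number of query-term occurrences."""
    terms = query.lower().split()
    hits = []
    for doc_id, text in shard.items():
        words = text.lower().split()
        score = sum(words.count(t) for t in terms)
        if score > 0:
            hits.append((score, doc_id))
    return heapq.nlargest(limit, hits)


def search_all(shards, query, limit=10):
    """Query every shard in parallel, then merge the per-shard top results."""
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(lambda s: search_shard(s, query, limit), shards))
    merged = heapq.nlargest(limit, (hit for part in partials for hit in part))
    return [doc_id for _score, doc_id in merged]


shards = [
    {"a": "road repair road", "b": "school building"},
    {"c": "road works", "d": "medical supplies"},
]
top = search_all(shards, "road", limit=2)
```

Each shard scans only its own slice of the data, so response time depends on the largest shard rather than on the total index size.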

It is worth noting the thoughtfulness and the desire to understand the business problem that every member of the Byndyusoft team shows. As a bonus, we received regular consultations on process and architecture from Alexander Byndyu. I warmly recommend this company for building IT products.
Vladimir Ivichev, Director at Zakupki360
RESULT

Within 8 months, a team of 7 people rewrote the existing subsystems and implemented the key features of the Zakupki360 project. The project has since switched to paid subscriptions and is successfully selling data to users.

After Federal Law No. 44-FZ came into force in January 2014, it took the Byndyusoft team only 4 weeks to add the new tenders/auctions to the data processing and delivery pipeline. This gave the project a competitive advantage.

500 GB: RELATIONAL DATABASE
900 GB: SPHINX DATABASE INDEX
2 TB: AMAZON WEB SERVICES S3 DOCUMENT REPOSITORY
Cloud · Microservices · Big Data · Web · SaaS