• Architectures, scalability (17)

    Alexander Horoshilov

    Yandex

    Yandex Query - serverless federated query system. Inside view

    15 December, 13:30, «04 Hall. Ashot Yerkat»

    The result of five years of developing Yandex Query, which brings a batch and stream processing service to Yandex Cloud. YQ can run SQL-like queries over an endless data flow.

    We’ll talk about the details of our design trade-offs: capacity vs. isolation, performance vs. reliability, and security vs. UX.

    We present our 5-year experience developing Yandex Query. It is a query processing service in Yandex Cloud. Both batch and stream processing are handled similarly with the same syntax. One can debug their queries on batch data samples and run them for the production stream without changes.

    YQ reuses our internal system for distributed query processing. A job can span hundreds of nodes to meet load requirements. YQ can fetch data from and upload it into external systems (e.g. object storage or message queues) to join heterogeneous sources in a single query.

    We’ll reveal details of our multitenant system design. Design choices we had and decisions we made:

    1. Capacity vs. isolation
    Control plane is isolated from the compute plane(s). Processing cluster includes several compute planes (like tenants) to reduce blast radius. It reduces the risk of system downtime in a shared environment.

    2. Performance vs. reliability
    YQ uses cloud compute nodes and enforces limits and quotas to mitigate DDoS in the presence of high-load queries. Data is processed as fast as possible, and we provide exactly-once guarantees under certain conditions.

    3. Security vs. UX
    YQ conforms to strict cloud policies on data privacy. All data sources support service accounts for flexible access control. The compute plane uses time-limited tokens only. The service is available from the cloud console UI and provides an API for integration with other services.

    Finally, like everything in Yandex, our system is distributed, scalable, and fault-tolerant, with all benefits and complexity of this design.

    The talk was accepted to the conference program

    Nikolay Izhikov

    Apache Ignite PMC, Apache Kafka Contributor

    Practical aspects of B+ trees

    15 December, 12:20, «04 Hall. Ashot Yerkat»

    Many engineers are familiar with the B+ tree data structure and its application inside a DBMS. But how does it work internally to provide concurrency, reliability, high throughput and all the other great features? In the talk I will try to give a brief overview of methods and tweaks from real-world databases.

    The talk was accepted to the conference program

    Antony Polukhin

    Yandex

    Microservices on C++, or why we made our own framework

    16 December, 13:30, «04 Hall. Ashot Yerkat»

    We write IO-bound applications that have CPU-intensive parts, may require a lot of memory and should be highly available.

    Unfortunately, existing solutions did not match our needs, so we made our own framework, with coroutines and dynamic configs.

    From this talk you'll get hints on how to combine ease of use with C++, production-ready coroutines with a language that lacks support for them, and high development speed with efficiency and safety.

    The talk was accepted to the conference program

    Oleg Anastasyev

    Odnoklassniki

    Effective and Reliable Microservices

    16 December, 17:00, «02 Hall. Ararat»

    Odnoklassniki is one of the most popular social networks in the CIS and in the top 6 globally. It ranks in the top 20 of SimilarWeb’s list of top global websites. More than 70 million people use Odnoklassniki regularly to share their valuable stories with friends and family, watch and stream videos, listen to music, and play games together.

    Odnoklassniki employs hundreds of different microservice applications to serve users’ requests. Many of these services are built as stateful applications - they store their data locally, embedding a Cassandra database into the application’s JVM process. This challenges the usual way of building applications - a stateless microservice with a separate remotely accessible database cluster.

    In this talk Oleg will try to cover the advantages of stateful vs stateless microservices, discuss how statefulness affects reliability and accessibility of services and how it helps to build faster applications. We’ll go step-by-step through building a stateful application service, delving into its architecture, major components as well as significant challenges and their solutions.

    The talk was accepted to the conference program

    Vasily Pantyukhin

    VeUP Ltd

    Cheats & mistakes to read and create SLAs

    16 December, 12:20, «01 Hall. Tigran»

    Trust matters. We rely on our providers’ SLAs and share our “designed for” SLOs. We need to trust and gain trust to deliver trustworthy solutions. Availability and Durability are essential system reliability SLAs. Unfortunately, quite often we mismeasure, hide, or even distort them.

    During the session we’ll discuss common mistakes and problems with reliability of SLAs. Examples illustrate tips to read, share and compare the numbers of 9s.
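
    As a rough illustration of reading the number of 9s, an availability figure can be converted into allowed downtime, and the availabilities of components that all must be up multiply together. The sketch below is not taken from the talk; the function names are illustrative and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumption: a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Yearly downtime budget, in minutes, for a given availability (0..1)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def serial_availability(*components: float) -> float:
    """Availability of a chain where every component must be up at once."""
    result = 1.0
    for availability in components:
        result *= availability
    return result


if __name__ == "__main__":
    # "Three nines" leaves roughly 8.76 hours of downtime per year.
    print(allowed_downtime_minutes(0.999) / 60)
    # Two "three nines" dependencies in series are worse than either alone.
    print(serial_availability(0.999, 0.999))
```

    The second function shows why comparing raw numbers of 9s can mislead: a service built on several dependencies cannot be more available than their product, assuming independent failures.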

    The talk was accepted to the conference program

    Alexander Sibiryakov

    Zyte

    Kafka architecture: performance

    16 December, 11:10, «01 Hall. Tigran»

    Kafka is a distributed messaging system that is capable of delivering high performance. During the talk I’ll explain the architecture of the broker and client parts, emphasizing the design concepts that enable high performance. It will be of use to system designers and to anyone seeking an overall understanding of Kafka.

    The talk was accepted to the conference program

    Vladislav Shpilevoy

    Ubisoft

    Fair threaded task scheduler verified in TLA+

    15 December, 14:40, «03 Hall. Queen Erato»

    Algorithm for a multithreaded task scheduler for languages like C, C++, C#, Rust, Java. C++ version is open-sourced. Features: (1) formally verified in TLA+, (2) even CPU usage across worker threads, (3) coroutine-like functionality, (4) almost entirely lock-free, (5) up to 10 million RPS per thread.

    Key points for the potential audience: fair task scheduling with multiple worker threads; open source; algorithms; TLA+ verified; up to 10 million RPS per thread; for backend programmers; algorithm for languages like C++, C, Java, Rust, C# and others.

    "Task scheduling" essentially means asynchronous execution of callbacks, functions. Some kind of a "scheduler" is omnipresent in most services - an event loop; a thread-pool for blocking requests; a coroutine engine - you name it. Scheduler is an important basis on top of which the service’s logic can be built.

    Gamedev is no exception. I work at Ubisoft, we have miles of code used in thousands of servers, mostly C++. There is a vast deal of task types to execute: download a save, send a chat message, join a clan, etc. They often compose one multi-step task: (1) take a profile lock, (2) download a save, (3) free the lock, (4) respond to the player. There is a waiting time between each step until the operation is done.

    The backend code of one of the game engines had a simple scheduler generic enough to be used for every async job in all the servers. It juggled tasks across several internal worker threads. But it had the following typical issues:
    - Unfairness. Tasks were distributed to worker threads in a round-robin fashion. If tasks differ in duration, some threads can end up overloaded while others are idle.
    - Polling. In a naive scheduler multi-step task execution works via periodic wakeup of the task. When awake, the task checks if the current step is done and if it can go to the next one. With many thousands of tasks this polling eats notably more CPU than the actual workload.
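
    The unfairness issue can be illustrated with a small simulation. This is a sketch under stated assumptions, not the scheduler from the talk: it compares static round-robin assignment with a hypothetical shared queue where each task goes to whichever worker becomes free first. Task durations are made-up numbers.

```python
import heapq


def round_robin_loads(durations, workers):
    """Total busy time per worker when task i is pinned to worker i % workers."""
    loads = [0] * workers
    for index, duration in enumerate(durations):
        loads[index % workers] += duration
    return loads


def shared_queue_loads(durations, workers):
    """Total busy time per worker when the next task is always taken by the
    worker that becomes free first (a model of one shared task queue)."""
    free_at = [(0, worker) for worker in range(workers)]
    heapq.heapify(free_at)
    loads = [0] * workers
    for duration in durations:
        busy_until, worker = heapq.heappop(free_at)
        loads[worker] = busy_until + duration
        heapq.heappush(free_at, (loads[worker], worker))
    return loads


if __name__ == "__main__":
    tasks = [10, 1, 10, 1, 10, 1, 10, 1]  # alternating long and short tasks
    print(round_robin_loads(tasks, 2))   # one worker gets all the long tasks
    print(shared_queue_loads(tasks, 2))  # load evens out
```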

    The talk presents a new highly efficient general purpose threaded task scheduler algorithm, which solves these problems and achieves even more:
    - Complete fairness - even CPU usage across worker threads and no task pinning;
    - Coroutine-like - API to wake a task up on a deadline and for immediate wakeup;
    - No contention - operation is mostly built on lock-free algorithms;
    - Formal correctness - the scheduler is formally verified in TLA+.

    After the scheduler was implemented in C++ and embedded into several highly loaded servers, it gave an N-fold improvement in both RPS and latency (a more than 10x speedup for one server).

    At the same time the talk is not C++-specific. It is rather a presentation of several algorithms combined in the scheduler. They can be implemented in many languages: at least C, C++, Rust, Java, C#. The only requirement is support for atomics and threads.

    All that is completely open-source - https://github.com/ubisoft/task-scheduler.

    The talk was accepted to the conference program

    Mons Anderson

    Tarantool & VK Cloud

    How to choose a queue properly

    15 December, 12:20, «01 Hall. Tigran»

    I will talk about approaches to using queues and key parameters worth looking at, like scalability, durability, guaranteed delivery, availability vs. consistency, and throughput.

    Most microservice architectures and distributed applications require some kind of messaging service. Called a message broker or message queue, it is an inevitable element of system design.

    There are quite a few of them, with their own pros and cons. The wrong choice of this component could lead to problems with scalability or fault tolerance in your application.

    In my talk I point out key moments in choosing a queue, as well as guide you through the comparison of RabbitMQ, Kafka, NATS and other candidates.

    The talk was accepted to the conference program

    Andrei Vasilenkov

    Yandex

    Not your ordinary CDN

    16 December, 11:10, «03 Hall. Queen Erato»

    Most of you have read something about classical approaches to CDN: anycast, GeoDNS or just a plain web server with enabled cache layer. And it works great for common web applications — reading text or scrolling through doge memes. But when it comes to video streaming — that’s a whole new story!

    Scale always brings new challenges. Having a dozen nodes serving your users without fail — we’ve been there. Moving up to horizontal scaling using different locations in our data centers — long gone. Now we have a massive CDN with external locations serving hundreds of thousands of users simultaneously, distributing terabits of media data per second. In my talk I’ll explain the reasoning behind our CDN architecture and tell you how a basic automated systems algorithm can keep you sane.

    In this talk I will:
    — introduce basic network-related problems of our video streaming platform;
    — talk about why standard CDN building approach is no good for our system;
    — iterate through our approaches on distributing traffic via our CDN locations;
    — show how we use the PID-controller algorithm to control traffic flow.
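
    For readers unfamiliar with the PID-controller algorithm, a minimal textbook sketch follows. It is not the implementation from the talk: the gains, the setpoint and the idea of feeding it a measured load are assumptions made only for illustration.

```python
class PIDController:
    """Textbook discrete PID controller: output = Kp*e + Ki*sum(e*dt) + Kd*de/dt."""

    def __init__(self, kp: float, ki: float, kd: float, setpoint: float):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.setpoint = setpoint
        self._integral = 0.0
        self._prev_error = None  # no derivative term on the first update

    def update(self, measurement: float, dt: float) -> float:
        """Return the control signal for the latest measurement."""
        error = self.setpoint - measurement
        self._integral += error * dt
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self._prev_error) / dt
        self._prev_error = error
        return self.kp * error + self.ki * self._integral + self.kd * derivative


if __name__ == "__main__":
    # Hypothetical use: keep a location at a target load of 10 units.
    pid = PIDController(kp=2.0, ki=0.5, kd=1.0, setpoint=10.0)
    print(pid.update(measurement=8.0, dt=1.0))
    print(pid.update(measurement=9.0, dt=1.0))
```

    How the control signal is mapped to traffic weights between CDN locations is not described in the abstract, so that step is left out here.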

    The talk was accepted to the conference program

    Lia Yepremyan

    AMD Armenia

    FPGA Basic Principles: An Introduction to How It Works

    16 December, 14:40, «03 Hall. Queen Erato»

    Here we will cover different aspects related to FPGAs. First, we present an overview of the basic FPGA architecture. The presentation focuses on the FPGA design process and the tools required to program an FPGA. We will also discuss programming languages and how to create your first code for an FPGA. Later we will provide a practical example and dive into FPGA design optimization.

    The talk was accepted to the conference program

    Alexander Makarov

    ASAPIRL

    Theory of programming: packaging principles

    15 December, 11:10, «03 Hall. Queen Erato»

    Everyone knows SOLID programming principles, the essence of modern object-oriented programming. But there are additional higher-level principles coined by Robert C. Martin that help to determine and measure isolation boundaries between packages, modules, microservices etc.

    In this talk you’ll get into principles of package cohesion and coupling. We’ll highlight the shortcomings, tradeoffs and key points of usage and dive into D-metrics.
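
    Assuming "D-metrics" refers to Robert C. Martin's distance from the main sequence, the arithmetic can be sketched as follows. The package numbers in the example are made up, and the function names are illustrative.

```python
def instability(afferent: int, efferent: int) -> float:
    """I = Ce / (Ca + Ce): 0 means maximally stable, 1 maximally unstable."""
    total = afferent + efferent
    return efferent / total if total else 0.0


def abstractness(abstract_types: int, total_types: int) -> float:
    """A = abstract classes and interfaces divided by all types in the package."""
    return abstract_types / total_types if total_types else 0.0


def distance_from_main_sequence(a: float, i: float) -> float:
    """D = |A + I - 1|: 0 lies on the main sequence, 1 is as far as possible."""
    return abs(a + i - 1.0)


if __name__ == "__main__":
    # Hypothetical package: 3 incoming and 1 outgoing dependency,
    # 1 abstract type out of 4.
    i = instability(afferent=3, efferent=1)
    a = abstractness(abstract_types=1, total_types=4)
    print(i, a, distance_from_main_sequence(a, i))
```

    A stable but concrete package like this one lands away from the main sequence, which is the kind of trade-off the metric is meant to surface.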

    After the talk you’ll have more tools to help you write better code and design better systems overall.

    The talk was accepted to the conference program

    Denis Filippov

    Coins.ph

    Kafka for Golang developers: tips and tricks

    16 December, 15:50, «02 Hall. Ararat»

    It is not Kafka 101. On the contrary, you are familiar with Kafka and use it in your projects. I’ll demonstrate some traps and pitfalls we ran into. We’ll discuss them, have a look at how it works under the hood and try to figure out if Go philosophy can help or may harm you when working with Kafka.

    After a short introduction to Kafka (only the things we need to follow the discussion) I will dive into the details of several cases:
    - Partition rebalancing (and how you can handle it)
    - Asynchronous commit: can we make message processing more concurrent?
    - Batch producing: don’t let default settings slow down your app.

    This talk is sort of the story of a survivor. No boring theory, only practical use cases with a deep dive into the details. We’ll look into some issues, figure out why they happen and discuss how Kafka libraries and the engine work under the hood.

    The talk was accepted to the conference program

    Artem Trofimov

    CloudIL

    Having cake and eating it too: painless and efficient cluster utilization for data scientists

    16 December, 10:00, «03 Hall. Queen Erato»

    A data science team may become an abyss for expensive hardware. One can allocate a Docker/Jupyter instance with a GPU but spend most of the time writing code or visualizing data. In this talk we will discuss how to ensure efficient hardware utilization while avoiding unpopular restriction policies.

    The talk was accepted to the conference program

    Ignas Bagdonas

    Equinix

    Everyday Practical Vectorization

    15 December, 10:00, «03 Hall. Queen Erato»

    Free performance boost! Yes, free - you have already paid for your platform of choice that supports fancy vector processing extensions such as AVX2, AVX-512, SVE, RV-V, and the like - but you were not aware of what those extensions could offer you. Or maybe not that much free? Let’s check and see.

    Vectorization has been around for a good while now, and sadly has been undervalued in the software domain - for a multitude of reasons. Trends in compute platform evolution unanimously have vectorization as the leading performance increase mechanism in the hardware domain. The gap between the views of the software and hardware worlds is quite enormous - and that is something that needs to be addressed.
    Historically vector processing mechanisms were a domain of floating point calculations. While still being of major relevance, FP is becoming a specialty fragment of what contemporary vector processing approaches are able to provide to the general integer computation domain.
    Everyday tasks such as pattern matching, endianness conversion, hash function calculation, cryptography operations, adaptation of inherently scalar algorithms for the vector domain, impact and restrictions of data structure layout on vector performance - a set of subtopics to discuss in the form of questions and answers, with a focus on analysis of performance boost or limitation factors.

    The talk was accepted to the conference program

    Aleksei Dashkevich

    X5 Tech

    From MVP to Reality. Transition Problems and Solutions

    16 December, 10:00, «04 Hall. Ashot Yerkat»

    We can describe product development as a struggle between business, technology, marketing and others. When launching an MVP, we usually sacrifice quality for the sake of quickly testing a hypothesis. And what’s next? What technical challenges will we have to face and how to solve them?

    How to live and what to do after successfully testing a hypothesis? What if it’s possible to build, at the MVP stage, an architecture that will make life easier for us in the future? We asked ourselves these questions, so we would like to share our experience in the technical development of fast-growing products:
    * how to maintain a balance between development and technical debt when scaling up 10 times, what problems we encountered and how we solved them;
    * what you could think about at the beginning of development to make the transition from MVP to Reality easier.

    We’ve seen a lot of products at different stages of development. We participated in large-scale rollouts and developed processes to deal with a high degree of uncertainty. This includes low-level work, such as custom pod scaling, optimisations, technology restrictions, metrics collection and so on, as well as a high-level understanding of business process monitoring, data architecture, component architecture, microservices and so on.

    The talk was accepted to the conference program

    Kirill Alekseev

    Mail.Ru Email Service, VK

    Push-notifications in RuStore: how we built an alternative transport to replace Google Firebase

    15 December, 15:50, «02 Hall. Ararat»

    We have built a complete transport for push-notifications that can be used instead of (or in conjunction with) Google Firebase. A notification flow in our systems excludes Google APIs which means that if some app gets banned from Google’s push transport, their users can still be reached through our service. In a more optimistic world, you can continue using both systems to increase delivery rate and improve latencies. We will also deliver notifications in real time, with text/pictures etc, like Google does. Our service is free to use but the app that wants to use it is required to be deployed to RuStore.

    There are 2 main components: Android SDK and backend API.

    Android SDK provides the same interface as Firebase SDK does. It encapsulates registering a new device token, fetching and showing notifications.

    Backend API is a drop-in replacement for Firebase API: in Mail (Почта Mail.ru) we managed to integrate RuStore push-notifications by simply changing an API host from Google’s to RuStore’s. The backend consists of a stateless API deployed to a distributed k8s cluster, a pub/sub system and a web socket server (for real-time notification delivery), Scylla to store notifications and a Redis Cluster to store device tokens.

    The talk was accepted to the conference program

    Alik Kurdyukov

    UnitedTraders

    To Rust or not to Rust: 3 years in production with exchange matching engine

    15 December, 17:00, «02 Hall. Ararat»

    When you need to implement a new system with tight latency requirements, you face the problem of selecting the right implementation ecosystem. Most low-latency systems nowadays are implemented in C/C++. There are upcoming rivals like Golang. But one contender is not so popular in the mainstream: Rust.

    I’ll tell the story of implementing an exchange matching engine in Rust, which started in 2018 and has been in production for more than 3 years now. We faced different problems, from hiring the team to selecting the right architecture and libraries and then testing methods. We’ll cover lots of practical questions, like: When is it reasonable to consider Rust? How to hire Rust developers? What kind of training does the team need? What kind of architecture does Rust force? How can you benefit or lose with the ecosystem? How do your ways of reasoning change?

    The talk was accepted to the conference program

  • Databases and storage systems (11)

    Daniël van Eeden

    PingCAP

    An introduction to TiDB

    15 December, 15:50, «03 Hall. Queen Erato»

    TiDB scales writes without adding extra load on the developers working with the database. It can also combine OLTP and OLAP workloads in a hybrid solution (also known as HTAP). And all of this while being compatible with the MySQL protocol.

    The talk was accepted to the conference program

    Chris Bohn

    MicroFocus LTD

    Designing a more efficient OLAP database data flow and architecture

    15 December, 10:00, «01 Hall. Tigran»

    Modern database systems feature OLTP databases for recording business facts and dimensions, and OLAP databases for data analytics. These database types feature different fundamental storage architectures. OLTP databases are designed for fast single-record lookup, while OLAP databases are designed for fast analytics like aggregation. This leads to different data storage approaches.

    To enable fast aggregation, OLAP databases usually feature immutable data storage containers, especially in cloud environments like AWS. This makes update and delete operations very expensive, because those immutable storage containers must be destroyed and rebuilt. Excessive updates and deletes can severely impact OLAP database performance. Most transactional middleware applications make use of frequent update and delete operations, which have less performance impact on OLTP databases compared to OLAP. Most businesses feature OLTP databases for running the business, the transaction data then feeding OLAP databases to subsequently analyze the business. OLTP and OLAP databases need to live together nicely, but there is an impedance mismatch due to the data storage differences.

    At MicroFocus, we set about to minimize the impedance mismatch. We settled on a data flow design where all data loaded into Vertica (our OLAP database) is append-only. We also determined that data integrity would be inherited from the upstream OLTP systems - so why do it again? We thus decided to use no primary or foreign key constraints, because the benefits would be redundant. This allows for much faster ELT data loading and query processing because there is no constraint checking. Again, we are inheriting constraint checking from the upstream OLTP databases, which are much better suited for that. By accepting that the upstream OLTP database has already done referential integrity checks, our OLAP database is freed from the constraint checking overhead. That is a large performance gain.

    As mentioned, our ELT is append-only. That means that our Vertica OLAP database has all the iterations of all the records. That means we have a complete record of the whole change history for all the records. The change history has become a hot topic in data analytics because the effect of changes to things such as product description and correlating that to sales revenue is an important data point. Keeping complete change history is becoming essential to data analytics. The OLTP/OLAP design that MicroFocus has taken yields efficient OLAP database performance and retains change history - an important win that comes at no cost.

    Summary: The design approach we have taken at MicroFocus with our OLTP/OLAP design has yielded a robust and performant holistic system that enables our Vertica OLAP EDW to perform at its potential, while providing benefits like change history.

    The talk was accepted to the conference program

    Daniil Gitelson

    OW Service

    How we cook Foundation DB

    16 December, 17:00, «01 Hall. Tigran»

    FoundationDB is a low-level ACID database with nice guarantees, designed as a ‘foundation’ for high-level DBMSs. None was actually created, so we had to roll our own.

    FDB is a simple key-value ACID database with about 6 operations. So we had to build a high-level API around it supporting:
    – Document-like storage with indexes
    – Time-based and client-centric partitioning of historical data (e.g. payments history)
    – Queues

    We evolved this layer from a simple Kotlin library to a separate service. In this talk I will speak on how FDB works and how we implemented that layer squeezing max performance out of it.

    The talk was accepted to the conference program

    Arshak Matevosyan

    PicsArt

    How to create a fully transparent MongoDB database cluster holding terabytes of data serving hundreds of millions of users simultaneously

    16 December, 14:40, «04 Hall. Ashot Yerkat»

    Working in a fast-growing hybrid microservice infrastructure, our team had to overcome challenges when the databases received an enormous number and variety of queries, which could slow down or even crash the servers.

    This experience was unacceptable as database layer issues can cause slowness or even unresponsiveness of the whole application.
    The problems can be very different, starting with enormous amounts of bulk queries and ending with unstructured queries or even queries with infinite loops.

    Our team developed a solution that allows us to analyze and monitor what's going on in the database using open-source tooling and provide monitoring capabilities for application engineers to see how their queries perform in real time.

    The talk was accepted to the conference program

    Vladislav Pyatkov

    GridGain

    How did we build rebalance into the distributed database architecture

    15 December, 12:20, «02 Hall. Ararat»

    I will describe how the rebalance procedure has changed with the development of replication protocols, and how former processes were modified in the new circumstances. All the material is based on experience of maintaining and developing Apache Ignite.

    The talk was accepted to the conference program

    Vladimir Bukhonov

    Miro

    Miro canvas content migration from Postgres to the in-memory DB + S3

    16 December, 17:00, «04 Hall. Ashot Yerkat»

    Migrating canvas data at Miro is a long and unpredictable process. The current migration is already the second one at Miro. In my presentation, I want to explain why and how we moved the contents of our canvas from one storage to another and why we arrived at this final decision.
    I’ll tell you about the criteria for choosing databases, how we compared them, and why we finally came to the conclusion that it’s better to write your own solution.
    I will talk about the silly but non-obvious data errors that we encountered while integrating our new database, as well as the limitations that arose as a result of moving to an in-memory database and how we got around them.
    As a bonus, I’ll tell you how we saved about 90% of the financial costs of the database infrastructure.

    The talk was accepted to the conference program

    Konstantin Osipov

    ScyllaDB

    NoSQL and transactions: getting the numbers out

    15 December, 11:10, «01 Hall. Tigran»

    We built an open-source instrument that benchmarks transactional workloads with multiple NoSQL vendors in the cloud. In this talk I’ll present the tool, the method, and the benchmarking results. MongoDB, CockroachDB, FoundationDB and YDB are covered to varying extents.

    In an effort to provide both consistency and scalability the NoSQL ecosystem has been rapidly providing transaction support. From pioneers, focused squarely on scaling transactional workloads, to followers, adding transaction support to eventually consistent data stores, more and more vendors are trying to get on board of the relational database train. The idea of serverless scalability of a strongly consistent workload is quite attractive, so we evaluated a few vendors from the cost per transaction perspective, comparing their performance with one popular open-source relational database. In this talk I’ll present the method, the tool, which we made available online, and the evaluation results.

    The talk was accepted to the conference program

    Jordan Pittier

    Gorgias

    PostgreSQL a journey from 0Tb to 40Tb in 4 years

    15 December, 13:30, «02 Hall. Ararat»

    PostgreSQL is an amazing database, very versatile and capable of processing both online analytics and transactional workloads. Yet operating PG past a certain scale (>1Tb) is challenging and mistakes are costly. In this talk, we will share our experience as our PG databases grew from 0Tb to 40Tb.

    The talk was accepted to the conference program

    Anton Zhukov

    ManyChat

    The 2% Solution

    16 December, 15:50, «04 Hall. Ashot Yerkat»

    After we started receiving insane AWS invoices for our cold events databases, we decided to optimize the data and store it as compact, encrypted and compressed chunks. I’ll tell you about an engineering way of solving the task without using any ready-made solution like a branded database or data storage.

    In this talk I will share the full history of creating a custom data storage. I will speak about two sides of the process: the simple concept of integrating a stateless component beside a database driver, and the complexity of a migration with parallel processes where each mistake has a monthly expected cost. I will cover the research, development, and troubleshooting of the custom data storage, which allowed us to cut our costs down to 2% of those of the original PostgreSQL instance.

    The talk was accepted to the conference program

    Alexey Palazhchenko

    FerretDB Inc

    Building an open-source MongoDB-compatible database on top of PostgreSQL

    16 December, 15:50, «01 Hall. Tigran»

    MongoDB is a life-changing technology for many developers, empowering them to build applications faster than using relational databases. However, MongoDB abandoned its open-source roots, changing the license to SSPL and making it unusable for many open-source and commercial projects. We decided to change that, so we started working on FerretDB – an open-source proxy written in Go. It accepts connections and handles queries from unmodified MongoDB clients, and stores data in PostgreSQL.

    In my talk, I will briefly discuss our reasoning for starting this project, our vision, and our plans for the future. I will also cover a lot of technical aspects of FerretDB, such as:
    • How did we implement the MongoDB wire protocol?
    • How do we store MongoDB/BSON documents in PostgreSQL/jsonb columns?
    • How do we query and filter data using SQL, and what problems have we encountered?
    • How do we test our implementation?
    • And others.

    The talk was accepted to the conference program

    Igor Loban

    Toloka.ai

    Transactional queues in PostgreSQL

    16 December, 14:40, «01 Hall. Tigran»

    Most modern web applications use external message brokers like RabbitMQ, Kafka, or something similar. Developers have to solve the problem of atomically changing data in a DB and sending a message to a queue.

    In my talk, I’ll show the Transactional Outbox pattern, how it solves the problem and our recipe for its reliable implementation for PostgreSQL.
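
    The Transactional Outbox idea can be sketched in a few lines. This is not the recipe from the talk: it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table names, the payload format and the `publish` callback are all hypothetical.

```python
import json
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "sent INTEGER NOT NULL DEFAULT 0)"
    )


def place_order(conn: sqlite3.Connection, item: str) -> int:
    """Write the business row and the outgoing message in ONE transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        order_id = cursor.lastrowid
        payload = json.dumps({"order_id": order_id, "item": item})
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))
    return order_id


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Send pending outbox rows to the broker and mark them as sent.

    A crash between publish() and the UPDATE re-sends the message on the
    next run, so delivery is at-least-once and consumers must deduplicate.
    """
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(payload)
        with conn:
            conn.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    create_schema(connection)
    place_order(connection, "book")
    relay_once(connection, print)  # print stands in for a real broker client
```

    The key point is that the application never talks to the broker inside the business transaction; a separate relay reads the outbox table and publishes from there.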

    The implementation is pretty challenging, so a common recommendation is to use a ready-to-go solution like PgQ. But PgQ has some disadvantages: it requires a daemon process, provides generic queues that are redundant, lacks documentation, and doesn’t fit every setup (it may be unavailable for managed PostgreSQL).

    The talk was accepted to the conference program

  • BigData and Machine Learning (6)

    Dmitrii Kamaldinov

    Qrator Labs

    On one interesting generalization of the Leaky Bucket algorithm and Morris's counters

    15 December, 14:40, «04 Hall. Ashot Yerkat»

    The task of reducing the intensity of the event flow often arises in practice. Often it takes the form of limiting internet traffic to reduce the load on a particular service.

    In my talk I will cover a slightly more complicated problem: reduce the intensity of the flow by removing only the most frequent elements (or, equivalently, by removing as few unique elements as possible).

    It turns out that this alternative approach to rate limiting is quite reasonable in some cases. The talk will cover a number of application examples, including several from our experience with traffic filtering at Qrator Labs. We will discuss the pros and cons of this approach and compare it with rate-limiting tools such as NGINX.

    I will propose an algorithm developed by our team at Qrator Labs which solves this problem. Based on the famous Leaky Bucket algorithm, the algorithm is incredibly simple and requires no prior knowledge from the audience. Yet we find it quite interesting and elegant: it spends O(1) time on each element, despite the apparent complexity of a task that combines rate limiting with the search for the most frequent elements (so-called heavy hitters), which at first glance would require some kind of sorting.
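
    For readers unfamiliar with the starting point, here is a minimal sketch of the classic Leaky Bucket used as a meter. It is the textbook algorithm, not the generalization presented in the talk; the class name and parameters are chosen for illustration.

```python
class LeakyBucket:
    """Classic leaky bucket: each event adds one unit, the bucket drains
    at a constant rate, and an event is rejected if it would overflow."""

    def __init__(self, capacity: float, leak_rate: float):
        self.capacity = capacity    # maximum burst size, in events
        self.leak_rate = leak_rate  # events drained per second
        self.level = 0.0
        self.last = None            # timestamp of the previous event

    def allow(self, now: float) -> bool:
        # O(1) per event: drain for the elapsed time, then try to add one.
        if self.last is not None:
            drained = (now - self.last) * self.leak_rate
            self.level = max(0.0, self.level - drained)
        self.last = now
        if self.level + 1.0 <= self.capacity:
            self.level += 1.0
            return True
        return False


bucket = LeakyBucket(capacity=2, leak_rate=1.0)
print([bucket.allow(t) for t in (0.0, 0.0, 0.0, 1.0, 1.0)])
```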

    In the second part of my talk, I will give a brief overview of Morris’s counters that allow counting a large number of events using a small amount of memory by introducing a probabilistic approach to updating when an event hits.

    These counters can find their application in systems where the memory is a critical resource (in particular as a part of the algorithm mentioned above). In addition to describing the working principle and properties of the classical Morris’s counters, the talk will also present some novel ideas obtained in the course of our research.
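
    A minimal sketch of the classical Morris counter follows. It stores only an exponent, increments it with probability 2^(-exponent), and estimates the count as 2^exponent - 1. The novel ideas mentioned in the abstract are not reproduced here, and the class name is an assumption.

```python
import random


class MorrisCounter:
    """Approximate counter that keeps only a small exponent in memory."""

    def __init__(self, rng=None):
        self.exponent = 0
        self._rng = rng or random.Random()

    def increment(self):
        # The larger the exponent, the less likely an event changes it.
        if self._rng.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def estimate(self) -> int:
        # Unbiased estimate of the number of increment() calls.
        return 2 ** self.exponent - 1


counter = MorrisCounter()
for _ in range(10_000):
    counter.increment()
print(counter.exponent, counter.estimate())
```

    A single counter has high variance; in practice several independent counters are usually averaged to tighten the estimate.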

    The talk was accepted to the conference program

    Photo

    Dmitry Petrov

    Data Version Control (DVC)

    ML experiment tracking with VScode, Git and DVC

    16 December, 12:20, «03 Hall. Queen Erato»

    The machine learning space brings extra challenges: hundreds or thousands of ML experiments must be tracked, along with the large datasets involved. This tracking can be done right from VS Code using the DVC extension. We will show how Git can be used as a source of truth for ML experiments.

    The machine learning space brings extra challenges: hundreds or thousands of ML experiments must be tracked, along with the large datasets involved. This tracking can be done right from the VS Code editor using the DVC extension. We will show how Git can be used as a source of truth for ML experiments and how teams can collaborate by sharing modeling code and metrics using Git and GitHub.
    ML teams will learn how to use existing tools such as Git, GitHub, or GitLab, and how to collaborate better with software engineering and DevOps teams.

    The talk was accepted to the conference program

    Photo

    Roman Grebennikov

    An independent search engineer

    Building an open-source online Learn-to-Rank engine

    15 December, 15:50, «04 Hall. Ashot Yerkat»

    Building a CTR-optimized ranker takes ~6 months. Most of the time you will be gluing different ML libraries together, repeating the same mistakes everyone made before.

    We got tired of it and made Metarank: an open-source LTR service that handles 90% of the most typical ranking tasks in only 10% of the time.

    The talk was accepted to the conference program

    Photo

    Roman Smirnov

    Exness

    Machine Learning in the audio domain: when the neural network is overkill or where are the limits of lightweight models

    15 December, 13:30, «03 Hall. Queen Erato»

    Machine learning engineers and data scientists typically use neural networks when the task involves media data: text, images, sound, or voice. There are many great pretrained architectures for voice processing, e.g. Wav2Vec2 or Whisper. However, such models are huge and either require expensive computational resources or take too long to process data. I am going to describe several audio processing tasks, from classification and regression on audio sequences to diarization and speech recognition, with a focus on the first two. I will present experiments on poor and rich datasets that solve these tasks with a lightweight gradient-boosted decision tree model and with the pretrained Wav2Vec2 neural network (the current SotA in many voice processing tasks). My main goal is to discuss where the limits of gradient boosting algorithms lie in the audio domain.

    The talk was accepted to the conference program

    Photo

    Anatoly Starostin

    Yandex Plus Funtech

    Machine learning in media services

    16 December, 13:30, «03 Hall. Queen Erato»

    The report examines the technological problems faced by modern media services and shows how machine learning helps to cope with them. We will talk about a whole range of technologies used in Yandex media services, such as music recognition from short and noisy audio fragments, recognition of actors’ faces in movie frames, full-text search over musical compositions, etc. The recently released music generation technology will also be discussed. Examples from real services with multi-million audiences will be given.

    The report provides an overview of the technologies used in Yandex Media Services for media data processing and discusses the role of machine learning and crowdsourcing methods in the implementation of each of them. Some of these technologies work directly with audio and video, and some, in contrast, use only their metadata (usually text). Examples of both cases will be given. We will talk about recognizing a musical composition from short audio fragments recorded from the microphone of a client device or taken from the audio track of a movie. The recognition of actors' faces in movie frames will also be discussed. We will also cover several tasks that support the music scenario of the Alice voice assistant and that required machine learning or crowdsourcing techniques to implement. Finally, we will present the technology of automatic music generation, which became the basis for a new product of the Yandex Music service called Neuromusic. This technology is a hybrid of algorithmic methods based on expert knowledge and machine learning methods. Machine learning is used to generate melodic fragments, which are later incorporated into an algorithmically controlled musical canvas. The report discusses the structure of the technology in general and the generation of melodies in particular.

    The talk was accepted to the conference program

    Photo

    Ashot Vardanian

    Unum.cloud

    Designing the fastest ACID Key-Value Store

    15 December, 12:20, «03 Hall. Queen Erato»

    One node. One CPU socket. 20 GB/s of mixed random I/O in ACID transactions in persistent memory on 10 TB+ collections. In production. 35 GB/s in the lab.

    How did Unum reach those numbers? What can GPUs bring to the table? And how does the Linux kernel stand in our way?

    The talk was accepted to the conference program

  • Neural networks (1)

    Photo

    Alexey Voropaev

    Evocargo

    Perception system of a truly autonomous truck

    15 December, 17:00, «03 Hall. Queen Erato»

    Autonomous cars wear a bunch of sensors to detect obstacles in time, day or night. But perception isn’t only about cameras or lidars. We train neural networks, design data pipelines, annotate data (not draining your budget), build the infrastructure (surely, with micro-services at its base).

    Object: autonomous light trucks without a driver’s cabin that transport cargo 24/7
    Setting: an enclosed area at a logistics hub
    Our mission: to develop a perception system for such trucks

    I’ll talk about how we have solved major challenges in building a perception system and learned to…
    - detect people, cars, e-scooters, etc.
    - recognize debris on the roadbed to exclude it from the drivable area
    - annotate data used in neural network training, and reduce annotation costs
    - and, finally, ensure that algorithms and calculations won’t fry the onboard computer.

    Bonus: No bragging about theoretical concepts or how others do it — I’ll show you how our autonomous vehicles see the world based on our own experience in the fields and give real-life examples of how they operate at our clients’ sites.

    The talk was accepted to the conference program

  • Enterprise Systems Performance (4)

    Photo

    Igor Solovyov

    Yandex

    Level up your Optimization Process: How to Implement Distributed Profiling and Why you Want to Have It

    15 December, 10:00, «04 Hall. Ashot Yerkat»

    In the Yandex advertising infrastructure team we have learned how to solve some problems related to profiling during optimization. The solution is quite different from the more common trace-based approaches such as Jaeger and from local profiling with tools like gdb.

    I’m going to cover the following topics in my talk:
    — What is distributed profiling
    — How distributed profiling works in Yandex Recommendation Systems Infrastructure Service
    — Why you might want to use distributed profiling even if you know how to profile locally
    — How to implement distributed profiling even if you don’t know how to profile locally
    — Interesting features for profiling
    — Usage examples

    The talk was accepted to the conference program

    Photo

    Daniel Podolsky

    Microavia

    TTM as a main KPI: pain and humiliation

    15 December, 17:00, «01 Hall. Tigran»

    First of all: I’m an engineer, and I look at almost everything from an engineer's perspective.
    We all know how hard work in a go-go-go project can be. I have been there as a developer, lead, and CTO, and I remember every lovely day of it.
    And I think we can make the situation better!

    The talk was accepted to the conference program

    Photo

    Pavel Lakosnikov

    Avito

    Documentation as a way not to fail in microservices

    15 December, 11:10, «04 Hall. Ashot Yerkat»

    Without a good documentation process, your microservice architecture is doomed. A documentation process is the best way to share context between teams as you create lots of microservices.

    You’ll see what exactly engineers want to see in good documentation to be successful and happy.

    Different types of code generation processes will be in focus.

    The talk was accepted to the conference program

    Photo

    Daniele Frasca

    ProSiebenSat.1

    Blazing fast serverless with Rust

    16 December, 11:10, «02 Hall. Ararat»

    Even if you can use almost any language to build your serverless app, some choices provide significant advantages in terms of speed, which translates into more cost-effective functions. In this talk, we explore how the Rust programming language pairs with AWS Lambda's millisecond billing.

    As a developer, I am asked to optimize the following:
    - Runtime bootstrap
    - Code execution
    - Choice of processor architecture

    Being a Serverless engineer means I need to take care of many moving parts, software, architectures, security and so on.

    Best practices for developing AWS Lambda functions include:
    - Keep the Lambda package as small as possible;
    - Initialize classes, SDK clients, and database connections outside of the function handler;
    - Cache static assets locally in the /tmp directory.

    Doing all this, and more, saves execution time and cost on subsequent invocations (warm starts).
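
    The practices listed above can be illustrated with a small handler. The talk itself is about Rust; this sketch uses Python only for brevity and runs without any AWS dependency. `ExpensiveClient`, `load_static_asset`, and the asset file name are hypothetical stand-ins for a real SDK client and a real download.

```python
import os
import tempfile


class ExpensiveClient:
    """Hypothetical stand-in for an SDK client or database connection."""
    instances = 0

    def __init__(self):
        ExpensiveClient.instances += 1  # count how often setup runs

    def lookup(self, key):
        return f"value-for-{key}"


# Module scope runs once per execution environment (the cold start),
# so warm invocations reuse the same client instead of rebuilding it.
CLIENT = ExpensiveClient()
CACHE_DIR = tempfile.gettempdir()  # resolves to /tmp on AWS Lambda


def load_static_asset(name: str) -> str:
    path = os.path.join(CACHE_DIR, name)
    if os.path.exists(path):  # warm start: reuse the cached copy
        with open(path) as f:
            return f.read()
    content = f"generated:{name}"  # placeholder for an expensive download
    with open(path, "w") as f:
        f.write(content)
    return content


def handler(event, context):
    asset = load_static_asset("lambda_sketch_asset.txt")
    return {"result": CLIENT.lookup(event["key"]), "asset": asset}


print(handler({"key": "a"}, None))
print(handler({"key": "b"}, None), ExpensiveClient.instances)
```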

    When we are talking about serverless, speed is essential. I am not talking about spending hours of development time saving a few ms, but I am talking about the relationship between speed and cost.

    Speed is essential in a hyperconnected world where everything must be available in real time, and for the few who apply this concept at scale, it results in a lower cloud bill.

    Rust is the key to unlocking the performance of the next generation of serverless applications. Thanks to the Single Responsibility Function that reduces the code complexity, Rust has become just a tool, another language syntax that allows me to increase the throughput while reducing cost.

    The talk was accepted to the conference program

  • DevOps and Maintenance (7)

    Photo

    Chris Travers

    DeliveryHero SE

    Lessons Learned from Running Infrastructure at Scale: Human Factors

    16 December, 14:40, «02 Hall. Ararat»

    We all know that the people who run infrastructure at scale are critical to many organizations’ success. While frameworks like Google’s Site Reliability Engineering (SRE) framework have become popular in recent years, there is still a lack of focus on the human factors. This talk attempts to change this.

    For the last half decade, I have been running infrastructure at high, even massive scales. While Google’s SRE framework helps to merge the systemic and business needs, what of the human needs? What can we do to help ensure that people are successful and well supported? This is extremely important as high velocity systems tend to be extremely difficult to reason about at scale, and individuals may have difficulty determining how to react during emergencies. And yet it is the human factor that keeps things running.

    In this talk we will cover:
    - Limitations of the SRE system at scale
    - The need for human factors training of operational staff
    - Collaboration at the heart of incident management
    - Importance of Crew Resource Management in Operations At Scale
    We will also cover some low-hanging fruit that you can take away for your own operational environments, including writing standard operating procedures for late-night incident response, standardizing emergency communication, and separating incident command from troubleshooting.

    When I was heading the IT Operations department at Adjust, I looked at what we lacked and concluded it was human factors training. We brought in aviation-grade training in this area and this was a massive help. Many of the lessons the aviation industry has learned at the cost of loss of life we can apply in modified form adapted to our industry. I am here to pass on problems I have seen along with solutions which this and other training from other fields have brought me to implement.

    The talk was accepted to the conference program

    Photo

    Addison Schultz

    Miro

    Developing a best-in-class deprecation strategy for your features or products

    16 December, 10:00, «02 Hall. Ararat»

    Nobody likes ambiguity—especially when it comes to the stability of an endpoint or a feature, and the expectations for availability long term. Avoid common pitfalls and explore a critical area where trust is built with developers through thoughtful, best-in-class deprecation strategy.

    The talk was accepted to the conference program

    Photo

    Viktor Vedmich

    AWS

    Karpenter: Efficient scaling of Kubernetes clusters

    16 December, 17:00, «03 Hall. Queen Erato»

    Karpenter is the new groupless cluster autoscaler that can dramatically improve the efficiency and cost of running workloads on your cluster.

    The cloud is all about elasticity and right-sizing. Microservice architectures lend themselves to containers, and Kubernetes is arguably the most common orchestrator for running them. These containers, being (usually) stateless, are good candidates for cloud elasticity, so scaling them should be a well-known task. We’ll talk about different scaling approaches and focus on Karpenter, the new groupless cluster autoscaler that can dramatically improve the efficiency and cost of running workloads on your cluster.

    Note: the session will include a demo that compares Karpenter with Cluster Autoscaler and shows how to work with Karpenter.

    The talk was accepted to the conference program

    Photo

    Andrei Kvapil (kvaps)

    Palark

    KubeVirt, its networking, and how we brought it to the next level

    16 December, 15:50, «03 Hall. Queen Erato»

    Short abstract
    When choosing KubeVirt as our main virtualization solution, we were unsatisfied with the existing networking implementation. We developed and contributed some enhancements to simplify the design and get the most performance out of the network using KubeVirt.

    Full abstract
    In this talk I’ll show you how the network operates in KubeVirt as well as the unique features we implemented to enhance it. The following topics will be covered:
    - How to run mutable VMs in immutable K8s Pods;
    - What is the difference between KubeVirt and traditional cloud platforms;
    - What’s wrong with networking in KubeVirt;
    - How a live VM migration is performed;
    - Communication with the community and contributing process.

    The talk was accepted to the conference program

    Photo

    Vadim Ponomarev

    cloudification.io

    You need Cloud to manage Cloud: Kubernetes as best way to manage OpenStack cloud

    16 December, 11:10, «04 Hall. Ashot Yerkat»

    We will discuss OpenStack as a microservice application: what you will face if you want to launch your own OpenStack-based cloud, what problems arise when you deploy such a large and complex system, how to maintain it, and how to provide a highly available cloud for your clients. We will also look at how Kubernetes can (or cannot) help with this. Drawing on his experience, Vadim will cover the tricks, the most common mistakes, and how they can be solved.

    The talk was accepted to the conference program

    Photo

    Igor Latkin

    KTS

    How we reduced logs costs by moving from Elasticsearch to Grafana Loki

    16 December, 13:30, «01 Hall. Tigran»

    An Elasticsearch cluster with billions of log lines can consume terabytes of disk space. Grafana Loki can be a good candidate for storing and querying logs in large environments. In this talk we will focus on maximizing Loki’s performance and on transferring logs to it efficiently.

    For a long time, the Elastic Stack was the de facto standard for collecting and processing logs for Kubernetes clusters. However, it is known to be pretty demanding on computing resources such as CPU, RAM, and disk. New players have therefore appeared on the market offering alternative solutions, and one of them is Grafana Loki. If you decide to change your logging stack, there are several problems and questions that need to be answered.

    In this talk I will share our experience at KTS of migrating logs from an Elasticsearch cluster to Loki: what difficulties we encountered along the way, how we solved them, and how much money we saved in the end.

    We’ll also discuss topics such as:
    * Architectural differences between the ELK/EFK stack and Grafana Loki
    * How Loki allows you to save a lot on the logging infrastructure
    * How to avoid cloud provider vendor lock-in - here we will analyze the principles of boltdb-shipper and its interaction with S3 storage
    * What “knobs” you can tweak in the Loki configuration for it to work at maximum performance
    * And most importantly - what to do if the logs are currently in an Elasticsearch cluster, and how to transfer them to Loki in a reasonable time - I will share our own experience and solution.

    The talk was accepted to the conference program

    Photo

    Anton Bystrov

    Percona/Simbirsoft

    How to create dashboard-"story" for highload

    16 December, 12:20, «04 Hall. Ashot Yerkat»

    I'll tell the story of how we created a new home dashboard for monitoring deployments of 100, 200, 300, or more nodes: how we caught performance issues in the old version and which methods we used to find the bottleneck. I also want to cover the differences between monitoring strategies and how we combined their methods in our new dashboard. In the end we solved our main performance problem and created a new dashboard with a story that lets you drill down to a more detailed level.

    The talk was accepted to the conference program

  • Security, DevSecOps (4)

    Photo

    Alon Kiriati

    Dropbox

    The clashes of the titans - Usability vs. Security. Can they live together?

    16 December, 13:30, «02 Hall. Ararat»

    Security is critical for every app, but it can also be annoying and complicate simple flows. In this talk we will see how using a smart algorithm can make your UX simple without compromising security, by focusing on an example use case: password strength estimation.

    Security is critical for each and every app. One mistake or one breach can kill a business. However, security can sometimes be annoying and complicate simple flows. In this talk we will dive into one of these use cases: password strength. We will cover the importance of your customers’ passwords, learn how to estimate their strength, and see tools that will help you evaluate it. We will also discover the trade-offs between product and security, and how using a smart algorithm can make your UX simple without compromising security.
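
    As a rough illustration of what "estimating a password" can mean, here is a deliberately naive sketch based on character-pool entropy plus a tiny common-password list. It is not the algorithm from the talk; the function name, the word list, and the pool sizes are assumptions, and real estimators model dictionary words and patterns far more carefully.

```python
import math
import string

# Tiny illustrative list; real tools ship lists with many thousands of entries.
COMMON = {"password", "123456", "qwerty", "letmein"}


def estimate_bits(password: str) -> float:
    """Very rough strength estimate in bits (higher is stronger)."""
    if not password:
        return 0.0
    if password.lower() in COMMON:
        # An attacker tries the common list first, so only the list size counts.
        return math.log2(len(COMMON))
    classes = (
        string.ascii_lowercase,
        string.ascii_uppercase,
        string.digits,
        string.punctuation,
    )
    pool = sum(len(cls) for cls in classes if any(ch in cls for ch in password))
    if any(all(ch not in cls for cls in classes) for ch in password):
        pool += 32  # assumed allowance for spaces and non-ASCII characters
    # Brute-force search space is pool ** length; take log2 of that.
    return len(password) * math.log2(pool)


for candidate in ("password", "abcdefgh", "Tr0ub4dor&3"):
    print(candidate, round(estimate_bits(candidate), 1))
```

    The UX point is that a score like this can replace rigid composition rules: the form can accept any password whose estimate clears a threshold.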

    The talk was accepted to the conference program

    Photo

    Vas Soshnikov

    Quantil Inc.

    How to handle with DDoS attack on CDN-, MDM-, Edge- networks

    15 December, 14:40, «01 Hall. Tigran»

    Here is the short answer: if somebody wishes to launch a DDoS attack on your infrastructure, they can probably do so given enough resources. But the good news is that there are methods to avoid a huge impact. The key idea behind this lecture is to show protection methods, including new capabilities provided by the Linux kernel.

    The talk was accepted to the conference program

    Photo

    Edgar Mikayelyan

    Qrator Labs

    Evolution of Distributed Denial of Service attacks on the Internet: 1994 up to the present

    15 December, 13:30, «01 Hall. Tigran»

    Working with customers from a variety of industries, our team has accumulated a large body of data about the evolution of tools for organizing DDoS attacks and methods of mitigating them. I will share our key findings on DDoS attacks and our vision of their future development.

    At Qrator Labs we are deeply engaged in research on DDoS attacks, real-time traffic filtering, and bot activity, extending our expertise in these core competencies year by year. In my keynote presentation I would like to present some results and conclusions of our long-running study of network security and business resilience:
    - The most important milestones in the evolution of DDoS attacks, in terms of technology and public perception of the problem: Panix, Sony, XboxLive/PSN, Mirai, Memcached.
    - How tools for parsing web resources and APIs developed alongside DDoS attacks, and how scraper bots were created and grew in sophistication and scale of use.
    - Technologies and innovations that improve our life on the Internet while breaking new ground for DDoS and bot attacks.
    - Lessons we learned and conclusions we drew from many years of experience in this field.

    The talk was accepted to the conference program

    Photo

    Artem Bachevsky

    Independent expert

    Breaking license

    16 December, 12:20, «02 Hall. Ararat»

    We often encounter licensed software but rarely dive into how its licensing works.

    In this talk we’ll discuss:
    - How is software licensed?
    - Pros and cons of different ways of software protection
    - How to break it?
    - And finally, how to develop unbreakable software protection and not to “break” your customers?

    The talk was accepted to the conference program

  • Internet of Things (1)

    Photo

    Alina Dima

    Amazon Web Services

    Moving beyond prototypes: Building resilience at scale in your IoT application

    15 December, 14:40, «02 Hall. Ararat»

    If 1% of your 100-device fleet goes offline, it’s 1 device. Maybe your use case can live with that. But can your use case be bulletproof with 1% of 10 million devices (100,000) going offline? If the answer is no, then it is time to learn about resilience at scale.

    At scale, the overall health of your IoT application (Edge and Cloud) can be affected by events outside of your control. For example, the network provider drops the connection, or a high percentage of your fleet comes online at the same time. In the IoT problem space it is your responsibility as an engineer to handle not only the resilience of a single Edge application instance and its interaction with the Cloud, but also the collective resilience of tens of thousands of Edge application instances connecting to and communicating with your Cloud application, and the impact of them all performing or not performing an action simultaneously. Problems that seem minor at small scale, such as 1% of your fleet going offline and coming back online at the same time, might become major at large scale. Is your IoT application safe from a self-inflicted DDoS attack?

    This session will focus on explaining resilience at scale, and how scale uncovers problems you don’t see otherwise, and providing examples of mitigation strategies you can build with AWS IoT, to ensure that your IoT application in its entirety is operating reliably at scale.

    Key takeaways:
    - Understand what resilience at scale is, with concrete examples of what could go wrong
    - Learn how to ensure your IoT application is resilient at scale using AWS IoT
    - Take home a mental model for building resilience at scale
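
    One widely used mitigation for a fleet that comes back online all at once is exponential backoff with random jitter on the device side. The sketch below is a generic illustration rather than an AWS IoT API; the function name and default values are assumptions.

```python
import random


def reconnect_delay(attempt: int, base: float = 1.0, cap: float = 300.0,
                    rng=random) -> float:
    """Seconds to wait before reconnect attempt number `attempt` (0-based).

    The ceiling doubles with every failed attempt up to `cap`, and the
    actual delay is drawn uniformly below it, so devices that dropped at
    the same moment do not all reconnect at the same moment.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


# 1,000 simulated devices retrying for the fifth time spread out over
# roughly half a minute instead of hitting the endpoint simultaneously.
delays = sorted(reconnect_delay(5) for _ in range(1000))
print(round(delays[0], 2), round(delays[-1], 2))
```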

    The talk was accepted to the conference program

  • System administration, hardware (1)

    Photo

    Crux CONCEPTION

    CRUX CONCEPTION

    Insider Threat: What is Social Engineering?

    16 December, 10:00, «01 Hall. Tigran»

    Retired Criminal Profiler & Hostage Negotiator, Crux Conception has taken his years of training, education, and experience to develop a method that will allow individuals within The Tech Community to utilize social, people, and observation skills to detect potential theft and acts of company espionage.

    The method converts your ordinary social and observational skills into simple criminal/psychological profiling techniques.

    When we hear the word “ransomware,” is it possible that before a cyberattack is initiated and hackers/cyberthieves penetrate your online security system, someone on the inside offered valuable information to the hackers, giving them the ability to hold your company for ransom?

    Is it plausible that an individual or organization, worlds away, actually knows how much a company is willing to pay? Are we so unaware as not to consider that someone on the inside supported the hackers/cyberthieves with valuable information regarding your company’s security system and protocols?

    Is it possible that someone inside has specific information about how much a company is willing to pay, and its product is considered a valuable resource to its vast customer base?

    For example, what if AT&T had a disgruntled employee with pertinent knowledge of AT&T’s security system and protocols? To a social engineer, this is the perfect candidate to recruit, gather valuable data from, and then relay that information to a team of hackers/cyberthieves.

    The talk was accepted to the conference program

  • Video, streaming video (2)

    Photo

    Anton Kortunov

    Yandex

    Things they never tell you about video streaming

    15 December, 15:50, «01 Hall. Tigran»

    There are IT engineers who build video streaming services. There are video engineers who know how to shoot video. These two sets almost don’t intersect. In this talk I, as an IT guy, will share with you what I have learned from video guys over the last 5 years.

    Almost nobody knows how easy it is to degrade video quality inside the video processing pipeline. In my talk I’ll go through these sections:
    * why the audio track is more important than the video track
    * video is not just a sequence of images: how to use shutter speed on video cameras, stroboscopic effects
    * interlaced video: why it exists and how to deal with it
    * frame rate conversion and its disadvantages
    * image gamma and image resize: why you resize images incorrectly
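
    The last point can be illustrated with a small sketch. Pixel values are usually stored gamma-encoded (sRGB), so averaging them directly during a resize darkens the result; averaging should happen in linear light. The code uses the standard sRGB transfer functions; the helper names are chosen for illustration and the talk may present the issue differently.

```python
def srgb_to_linear(c8: int) -> float:
    """Decode an 8-bit sRGB channel value to linear light in [0, 1]."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(lin: float) -> int:
    """Encode linear light in [0, 1] back to an 8-bit sRGB channel value."""
    c = 12.92 * lin if lin <= 0.0031308 else 1.055 * lin ** (1 / 2.4) - 0.055
    return round(c * 255.0)


def average_naive(a: int, b: int) -> int:
    # What a careless resize does: average the encoded values directly.
    return round((a + b) / 2)


def average_linear(a: int, b: int) -> int:
    # Decode, average the actual light intensities, then encode again.
    return linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) / 2)


# Downscaling a black/white checkerboard by 2x averages 0 and 255.
print(average_naive(0, 255), average_linear(0, 255))
```

    The naive result is visibly darker than the linear-light one, which is why fine bright detail tends to dim when images are resized in gamma space.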

    The talk was accepted to the conference program

    Photo

    Olga Popova

    Yandex

    Reducing traffic wastage in video player

    15 December, 17:00, «04 Hall. Ashot Yerkat»

    Content delivery is expensive for video streaming. I will tell you how to reduce video traffic wastage from the client side by changing technical parameters of the video player. By varying the buffer size and the selected video quality in the player we'll save the company money without loss of QoE.

    In my talk I’m going to cover:
    1. Theory:
    - How do we lose traffic? Basic scenarios in terms of video player processes
    - Which aspects of the video player can we influence? The buffer size and the chosen bit rate
    - QoE (quality of experience metric) vs reducing data wastage
    - A few words about our logs. How do we connect traffic with product metrics?
    - The evolution of our traffic-reduction KPI

    2. Harsh reality (our hypotheses and results of real experiments):
    - Hypothesis: Buffer limit to X seconds
    - Hypothesis: Dynamic buffer
    - Hypothesis: Skippable fragments
    - Hypothesis: Viewport capping
    - Hypothesis: Aesthete capping
    - Hypothesis: SwitchUp capping

    3. Results and conclusions.
    Let’s talk about which hypotheses are suitable for online cinemas, and which are suitable for video hosting.

    The talk was accepted to the conference program

  • QA, Stress testing (2)

    Photo

    Evgeny Potapov

    DevOpsProdigy, USA

    10 mistakes of a (high)load testing in 2022

    15 December, 10:00, «02 Hall. Ararat»

    Surprisingly, professionally organized load testing is mostly found in enterprise projects, banking, and government systems. There, dedicated professionals and teams work specifically on load testing, and the process is well established.

    What is even more interesting is that load testing there is the domain of QA specialists: one group of people writes and runs the tests, and separate engineering teams act on the results. The situation is reminiscent of the divide between Devs and Ops before 2008.

    In commercial web development the situation is different: in most projects, except for very large ones, load testing is carried out on a best-effort basis, most often by the same engineers who developed the project. Time for it is allocated from whatever is left over, and test scenarios are often worked out “by eye”.

    While there are attempts to build load testing into CI/CD, this comes with its own challenges. Business people and management want builds and deploys to be as fast as possible, and adding load testing of a service is a huge overhead. Moreover, people are not actually ready to DDoS the production environment every time someone deploys. Load testing on such projects happens rarely, on special occasions, so software engineers don’t get the experience they would get from doing it regularly.

    The results of load testing in such cases might be completely incorrect, and the problem is not just the wrong numbers. Those wrong results might suggest that the sky is the limit, and business decisions get based on them: a huge marketing campaign that overloads the servers, a decision to refrain from investing in architecture scaling, or a decision to release the project when it is not ready to accept the traffic.

    Common mistakes:
    * The project was tested for 5 minutes instead of over a long period;
    * The test profile was defined incorrectly;
    * Staging/pre-prod environments were tested, while production has a completely different infrastructure;
    * The site returned HTTP 200 when it wasn't actually working;
    * Some of the microservices were tested in isolation, while in production they depend on each other;
    * And a million more reasons.

    In my presentation I want to go over the main problems we see in our work that lead to incorrect load testing results or to incorrect interpretation of them. I will explain how to avoid them from both a technical and an organizational point of view (in conversations with the business), and how to integrate testing into the regular development process, breaking down silos.

    The talk was accepted to the conference program

    Karen Mkhitaryan

    Ameriabank

    The Role of API Testing and the Importance of API Automation

    15 December, 11:10, «02 Hall. Ararat»

    We all write automated tests for our applications, but...

    1. Do we really understand the importance and the advantages of API test automation?
    2. Do we write API automated tests?
    3. Do we use API automation in UI tests?

    During the talk we will discuss many aspects of API automation and get familiar with it using Java and REST Assured.

    The talk was accepted to the conference program

  • Backup talks (1)

    Valentin Udaltsov

    Happy Inc.

    ID battle: UUID vs auto increment

    For almost eight years, while developing web applications, I used auto increments exclusively to identify entities. A few years ago I tried UUIDs in a pet project. Since then, my team and I have chosen UUIDs for identification in most cases. We learned how to correlate entities of different modules by ID, we took advantage of different UUID types, and we were among the first to use UUID v6 and v7.

    At the conference we will discuss the pros and cons of auto increments and different UUID versions in various situations, study database benchmarks, and find a new winner in the good old battle of identifiers.
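
    For readers unfamiliar with UUID v7, here is a minimal sketch of its bit layout as defined in RFC 9562: a 48-bit Unix timestamp in milliseconds, a 4-bit version, 12 random bits, a 2-bit variant, and 62 more random bits. The helper name `uuid7_sketch` is hypothetical and not taken from the talk, and the sketch makes no attempt at monotonic ordering within a single millisecond.

```python
import os
import time
import uuid


def uuid7_sketch(unix_ms=None):
    """Build a UUID with the version 7 layout from RFC 9562.

    Hypothetical helper for illustration; IDs created within the same
    millisecond are ordered randomly relative to each other.
    """
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    rand_a = rand >> 68                # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)    # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # bits 127..80: timestamp
    value |= 0x7 << 76                         # bits 79..76: version 7
    value |= rand_a << 64                      # bits 75..64: rand_a
    value |= 0b10 << 62                        # bits 63..62: variant
    value |= rand_b                            # bits 61..0: rand_b
    return uuid.UUID(int=value)


# Two IDs one millisecond apart: the later one always sorts after
# the earlier one, because the timestamp occupies the leading bits.
earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier)
print(later)
print("sorted by creation time:", earlier < later)
```

    Because the timestamp sits in the most significant bits, such IDs sort by creation time much like auto increments do, while still being generated without a round trip to the database.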

    The talk was accepted to the conference program

  • TechTalks (2)

    Oleg Bondar

    Yandex

    YDB — Open-Source Distributed SQL Database from Yandex

    There are many well-known open-source projects by Yandex: frameworks for writing your own services, like userver; complex, math-heavy machine learning projects, like CatBoost; frameworks for frontend developers, like DivKit; trained machine learning models (YaLM); resilient, scalable databases for high loads; and much more.
    In addition to open-source projects, Yandex is engaged in developing various standards. For example, thanks to the WG21 initiative, the new C++ standards now contain ideas and improvements suggested by our developers.
    I’m going to talk about open source and YDB — our open-source distributed SQL database.

    The talk was accepted to the conference program

    Anton Kortunov

    Yandex

    Videos in Yandex and where to find them

    Yandex services have been working with video for more than a decade. That includes video search, Kinopoisk, and even an experiment with our own small video hosting service.
    In 2017, Yandex Efir launched, built on the latest technology and a powerful infrastructure that primed it to handle Kinopoisk HD and all our other video projects.
    In this tech talk, we’ll be going back to the project’s genesis, looking at its infrastructure and development challenges, and taking a peek into the future to see what lies ahead for streaming services.

    The talk was accepted to the conference program