Computer Vision in Healthcare
AI decreases time and increases recall during routine CT examination
Arkady Sandler,
True Click Technologies
Stanislav Moiseev, Tinkoff
Large Language Model Fine-Tuning Acceleration with Data Reduction via Losses
Alexander Demidovsky,
Huawei RRI
Fast Implementation of the Node2Vec Algorithm
Foundation models in medical imaging.
Evgeny Sidorov,
Third Opinion Platform
Anastasia Semyonova, Smile2impress
Overview of Federated Learning Methods
Denis Afanasyev, CrossOverMarkets
Human-AI interaction in healthcare
Development and testing of an automated system for analysis of OCT retina images
Kirill Aksenov,
LLC PREDICT SPACE
Yury Chernyshov,
CyberLympha
Multi-Agent Reinforcement Learning - overview
Anton Plaksin,
Yandex Research
Reinforcement Learning in Zero-Sum Differential Games.
Andrey Filchenkov,
ITMO University
Deep Reinforcement Learning-based Congestion Control for File Transfer
Alexander Blokhin, Huawei
Pavel Braslavski,
Nazarbayev University
You Told Me That Joke Twice: A Systematic Investigation of Transferability and Robustness of Humor Detection Models
Linguistic and logical structures for text analysis
Maria Tikhonova,
SberDevices, HSE
mGPT: LLM speaking 61 languages including Georgian and Russian
Neovascular age-related macular degeneration (n-AMD) is a form of AMD that is responsible for most cases of severe vision loss. Anti-VEGF therapy, the gold standard for treating this pathology, is accompanied by OCT monitoring. However, this process is hampered by the lack of methods for accurately quantifying OCT images. The aim of this study is to develop an automated method for calculating the quantitative characteristics of the PED, SRF and IRF biomarkers and to evaluate its accuracy. The study material included OCT B-scans of patients with n-AMD and pigment epithelial detachment (PED) who underwent anti-VEGF therapy from 2014 to 2021. The OCT B-scans were obtained with a Cirrus HD-OCT 5000 device (Carl Zeiss Meditec). The neural network for OCT image segmentation was trained on datasets of 251 and 385 images in Experiments 1 and 2, respectively. Experts annotated the images in Labelme, highlighting the PED, SRF and IRF biomarkers. Data preprocessing included image resizing, normalization, and conversion to grayscale. The dataset was divided into training and validation sets. Retinal structures were segmented with the U-Net architecture, trained with the Adam optimizer and the categorical cross-entropy loss function. The algorithm for calculating quantitative biomarker characteristics was based on edge detection using the method of Satoshi Suzuki and Keiichi Abe. The test dataset used to assess the efficiency of the system, which combined the segmentation and quantitative calculation algorithms, included 241 images for which the length and height of the PED were measured by a physician using built-in software. The images were also labeled with one of 3 anatomical treatment outcomes: attached PED; non-attached PED; PED tear. The developed method for processing OCT images made it possible to segment the biomarkers PED, SRF and IRF with high accuracy.
The segmentation model shows the best results for PED (0.9) and also shows good accuracy for SRF and IRF (0.72 and 0.69) with the larger training set in Experiment 2. On the test set of patients with n-AMD, the automated algorithm for calculating quantitative characteristics of biomarkers showed no statistically significant difference from the physician's measurements. The study also showed that the attached and non-attached PED groups differed statistically significantly in the height, extent and area of the PED. In addition, IRF area may also be a predictor of PED tear, since its values are statistically significantly different for groups 2 and 3. Thus, automated segmentation and calculation of biomarkers can achieve performance comparable to an ophthalmologist in assessing the quantitative characteristics of biomarkers in cases of neovascular macular degeneration.
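As a rough illustration of the measurement step described above, the sketch below derives length, height and area from a binary segmentation mask. It is a simplified stand-in, not the study's method: it measures the extent of all foreground pixels directly instead of tracing contours with the Suzuki-Abe algorithm, it treats the mask as a single region, and the function name and the pixel spacing parameters are hypothetical.

```python
from typing import Dict, List


def measure_region(mask: List[List[int]],
                   mm_per_px_x: float,
                   mm_per_px_y: float) -> Dict[str, float]:
    """Length, height and area of the foreground (non-zero) pixels in a mask.

    Assumption: the mask contains one biomarker region; pixel spacings are
    supplied by the caller (hypothetical parameters, not from the study).
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        # Empty mask: nothing was segmented.
        return {"length_mm": 0.0, "height_mm": 0.0, "area_mm2": 0.0}
    n_pixels = len(cols)  # one entry per foreground pixel
    return {
        # Horizontal extent of the region.
        "length_mm": (max(cols) - min(cols) + 1) * mm_per_px_x,
        # Vertical extent of the region.
        "height_mm": (max(rows) - min(rows) + 1) * mm_per_px_y,
        # Pixel count times the area of one pixel.
        "area_mm2": n_pixels * mm_per_px_x * mm_per_px_y,
    }
```

For example, `measure_region(mask, 0.5, 0.25)` would report measurements in millimetres for a mask whose pixels are 0.5 mm wide and 0.25 mm tall.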
The main part of the presentation will address the problem of effective planning of radiation therapy. For planning, it is necessary to segment a large number of anatomical structures. The task of segmentation is complicated by the fact that 1) three-dimensional medical images are used and 2) the organs of patients are abnormal. For these reasons, the results of automatic segmentation require manual corrections. An approach will be presented to optimize the segmentation correction process in real time based on information about the doctor's view. In the additional part of the presentation, the problem of interpretability of deep models will be considered.
Radiologists dedicate more than half of their diagnostic time to interpreting computed tomography (CT) scans, with chest and abdominal scans being particularly detailed and time-intensive due to the need to meticulously identify and describe a variety of diseases. Our cutting-edge product simultaneously analyzes 10 different diseases in these scans, including disorders affecting the lungs, heart, bones, and abdominal regions. In this study, we demonstrate how introducing an AI-assisted study provides a substantial time-saving advantage and lessens the heavy workload currently borne by radiologists. Specifically, it saves up to 20% of the time spent on CT examinations (≈ 2.5 mins on average), and increases the average recall by over 29%, while preserving the same level of positive predictive value.
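Recall and positive predictive value, the two metrics cited above, can be computed from finding counts as in the minimal sketch below. The function names are hypothetical, and any counts passed in are for illustration only; none are taken from the study.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of actual findings that were reported: TP / (TP + FN)."""
    total = true_positives + false_negatives
    return true_positives / total if total else 0.0


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Share of reported findings that were correct: TP / (TP + FP)."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0
```

With made-up counts, a reader who finds 60 of 100 findings with 15 false alarms has a recall of 0.6 and a PPV of 0.8; finding 80 of 100 with 20 false alarms raises recall to 0.8 while PPV stays at 0.8.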
In this talk we will describe the challenges of congestion control for file transfer, propose an implementation of a congestion control algorithm based on Reinforcement Learning techniques, and show how it was applied in real life.
Over the past years, foundation models and LLMs have demonstrated enhancements in measurable aspects and the development of new qualitative features, creating a need for their comprehensive evaluation and analysis of the associated risks. To address these issues, we present MERA, a new instruction benchmark for evaluating foundation models oriented toward the Russian language. The benchmark encompasses 21 evaluation tasks for generative models. The talk presents the new evaluation methodology, an open-source code base for the MERA assessment, a leaderboard with a submission system, and the evaluated baselines' results.
This presentation aims to provide a comprehensive overview of Federated Learning, highlighting its recent developments, applications, and trends as of 2023. Federated Learning, a rapidly evolving field in machine learning, involves training algorithms across decentralized devices or servers while keeping data localized. The talk will commence with a brief introduction to Federated Learning, elucidating its core principles and significance.
Following this, the presentation will delve into various key cases and application areas, demonstrating the practical utility and versatility of Federated Learning in diverse sectors. A significant portion of the talk will be dedicated to discussing the advancements in this domain over the course of 2023. This examination is grounded in a thorough study of the general informational landscape on this topic, encompassing an analysis of thematic conferences, academic publications, updates to open-source tools, and GitHub repositories.
Additionally, the presentation will showcase a curated collection of news from companies developing solutions in this area, aiming to provide insights into the business and technological implications of these developments. A critical evaluation of the maturity level of Federated Learning technology will be offered, assessing its readiness for widespread adoption. This assessment will touch upon the challenges faced, potential risks, and the future prospects of Federated Learning, providing a well-rounded perspective on its current state and future trajectory.
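The core principle mentioned above, training across decentralized clients while data stays local, is often illustrated with federated averaging (FedAvg). The sketch below shows only the server-side aggregation step under simplifying assumptions: model parameters are flat lists of floats, and each client reports its local sample count. The function name is hypothetical, and the abstract does not say which methods the talk covers.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameters, weighting each client by its sample count.

    Only parameters and counts reach the server; raw data never leaves
    the clients.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            averaged[i] += value * size / total
    return averaged
```

In a full training loop the server would send the averaged parameters back to the clients, which continue training locally before the next aggregation round.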
Node2Vec is a widely used algorithm for learning feature representations of graph nodes. The algorithm is used intensively in many high-load applications, so its performance is very important. There are two reference implementations of Node2Vec, in C++ and Python, from the Stanford Network Analysis Project (SNAP). However, their performance is not optimal. We introduce an optimized implementation of the Node2Vec algorithm whose performance is 2.5-5.1 times higher than that of the reference implementations. We also show that the accuracy of the optimized algorithm stays the same by solving a multi-label node classification problem on several datasets.
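For readers unfamiliar with the algorithm, the sketch below shows the biased second-order random walk at the heart of Node2Vec, with return parameter p and in-out parameter q. It is a plain, unoptimized illustration for an unweighted undirected graph stored as adjacency sets; it is neither the SNAP reference code nor the optimized implementation from the talk, and the function name is hypothetical.

```python
import random
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]


def node2vec_walk(graph: Graph, start: int, length: int,
                  p: float, q: float, rng: random.Random) -> List[int]:
    """Generate one biased walk of up to `length` nodes starting at `start`."""
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(graph[current])
        if not neighbors:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # First step has no previous node, so choose uniformly.
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)   # return to the previous node
            elif candidate in graph[previous]:
                weights.append(1.0)       # stay at distance 1 from previous
            else:
                weights.append(1.0 / q)   # move further away from previous
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk
```

The resulting walks are then typically fed to a skip-gram model to learn the node embeddings; that step is omitted here.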
Linguistic and logical text structures are very useful for some applied tasks like dialogue generation, argument mining and fact verification. We will consider several cases of such tasks: multi-party dialogue generation by means of discourse structure and also fact correction based on information retrieval combined with logical reasoning.
Robust Reinforcement Learning (RRL) is a promising Reinforcement Learning (RL) paradigm aimed at training models that are robust to uncertainty or disturbances, making them more efficient for real-world applications. Following this paradigm, uncertainty or disturbances are interpreted as actions of a second adversarial agent, and thus the problem is reduced to seeking the agents' policies robust to any opponent's actions. This paper is the first to propose considering the RRL problems within the positional differential game theory, which helps us to obtain theoretically justified intuition to develop a centralized Q-learning approach. Namely, we prove that under Isaacs's condition (sufficiently general for real-world dynamical systems), the same Q-function can be utilized as an approximate solution of both minimax and maximin Bellman equations, and we also indicate conditions when this Q-function can be decomposed. Based on these results, we present the Isaacs Deep Q-Networks (IDQN) and Decomposed Isaacs Deep Q-Networks (DIDQN) algorithms, respectively. We analyze their performance by comparing them with other baseline RRL and Multi-Agent RL algorithms. We consider both simple environments with known accurate solutions and complex large-dimensional MuJoCo environments. In each experiment, we thoroughly evaluate the agents' policies obtained after learning, training opponents against them using various RL algorithms with various parameters. The experimental results demonstrate the superiority of the presented algorithms in all experiments under consideration.
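To make the minimax and maximin values mentioned above concrete, the toy sketch below computes both for a small table of Q-values at a single state, where rows are actions of the first agent and columns are actions of the adversary. It is not the IDQN algorithm and does not model a differential game; the convention that the first agent minimizes, and the function names, are assumptions for illustration.

```python
from typing import Sequence

QTable = Sequence[Sequence[float]]


def minimax_value(q: QTable) -> float:
    """First agent picks a row to minimize the adversary's best response."""
    return min(max(row) for row in q)


def maximin_value(q: QTable) -> float:
    """Adversary picks a column to maximize the first agent's best response."""
    n_cols = len(q[0])
    return max(min(row[j] for row in q) for j in range(n_cols))
```

For the table `[[3, 5], [1, 2]]` both values equal 2, so one table serves both equations. For `[[0, 1], [1, 0]]` the minimax value is 1 and the maximin value is 0, so the order of play matters. In general the maximin value never exceeds the minimax value.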
As industry needs to process growing amounts of training data, reduce the cost of fine-tuning a single model, and minimize the environmental effects, accelerating the fine-tuning of large language models (LLMs) has come into high demand. DAREL is a novel training data reduction method that selects training samples based on losses obtained from the model currently being trained or from a pre-trained one. The proposed method targets LLM fine-tuning and is designed primarily to be combined with parameter-efficient fine-tuning methods, such as LoRA. Computational experiments show its effect on the quality and time of LLM fine-tuning: DAREL gives an average 1.26x fine-tuning acceleration for GPT2-S, GPT2-M and GPT2-L on a variety of datasets, including E2E-NLG, DART and WebNLG, with an average BLEU drop of 1.44 p.p.
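The general idea of loss-based data reduction can be sketched as follows. This is not the DAREL method itself: the abstract does not state which samples are kept or how often losses are recomputed, so keeping the highest-loss fraction by default, along with every name and parameter here, is an assumption made for illustration.

```python
import math
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def select_by_loss(samples: Sequence[T], losses: Sequence[float],
                   keep_fraction: float, keep_highest: bool = True) -> List[T]:
    """Keep a fraction of samples ranked by per-sample loss.

    Assumption: `losses[i]` is the loss of `samples[i]` under the current
    or a pre-trained model. Original sample order is preserved.
    """
    if len(samples) != len(losses):
        raise ValueError("need exactly one loss per sample")
    if not 0.0 < keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not samples:
        return []
    n_keep = max(1, math.ceil(len(samples) * keep_fraction))
    # Rank sample indices by loss, then restore the original order.
    ranked = sorted(range(len(samples)), key=lambda i: losses[i],
                    reverse=keep_highest)
    kept = sorted(ranked[:n_keep])
    return [samples[i] for i in kept]
```

Training on the reduced set shortens each epoch roughly in proportion to `keep_fraction`, at the cost of whatever information the dropped samples carried.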
Automatic humor detection is a highly relevant task for conversational AI. To date, there are several English datasets for this task, but little research on how models trained on them generalize and behave in the wild. To fill this gap, we carefully analyze existing datasets, train RoBERTa-based and Naïve Bayes classifiers on each of them, and test on the rest. Training and testing on the same dataset yields good results, but the transferability of the models varies widely. Models trained on datasets with jokes from different sources show better transferability, while the amount of training data has a smaller impact. The behavior of the models on out-of-domain data is unstable, suggesting that some of the models overfit, while others learn non-specific humor characteristics. An adversarial attack shows that models trained on pun datasets are less robust. We also evaluate the sense of humor of the ChatGPT and Flan-UL2 models in a zero-shot scenario. The LLMs demonstrate competitive results on humor datasets and a more stable behavior on out-of-domain data. We believe that the obtained results will facilitate the development of new datasets and evaluation methodologies in the field of computational humor. We've made all the data from the study and the trained models publicly available.
Reinforcement Learning is used to solve problems in many subject areas (traffic control, behavior modelling, software testing, cybersecurity, etc.). In many real-world tasks a single agent has to deal with other agents (to coordinate or compete), and multi-agent systems (MAS) are used for such situations. High-dimensional RL-MAS environments cause the "curse of dimensionality" problem, and deep learning helps to address it efficiently. This presentation covers some examples of using RL and deep RL for multi-agent systems.
We will discuss why we decided to combine multimodal networks, unlabelled data, and a fresh perspective on the DICOM format into a single foundation model. We'll explore what this has brought us and why the future lies in this direction.
Alexey Trutnev,
Huawei RRI