NSF RINGS: Federated Learning Research at WNCG


Published: October 30, 2024

The principles of Machine Learning (ML) power countless tools we use in our daily lives. ML methods for interpreting and learning from collected data are being applied to everything from collaborative medical diagnosis and cancer detection to financial crime prevention. As engineers and researchers continue to seek out ways to refine ML models, data privacy concerns become increasingly relevant. Federated Learning (FL) offers a potential solution through its approach to model training.

With FL, access to sensitive data is decentralized; each client maintains its own data rather than directly sharing it with any central server. The ML model is constructed iteratively: the server broadcasts its current model to participating clients, who evaluate it on their local data and determine how it might be improved. The clients then send their updates to the server, which aggregates the feedback to refine the current model.
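To make the round structure concrete, here is a minimal sketch of one such training loop in NumPy. Everything in it is illustrative: the linear least-squares model, the synthetic client data, and names like `local_update` are assumptions made for the example, not the researchers' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a linear least-squares model and synthetic client
# datasets stand in for whatever model and private data a real FL
# deployment would use.
num_clients, dim, lr = 10, 5, 0.1
client_data = [(rng.normal(size=(20, dim)), rng.normal(size=20))
               for _ in range(num_clients)]

def local_update(w, X, y, lr):
    """One gradient step on a client's private (X, y); only the model
    delta leaves the device, never the raw data."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return -lr * grad

w = np.zeros(dim)  # the server's current global model
for _ in range(100):
    # The server broadcasts w; each client computes an update locally;
    # the server then averages the updates to refine the model.
    updates = [local_update(w, X, y, lr) for X, y in client_data]
    w += np.mean(updates, axis=0)
```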

FL enables a potentially large number of clients to collaborate in learning how to perform a task while preserving their privacy. However, depending on the size of the model, clients may be sending sizeable files to the server, from tens to hundreds of megabytes, so FL can require massive amounts of data to be exchanged. Other factors, such as client availability, network connectivity, and congestion, pose major challenges to realizing FL-based systems in practice.

WNCG professors Gustavo de Veciana, Haris Vikalo, and their research teams have been working on improving network support for FL models by addressing these issues in several key ways:

  1. Client sampling: limiting congestion by judiciously sampling among clients with intermittent availability to participate in each round of updates, while ensuring that the learning process converges steadily despite that variability (a sketch combining this with item 2 follows the list).
  2. Lossy compression: exploring ways to leverage adaptive lossy compression for the updates clients send to the server. This reduces traffic and keeps the time to model convergence low without derailing the learning process.
  3. Data aggregation networks (DANs): using overlay networks to efficiently aggregate updates from a large number of clients in a fault-tolerant manner, substantially reducing traffic to the server. This type of solution is complementary to the content delivery networks (CDNs) used to disseminate information on the Internet.
  4. Model compression for FL systems: introducing mixed-precision quantization to resource-heterogeneous FL systems, allowing the FL framework to be deployed in resource-constrained settings.
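As a rough illustration of items 1 and 2 above, the sketch below adds intermittent client availability and a uniform quantizer to the aggregation step. The coin-flip availability model, the bit width, and the helper names (`quantize`, `sample_clients`) are hypothetical choices made for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

def quantize(v, num_bits=4):
    """Uniform lossy quantizer: snaps each coordinate of v to one of
    2**num_bits levels spanning [v.min(), v.max()], so fewer bits per
    coordinate cross the network at the cost of some added error."""
    lo, hi = v.min(), v.max()
    if hi == lo:                      # constant vector: nothing to do
        return v
    step = (hi - lo) / (2 ** num_bits - 1)
    return lo + np.round((v - lo) / step) * step

def sample_clients(num_clients, p_available=0.5, per_round=4):
    """Sample a round's participants only from clients that happen to
    be reachable; availability is an independent coin flip here."""
    available = np.flatnonzero(rng.random(num_clients) < p_available)
    if available.size == 0:
        return available              # nobody reachable; skip the round
    return rng.choice(available, size=min(per_round, available.size),
                      replace=False)

# One illustrative round: random vectors stand in for real client updates.
num_clients, dim = 10, 5
updates = {i: rng.normal(size=dim) for i in range(num_clients)}
chosen = sample_clients(num_clients)
if chosen.size:
    aggregate = np.mean([quantize(updates[i]) for i in chosen], axis=0)
```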

The researchers are also exploring the training of personalized models with FL. By grouping together clients with similar characteristics, they can tailor models to the needs of these groups while maintaining broader collaboration and reducing the amount of data processed by the server. This balance allows for more effective model training, as it captures individual user preferences without sacrificing the benefits of collective data insights.

“Clients may have access to different resources, meaning their devices might be of different capabilities, and they may be dealing with data skewed one way or another. To enable personalization when both the data distribution and hardware capabilities vary from one client to another, we've been pursuing a self-supervised learning platform that lets clients customize the model to their needs while still allowing for collaboration. Everyone collaborates to learn how to represent data but the part of the model used in final decision-making is trained and shared only among the clients that face similar data distribution – this is a step the system takes towards personalization,” Vikalo states.
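The division of labor Vikalo describes can be sketched as follows, assuming clients have already been grouped by data similarity: the encoder (the representation everyone collaborates on) is averaged across all clients, while each decision head is averaged only within its cluster. The shapes, cluster assignments, and names here are hypothetical, not the group's code.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical split: each client's model is an encoder (representation)
# plus a decision head, with clusters of similar clients assumed given.
num_clients, enc_dim, head_dim = 6, 8, 3
clusters = {0: [0, 1, 2], 1: [3, 4, 5]}

encoders = [rng.normal(size=enc_dim) for _ in range(num_clients)]
heads = [rng.normal(size=head_dim) for _ in range(num_clients)]

# Everyone collaborates on how to represent data: average all encoders.
global_encoder = np.mean(encoders, axis=0)

# The decision-making part is shared only among clients with similar
# data: average heads within each cluster.
cluster_heads = {c: np.mean([heads[i] for i in members], axis=0)
                 for c, members in clusters.items()}

# Redistribute: the shared encoder goes to every client, the cluster
# head only to that cluster's members.
for c, members in clusters.items():
    for i in members:
        encoders[i] = global_encoder.copy()
        heads[i] = cluster_heads[c].copy()
```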

The core work by Drs. de Veciana and Vikalo in this area has been collaborative, involving, among others, WNCG alumna Monica Ribero of Google Research, PhD student Parikshit Hegde, and fellow UT Austin professors Aryan Mokhtari and Hyeji Kim. These collaborations and contributions have been instrumental to the progress of the project.

The research has been funded by the National Science Foundation (NSF) under Grant No. 2148224 and is supported by funds from OUSD R&E, NIST, and industry partners, as specified in the Resilient & Intelligent NextG Systems (RINGS) program, as well as by the 6G@UT Industrial Affiliates Program.

The work has been very productive so far and is ongoing. “I think that we're coming to a much better overall understanding of the approaches that can be used to address practical challenges and understanding how to address those challenges at scale,” de Veciana says.

 

Publications 

Federated Learning at Scale: Addressing Client Intermittency and Resource Constraints. M. Ribero, H. Vikalo, and G. de Veciana. IEEE Journal of Selected Topics in Signal Processing. To appear.

Optimal Aggregation via Overlay Trees: Delay-MSE Tradeoffs under Failures. P. Hegde and G. de Veciana. In Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS), December 2024. To appear.

Heterogeneity-Guided Client Sampling: Towards Fast and Efficient Non-IID Federated Learning. H. Chen and H. Vikalo. In Advances in Neural Information Processing Systems (NeurIPS), December 2024. To appear.

Federated learning under intermittent client availability and time-varying capacity constraints. M. Ribero, H. Vikalo, and G. de Veciana. IEEE Journal of Selected Topics in Signal Processing, 17(1):98-111, January 2023.

Mixed-precision quantization for federated learning on resource-constrained heterogeneous devices. H. Chen and H. Vikalo. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024, pp. 6138-6148.

Reducing communication in federated learning via efficient client sampling. M. Ribero and H. Vikalo. Pattern Recognition, 148:110122, April 2024.

Clustered federated learning via gradient partitioning. Heasung Kim, Hyeji Kim, and G. de Veciana. In Proceedings of the International Conference on Machine Learning (ICML), pages 1-11, July 2024.

Recovering labels from local updates in federated learning. H. Chen and H. Vikalo. In Proceedings of the International Conference on Machine Learning (ICML), July 2024.

Fed-QSSL: A framework for personalized federated learning under bitwidth and data heterogeneity. Y. Chen, C. Wang, and H. Vikalo. In Proceedings of the 38th AAAI Conference on Artificial Intelligence (AAAI), Vancouver, BC, Canada, February 20-27, 2024, pp. 11443-11452.

Mohawk: Mobility and heterogeneity-aware dynamic community selection for hierarchical federated learning. A.-J. Farcas, M. Lee, R. Kompella, H. Latapie, G. de Veciana, and R. Marculescu. In Proc. 8th ACM/IEEE Conference on Internet of Things Design and Implementation, pages 1-12, May 2023.

Network adaptive federated learning: Congestion and lossy compression. P. Hegde, G. de Veciana, and A. Mokhtari. In Proc. IEEE INFOCOM, pages 1-10, May 2023.

News category: Research