CRII: OAC: Enabling Quantities-of-Interest Error Control for Trust-Driven Lossy Compression
PI: Xin Liang
Time period: 06/01/2022 - 05/31/2024
Scientific simulations and instruments are producing data at volumes and velocities that overwhelm network and storage systems. Although error-controlled lossy compressors have been employed to mitigate these problems, many scientists remain reluctant to adopt them because these compressors provide no guarantee on the accuracy of downstream analysis results derived from raw data. This project aims to fill this gap by developing a trust-driven lossy data compression infrastructure that strictly controls the errors in downstream analysis, both theoretically and practically, to facilitate the use of data reduction in scientific applications. Success of this project will promote the progress of science in multiple disciplines via effective data reduction and contribute to resolving important societal problems in areas including electricity generation, weather forecasting, material design, and transportation. Moreover, this project will contribute to the growth and development of future generations of scientists and engineers through educational and engagement activities, including development of new curricula and recruitment of K-12 students.
Existing lossy compression techniques either overlook error quantification or provide error control only for raw data, leaving uncertainties in the outcome of downstream quantities of interest (QoIs) computed from the raw data. This greatly concerns many computational scientists who wish to reduce their data while preserving necessary information, and it prevents them from adopting lossy compression in their applications. This research will address these problems through an integration of theory and implementation via three tasks. First, a novel theory enabling error control on downstream QoIs will be developed. This will fundamentally address the trustworthiness issues of existing error-controlled lossy compressors that provide error control only on raw data. Second, an optimization method ensuring tight error control will be applied based on rigorous analysis, to achieve higher compression ratios under the same requirements. Third, a scalable infrastructure will be built through careful integration with advanced compression frameworks and tailored parallelization based on target QoIs, in order to take full advantage of existing compression algorithms and the computational patterns in the target QoIs. The project will enable application scientists to store the most valuable information in their data based on their unique needs, creating opportunities for novel findings in multiple scientific disciplines including climatology, cosmology, and seismology.
Machine Learning for Secure and Resilient Information Management in Combat Cloud
Sole PI: Sanjay Madria
Time period: 01/2021 - 05/2022
In a battlefield zone, Command-and-Control (C2) capabilities can be improved using the Combat Cloud paradigm, which can enable network-aware, disruption-tolerant information flows and provide distributed data stores and efficient data dissemination across several groups of forces deployed with different mission goals. To continue battlefield missions appropriately and gain a better understanding of the situation, the forces as well as C2 need to collect timely information generated in the war zone. However, because of damaged or degraded network infrastructure, or the unavailability of connectivity to information servers, especially in hostile areas, forwarding information in these extreme situations is a challenge. In this dynamic environment, any sudden important event-related information or possible mission updates (as the mission evolves) should also be sent to C2 with the help of intermediate nodes, regardless of their mission interests. We propose to learn the mission interests dynamically, and to optimally store and forward the information generated by the nodes to C2 as the mission evolves, using Reinforcement Learning (RL). In this forwarding process, we will focus on identifying the trending mission interests (related to updated missions or events) for continuous decision-making by considering the mobility and connectivity of the nodes and the changes in the perceived network model, and by modifying the ‘reward’ functions accordingly based on past learning. Machine learning will help the system learn and adapt to the space-time evolution of data requests as well as to the local policies used at nodes in determining the currently cached objects, what should be prefetched next, and how many objects need to be prefetched based on the determined mission priorities, expected latency, etc.
These features, in practice, usually exhibit unknown and temporal dynamics because the most popular content at the current epoch may not receive the highest attention in the future, and mobile users may change locations as time passes. The combat cloud in contested environments faces challenges in making prioritized data available to different groups in a timely and secure fashion (authenticated, untampered, and trusted). For secure, end-to-end, mission-oriented data dissemination, it needs a resilient and secure information processing layer for Information Exchange Requirements (IERs) (tasks, operational elements, and information flow). Security and resiliency require machine learning-based methods for targeted content dissemination and proactive dissemination/caching (TA1), and for dynamic mission-oriented data discovery (TA2). Secure information processing in the combat cloud needs efficient and dynamic fine-grained attribute-based key distribution, verification, and revocation for group-based coordination (TA3) in a collaborative Delayed/Disconnected, Intermittently-Connected, Low-Bandwidth (DIL) environment. We will design algorithms and develop a system prototype in a DIL environment to validate the Combat Cloud design discussed here.
CNS Core: Small: FLINT: Robust Federated Learning for Internet of Things
PI: Tony Luo
Co-PI: Sajal Das
Time period: 10/1/2020 - 09/30/2023
Federated learning enables machine learning on distributed datasets without requiring the learner to directly access the datasets owned by the respective stakeholders. The Internet of Things (IoT) provides a fertile ground for applying federated learning, where distributed IoT devices produce a plethora of data that are often private. However, IoT devices are vulnerable to environments with inaccurate data samples and malicious attacks, which poses a significant challenge for federated learning. Aggregating data in a federated and robust manner may produce benefits to the economy and society.
Objectives of the Robust Federated Learning for Internet of Things (FLINT) project include: (1) Formulate federated learning (FL) in heterogeneous, dynamic IoT environments with unreliable and adversarial clients. (2) Design new FL algorithms that are robust against hostile conditions with benign, unreliable, and malicious clients injecting erroneous or poisonous data. (3) Design novel incentive mechanisms to ensure rational clients gain non-negative utility by contributing training data and resources. (4) Analyze complexity, performance, and theoretical bounds of proposed algorithms. (5) Build an IoT testbed to study the learning performance of robust FL solutions. (6) Conduct simulation experiments on real-world datasets to evaluate performance scalability.
The FLINT project will offer graduate and undergraduate students a unique opportunity to gain interdisciplinary education in the design of robust FL algorithms for IoT. Research findings will enrich courses on cyber-physical systems and machine learning for IoT.
Secure Monitoring and Reconnaissance for CBRN and other Threats
Sole PI: Sanjay Madria
Time period: 10/2021 - 08/2023
The focus of this proposal is to enhance chemical, biological, radiological, and nuclear (CBRN) defense reconnaissance capability to detect and track CBRN hazards on the move without transmitting GPS data, which increases the survivability of CBRN reconnaissance forces. It also integrates machine learning into CBRN reconnaissance platforms, increasing the quality of service (QoS), privacy, and capability to conduct continuous monitoring and reporting in real time. The objective of this proposal is to design, develop, and demonstrate a trajectory-based data access control protocol for GPS-free information detection and dissemination to enable real-time target tracking, CBRN hazard marking, and real-time remote control. As the battlefield environment requires a minimal radio footprint and maximum local anonymity, we aim to eliminate location information from all radio communication while still allowing different groups of forces to share the trajectory of CBRN intelligence.
SANDY: Sparsification-Based Approach for Analyzing Network Dynamics
PI: Sajal K. Das
Award amount: $163,067
Award date: 9/1/17 to 8/31/20
The goal of this three-year project, Sparsification-based Approach for Analyzing Network Dynamics (SANDY), is to develop a suite of scalable parallel algorithms for updating dynamic networks for different problems that can be executed on a wide range of HPC platforms. Dynamic network analysis will enable researchers to study the evolution of complex systems in diverse disciplines, such as bioinformatics, social sciences, and epidemiology. The SANDY project is expected to initiate a new direction of research in developing parallel dynamic network algorithms that will benefit multiple analysis objectives (e.g., motif finding and network alignment) and application domains (e.g., epidemiology, health care). [Sept 24, 2018]
NeTS: JUNO2: Collaborative Research: STEAM: Secure and Trustworthy Framework for Integrated Energy and Mobility in Smart Connected Communities
PI: Sajal K. Das
Award amount: $91,090
Award date: 9/1/18 to 8/31/19
The rapid evolution of data-driven analytics, the Internet of Things (IoT), and cyber-physical systems (CPS) is fueling a growing set of Smart and Connected Communities (SCC) applications, including smart transportation and smart energy. However, the deployment of such technological solutions without proper security mechanisms makes them susceptible to data integrity and privacy attacks, as observed in a large number of recent incidents. If not addressed properly, such attacks will not only cripple SCC operations but also influence the extent to which customers are willing to share data. This in turn will make trustworthiness in SCC applications very challenging. To address this, a synergistic team of researchers from the US and Japan, under the JUNO2 program, will collaborate on this project, called STEAM (Secure and Trustworthy framework for integrated Energy and Mobility), to develop a framework that ensures data privacy, data integrity, and trustworthiness in smart and connected communities. [Sept 24, 2018]
CPS: TTP Option: Medium: Collaborative Research: Trusted CPS from Untrusted Components
PI: Bruce McMillin
Co-PIs: Rui Bo and Jonathan Kimball
Time period: 10/1/2018 - 09/30/2021
The nation's critical infrastructures are increasingly dependent on systems that use computers to control vital physical components, including water supplies, the electric grid, airline systems, and medical devices. These are all examples of Cyber-Physical Systems (CPS) that are vulnerable to attack through their computer systems, through their physical properties such as power flow, water flow, and chemistry, or through both. The potential consequences of such compromised systems include financial disaster, civil disorder, and even loss of life. The proposed work significantly advances the science of protecting CPS by ensuring that the systems "do what they are supposed to do" despite an attacker trying to make them fail or do harm. In this convergent approach, the key is to tell the CPS how it is supposed to behave and to build in defenses that make sure each component behaves correctly and works well with others. The proposed work has a clear transition to industrial practice. It will also enhance education and opportunity by presenting the task of securing society as a fascinating discipline for K-12 students to follow.
GAANN: A Doctoral Program in Big Data, Machine Learning, and Analytics for Security and Safety
This GAANN research program aims to contribute toward the national need to address Big Data, Machine Learning, and Analytics for Cyber Security and Safety in terms of research, education, and training. Big data is transforming science and engineering with applications in cyber security, infrastructure monitoring, and ultimately society itself. Global technological leadership in this area is important for the United States and can be sustained by educating future leaders. With new paradigms and technologies, big data and machine learning research continues to produce innovative outcomes in both industry and academia. We recognize that one of the most effective means for designing, developing, and deploying big data systems and analytical solutions for cyber security is to develop technical solutions and applications relatively quickly while making them accessible to local and state entities as well as to the government in a timely manner. [Dec 17, 2018]