

Backdoor poisoning of encrypted traffic classifiers


Significant recent research has focused on applying deep neural network models to the problem of network traffic classification. At the same time, much has been written about the vulnerability of deep neural networks to adversarial inputs, both during training and inference. In this work, we consider launching backdoor poisoning attacks against an encrypted network traffic classifier. We consider attacks based on padding network packets, which has the benefit of preserving the functionality of the network traffic. In particular, we consider a handcrafted attack, as well as an optimized attack leveraging universal adversarial perturbations. We find that poisoning attacks can be extremely successful if the adversary has the ability to modify both the labels and the data (dirty label attacks) and somewhat successful, depending on the attack strength and the target class, if the adversary perturbs only the data (clean label attacks).




Predicting ankle moment trajectory with adaptive weighted ensemble of LSTM network

Published in:
2022 IEEE High Perf. Extreme Comp. Conf. (HPEC), 19-23 September 2022, DOI: 10.1109/HPEC55821.2022.9926370.


Estimations of ankle moments can provide clinically helpful information on the function of lower extremities and further lead to insight on patient rehabilitation and assistive wearable exoskeleton design. Current methods for estimating ankle moments leave room for improvement, with most recent cutting-edge methods relying on machine learning models trained on wearable sEMG and IMU data. While machine learning eliminates many practical challenges that troubled more traditional human body models for this application, we aim to expand on prior work that showed the feasibility of using LSTM models by employing an ensemble of LSTM networks. We present an adaptive weighted LSTM ensemble network and demonstrate its performance during standing, walking, running, and sprinting. Our results show that the LSTM ensemble outperformed every single LSTM model component within the ensemble. Across every activity, the ensemble reduced median root mean squared error (RMSE) by 0.0017–0.0053 N·m/kg, which is 2.7–10.3% lower than the best performing single LSTM model. Hypothesis testing revealed that most reductions in RMSE were statistically significant between the ensemble and other single models across all activities and subjects. Future work may analyze different trajectory lengths and different combinations of LSTM submodels within the ensemble.
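The abstract above does not say how the ensemble weights are adapted, so the following is only a minimal sketch of one plausible scheme: each submodel's weight is set inversely proportional to its recent RMSE, and the ensemble output is the weighted average of the submodel trajectories. The function names, the inverse-error rule, and the toy numbers are all assumptions made for illustration, not the authors' method or data.

```python
import math

def rmse(pred, target):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))

def adaptive_weights(recent_errors, eps=1e-8):
    """Assumed rule: weight each submodel by 1 / RMSE, then normalize to sum to 1."""
    inverse = [1.0 / (e + eps) for e in recent_errors]
    total = sum(inverse)
    return [w / total for w in inverse]

def ensemble_predict(submodel_preds, weights):
    """Weighted average of the submodel trajectories, one time step at a time."""
    horizon = len(submodel_preds[0])
    return [
        sum(w * pred[t] for w, pred in zip(weights, submodel_preds))
        for t in range(horizon)
    ]

# Toy ankle-moment trajectory (N·m/kg) and two hypothetical submodel outputs.
target = [0.10, 0.20, 0.30, 0.40]
preds = [
    [0.12, 0.22, 0.32, 0.42],  # submodel A, biased slightly high
    [0.05, 0.15, 0.25, 0.35],  # submodel B, biased low
]

# For brevity the errors are measured on the same window that is combined;
# a real system would take them from a preceding or held-out window.
errors = [rmse(p, target) for p in preds]
weights = adaptive_weights(errors)
combined = ensemble_predict(preds, weights)
```

In this toy case the two submodels err in opposite directions, so the weighted average lands closer to the target than either submodel alone, which is the kind of gain an ensemble is meant to capture.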




Multimodal physiological monitoring during virtual reality piloting tasks


This dataset includes multimodal physiologic, flight performance, and user interaction data streams, collected as participants performed virtual flight tasks of varying difficulty. In virtual reality, individuals flew an "Instrument Landing System" (ILS) protocol, in which they had to land an aircraft mostly relying on the cockpit instrument readings. Participants were presented with four levels of difficulty, which were generated by varying wind speed, turbulence, and visibility. Each of the participants performed 12 runs, split into 3 blocks of four consecutive runs, one run at each difficulty, in a single experimental session. The sequence of difficulty levels was presented in a counterbalanced manner across blocks. Flight performance was quantified as a function of horizontal and vertical deviation from an ideal path towards the runway as well as deviation from the prescribed ideal speed of 115 knots. Multimodal physiological signals were aggregated and synchronized using Lab Streaming Layer. Descriptions of data quality are provided to assess each data stream. The starter code provides examples of loading and plotting the time synchronized data streams, extracting sample features from the eye tracking data, and building models to predict pilot performance from the physiology data streams.
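The description above states that flight performance is a function of horizontal and vertical deviation from the ideal path and of deviation from the ideal speed of 115 knots, but it does not give the exact function. As a rough sketch only, the snippet below summarizes each deviation stream by its root-mean-square value. The function names, the choice of RMS, and the sample values are assumptions, and the dataset's starter code may compute performance differently.

```python
import math

IDEAL_SPEED_KNOTS = 115.0  # prescribed ideal speed from the dataset description

def rms(values):
    """Root-mean-square of a sequence of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def flight_deviation_summary(horizontal_dev, vertical_dev, speed_knots):
    """Summarize one run by the RMS of each deviation stream (assumed metric)."""
    speed_dev = [s - IDEAL_SPEED_KNOTS for s in speed_knots]
    return {
        "horizontal_rms": rms(horizontal_dev),
        "vertical_rms": rms(vertical_dev),
        "speed_rms_knots": rms(speed_dev),
    }

# Hypothetical samples from a single run; the units of the path deviations
# are whatever the dataset uses and are not assumed here.
summary = flight_deviation_summary(
    horizontal_dev=[3.0, -3.0, 3.0, -3.0],
    vertical_dev=[4.0, -4.0, 4.0, -4.0],
    speed_knots=[115.0, 117.0, 113.0, 115.0],
)
```

Keeping the three components separate, rather than collapsing them into one score, leaves the weighting between path and speed errors as an explicit modeling decision.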




Self-supervised contrastive pre-training for time series via time-frequency consistency

Published in:
arXiv, June 16, 2022.


Pre-training on time series poses a unique challenge due to the potential mismatch between pre-training and target domains, such as shifts in temporal dynamics, fast-evolving trends, and long-range and short cyclic effects, which can lead to poor downstream performance. While domain adaptation methods can mitigate these shifts, most methods need examples directly from the target domain, making them suboptimal for pre-training. To address this challenge, methods need to accommodate target domains with different temporal dynamics and be capable of doing so without seeing any target examples during pre-training. Relative to other modalities, in time series, we expect that time-based and frequency-based representations of the same example are located close together in the time-frequency space. To this end, we posit that time-frequency consistency (TF-C), embedding a time-based neighborhood of a particular example close to its frequency-based neighborhood and back, is desirable for pre-training. Motivated by TF-C, we define a decomposable pre-training model, where the self-supervised signal is provided by the distance between time and frequency components, each individually trained by contrastive estimation. We evaluate the new method on eight datasets, including electrodiagnostic testing, human activity recognition, mechanical fault detection, and physical status monitoring. Experiments against eight state-of-the-art methods show that TF-C outperforms baselines by 15.4% (F1 score) on average in one-to-one settings (e.g., fine-tuning an EEG-pretrained model on EMG data) and by up to 8.4% (F1 score) in challenging one-to-many settings (e.g., fine-tuning an EEG-pretrained model for either hand-gesture recognition or mechanical fault prediction), reflecting the breadth of scenarios that arise in real-world applications. The source code and datasets are available at
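To make the idea of time-frequency consistency concrete, here is a toy sketch and not the TF-C model itself: it builds a time view and a frequency view of the same signal, embeds each with a hand-written encoder, and measures the distance between the two embeddings. The encoders are hypothetical and deliberately chosen so that the two embeddings coincide (via the DC bin and Parseval's theorem). In the paper's setting the encoders are learned networks trained with contrastive objectives, which this sketch does not attempt to reproduce.

```python
import cmath
import math

def dft(signal):
    """Plain discrete Fourier transform, giving the frequency view of a signal."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def time_embedding(signal):
    """Hypothetical time encoder: (mean, mean power) of the raw samples."""
    n = len(signal)
    return (sum(signal) / n, sum(x * x for x in signal) / n)

def frequency_embedding(spectrum):
    """Hypothetical frequency encoder that recovers the same two quantities."""
    n = len(spectrum)
    mean = spectrum[0].real / n  # the DC bin equals the sum of the samples
    power = sum(abs(c) ** 2 for c in spectrum) / (n * n)  # Parseval's theorem
    return (mean, power)

def consistency_distance(z_time, z_freq):
    """Distance between a time embedding and a frequency embedding."""
    return math.dist(z_time, z_freq)

signal_a = [1.0, 2.0, 3.0, 4.0]
signal_b = [0.0, 1.0, 0.0, -1.0]

# Same example, two views: the embeddings should sit (almost) on top of each other.
same_example = consistency_distance(
    time_embedding(signal_a), frequency_embedding(dft(signal_a))
)
# Different examples: the embeddings should be far apart.
different_examples = consistency_distance(
    time_embedding(signal_a), frequency_embedding(dft(signal_b))
)
```

A pre-training objective in this spirit would push the first distance down while keeping the second one large; how TF-C balances those terms is defined in the paper, not here.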




Fun as a strategic advantage: applying lessons in engagement from commercial games to military logistics training


Digital games offer many elements to augment traditional classroom lectures and reading assignments. They enable players to explore concepts through repeat play in a low-risk environment, and allow players to integrate feedback given during gameplay and evaluate their own performance. Commercial games leverage a number of features to engage players and hold their attention. But do those engagement-improving methods have a place in instructional environments with a captive and motivated audience? Our experience building a logistics supply chain training game for the Marine Corps University suggests that the answer is yes: applying lessons in engagement from commercial games can both improve player experience with the learning environment and potentially improve learning outcomes.




Graph-guided network for irregularly sampled multivariate time series

Published in:
International Conference on Learning Representations, ICLR 2022.


In many domains, including healthcare, biology, and climate science, time series are irregularly sampled with varying time intervals between successive readouts and different subsets of variables (sensors) observed at different time points. Here, we introduce RAINDROP, a graph neural network that embeds irregularly sampled and multivariate time series while also learning the dynamics of sensors purely from observational data. RAINDROP represents every sample as a separate sensor graph and models time-varying dependencies between sensors with a novel message passing operator. It estimates the latent sensor graph structure and leverages the structure together with nearby observations to predict misaligned readouts. This model can be interpreted as a graph neural network that sends messages over graphs that are optimized for capturing time-varying dependencies among sensors. We use RAINDROP to classify time series and interpret temporal dynamics on three healthcare and human activity datasets. RAINDROP outperforms state-of-the-art methods by up to 11.4% (absolute F1-score points), including techniques that deal with irregular sampling using fixed discretization and set functions. RAINDROP shows superiority in diverse setups, including challenging leave-sensor-out settings.




Probabilistic coordination of heterogeneous teams from capability temporal logic specifications


This letter explores coordination of heterogeneous teams of agents from high-level specifications. We employ Capability Temporal Logic (CaTL) to express rich, temporal-spatial tasks that require cooperation between many agents with unique capabilities. CaTL specifies combinations of tasks, each with desired locations, duration, and set of capabilities, freeing the user from considering specific agent trajectories and their impact on multi-agent cooperation. CaTL also provides a quantitative robustness metric of satisfaction based on availability of required capabilities for each task. The novelty of this letter focuses on satisfaction of CaTL formulas under probabilistic conditions. Specifically, we consider uncertainties in robot motion (e.g., agents may fail to transition between regions with some probability) and local probabilistic workspace properties (e.g., if there are not enough agents of a required capability to complete a collaborative task). The proposed approach automatically formulates a mixed-integer linear program given agents, their dynamics and capabilities, an abstraction of the workspace, and a CaTL formula. In addition to satisfying the given CaTL formula, the optimization considers the following secondary goals (in decreasing order of priority): 1) minimize the risk of transition failure due to uncertainties; 2) maximize probabilities of regional collaborative satisfaction (if there is an excess of agents); 3) maximize the availability robustness of CaTL for potential agent attrition; 4) minimize the total agent travel time. We evaluate the performance of the proposed framework and demonstrate its scalability via numerical simulations.




Fast decomposition of temporal logic specifications for heterogeneous teams

Published in:
IEEE Robot. Autom. Lett., Vol. 7, No. 2, April 2022, pp. 2297-2304.


We focus on decomposing large multi-agent path planning problems with global temporal logic goals (common to all agents) into smaller sub-problems that can be solved and executed independently. Crucially, the sub-problems' solutions must jointly satisfy the common global mission specification. The agents' missions are given as Capability Temporal Logic (CaTL) formulas, a fragment of Signal Temporal Logic (STL) that can express properties over tasks involving multiple agent capabilities (i.e., different combinations of sensors, effectors, and dynamics) under strict timing constraints. We jointly decompose both the temporal logic specification and the team of agents, using a satisfiability modulo theories (SMT) approach and heuristics for handling temporal operators. The output of the SMT is then distributed to subteams and leads to a significant speed up in planning time compared to planning for the entire team and specification. We include computational results to evaluate the efficiency of our solution, as well as the trade-offs introduced by the conservative nature of the SMT encoding and heuristics.




Tools and practices for responsible AI engineering


Responsible Artificial Intelligence (AI)—the practice of developing, evaluating, and maintaining accurate AI systems that also exhibit essential properties such as robustness and explainability—represents a multifaceted challenge that often stretches standard machine learning tooling, frameworks, and testing methods beyond their limits. In this paper, we present two new software libraries—hydra-zen and the rAI-toolbox—that address critical needs for responsible AI engineering. hydra-zen dramatically simplifies the process of making complex AI applications configurable, and their behaviors reproducible. The rAI-toolbox is designed to enable methods for evaluating and enhancing the robustness of AI-models in a way that is scalable and that composes naturally with other popular ML frameworks. We describe the design principles and methodologies that make these tools effective, including the use of property-based testing to bolster the reliability of the tools themselves. Finally, we demonstrate the composability and flexibility of the tools by showing how various use cases from adversarial robustness and explainable AI can be concisely implemented with familiar APIs.




Selective network discovery via deep reinforcement learning on embedded spaces

Published in:
Appl. Netw. Sci., Vol. 6, No. 1, December 2021, Art. No. 24.


Complex networks are often either too large for full exploration, partially accessible, or partially observed. Downstream learning tasks on these incomplete networks can produce low quality results. In addition, reducing the incompleteness of the network can be costly and nontrivial. As a result, network discovery algorithms optimized for specific downstream learning tasks given resource collection constraints are of great interest. In this paper, we formulate the task-specific network discovery problem as a sequential decision-making problem. Our downstream task is selective harvesting, the optimal collection of vertices with a particular attribute. We propose a framework, called network actor critic (NAC), which learns a policy and notion of future reward in an offline setting via a deep reinforcement learning algorithm. The NAC paradigm utilizes a task-specific network embedding to reduce the state space complexity. A detailed comparative analysis of popular network embeddings is presented with respect to their role in supporting offline planning. Furthermore, a quantitative study is presented on various synthetic and real benchmarks using NAC and several baselines. We show that offline models of reward and network discovery policies lead to significantly improved performance when compared to competitive online discovery algorithms. Finally, we outline learning regimes where planning is critical in addressing sparse and changing reward signals.

