EPSON: Embracing Parallel Networks and Storage for Predictable End-to-End Data Movement (NSF Funded) [PI: Michael Papka (NIU), Co-I: Venkatram Vishwanath (NIU)] - Geographically distributed scientific communities require increasingly sophisticated data transfer mechanisms that can handle the challenges of sharing large datasets over heterogeneous networks. These challenges include optimizing networks with different configurations and protocols, providing I/O mechanisms that read from and write to parallel storage efficiently, and meeting the varying demands of widely different data transfer workloads.

To address these challenges, the EPSON project is developing, implementing, and evaluating application programming interfaces and tools that facilitate end-to-end parallel data transfers. EPSON researchers focus on three areas: (1) enabling parallel network data movement, which means accounting for the diverse characteristics of both shared networks and infrastructures with dedicated circuits and paths, and balancing flows among paths for more predictable performance; (2) developing a GridFTP data storage interface that enables scalable I/O to and from parallel filesystems, which is critical for campus infrastructures that handle large-scale datasets; and (3) devising mechanisms that overlap network transfers with storage I/O and incorporate data-staging heuristics, matching the impedance between storage and networking capabilities to improve end-to-end data transfers. The project involves close collaboration with application scientists, with the objective of providing advanced networking tools that support the requirements of the applications deployed at the University of Chicago and Northern Illinois University campuses.
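The idea of overlapping storage I/O with network transfers through a staging buffer can be illustrated with a minimal sketch. This is not EPSON code or the GridFTP API; it is a generic producer-consumer pipeline, and every name in it (`read_chunks`, `send_chunks`, `transfer`, `CHUNK_SIZE`, `staging_slots`) is a hypothetical chosen for illustration. An in-memory byte string stands in for the parallel filesystem and a Python list stands in for the network endpoint. The bounded queue plays the role of the staging area: the reader stalls when the buffer is full and the sender stalls when it is empty, so neither side runs far ahead of the other.

```python
import queue
import threading

CHUNK_SIZE = 4   # bytes per chunk; deliberately tiny, for illustration only
SENTINEL = None  # marks the end of the stream


def read_chunks(source: bytes, staging: queue.Queue) -> None:
    """Stand-in for storage I/O: read fixed-size chunks into the staging buffer."""
    for offset in range(0, len(source), CHUNK_SIZE):
        # put() blocks while the staging buffer is full, throttling the reader.
        staging.put(source[offset:offset + CHUNK_SIZE])
    staging.put(SENTINEL)


def send_chunks(staging: queue.Queue, received: list) -> None:
    """Stand-in for the network side: drain the staging buffer in order."""
    while True:
        chunk = staging.get()  # blocks while the staging buffer is empty
        if chunk is SENTINEL:
            break
        received.append(chunk)


def transfer(source: bytes, staging_slots: int = 2) -> bytes:
    """Run the reader and the sender concurrently and return what was 'sent'."""
    staging = queue.Queue(maxsize=staging_slots)
    received = []
    reader = threading.Thread(target=read_chunks, args=(source, staging))
    sender = threading.Thread(target=send_chunks, args=(staging, received))
    reader.start()
    sender.start()
    reader.join()
    sender.join()
    return b"".join(received)
```

In this sketch, `staging_slots` is the only tuning knob: a larger buffer absorbs more variation between read and send rates at the cost of memory, which is the kind of trade-off a data-staging heuristic would have to weigh.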

Development of an Urban-Scale Instrument for Interdisciplinary Research (Array of Things) (NSF Funded) [PI: Charlie Catlett (Chicago), Co-I: Pete Beckman (Chicago), Kathleen Cagney (Chicago), Michael Papka (NIU), Daniel Work (Illinois)] - This Major Research Instrumentation (MRI) grant will enable the creation of a new tool, the Array of Things, for continuously measuring many aspects of the physical environment of urban areas at the city block scale. The collected data will fill gaps in understanding across many disciplines, such as how air pollution is controlled by urban form and traffic patterns; how different materials and surroundings affect the magnitude of the urban heat island; what correlations exist between weather, noise, pollution, and traffic on the one hand and social and behavioral trends on the other; and how urban infrastructure faults and failures can be better detected and predicted. The instrument also opens an opportunity to experiment with new sensors and data collection strategies, such as those that will be critical to understanding the urban microbiome or tracking disease outbreaks. Ultimately, the objective is to better understand not only elements of the built and natural infrastructure but also the interactions of infrastructure systems with people and the environment, and thereby to understand complex city dynamics.

The instrument will comprise 500 nodes deployed in the City of Chicago, each with power, Internet, and a base set of sensing and embedded information systems capabilities, with the capacity to evolve annually to incorporate new technologies and experiments from an already growing scientific partner community. The project will directly support scientific data infrastructure services for specific communities (engineering, physical, life, social, and climate sciences) with a data repository and tools to support workflows that enable combination with other urban data sources, analytics, and calibration and validation of computational models. In addition, the instrument's open data and application programming interface architectures will allow science and education communities to incorporate the data into their existing algorithms, tools, and methodologies, to merge the data with their community data resources, and to develop applications that directly access the instrument data in real time. The instrument will provide better spatial and temporal resolution of multiple phenomena simultaneously, and at larger scales, than what is available from other sensing modalities in urban infrastructure networks. The instrument will be the first instance of a general-purpose research infrastructure that allows researchers to rapidly deploy networks of sensors, embedded systems, computing, and communication systems at scale in urban environments.