BigData Express Design 

BigData Express typically runs in a large data center, such as the DOE Leadership Computing Facilities. As illustrated in Figure 1, such a site typically features a dedicated cluster of high-performance DTNs, an SDN-based BigData Express LAN, and a large-scale storage system.

  • The dedicated DTNs are deployed using the Science DMZ architecture. A high-performance DTN is typically a NUMA-based multicore system configured with multiple NICs. mdtmFTP (a high-performance data transfer tool) runs on each DTN.
  • The BigData Express LAN is a network slice dedicated to the DTNs for bulk data transfer. It consists of physical or virtualized SDN-enabled switches or routers and connects the DTNs to a gateway (GW), a router or switch that connects to external networks.
  • At Leadership Computing Facilities, the storage system is typically a center-wide shared parallel file system (e.g., the Lustre file system). All DTNs access this shared storage via a high-bandwidth, well-connected InfiniBand interconnect.

Figure 1 BigData Express architecture

A logically centralized BigData Express scheduler coordinates all activities at each BigData Express site. The scheduler manages and schedules local resources (DTNs, storage, and the BigData Express LAN) through agents (DTN agents, storage agents, and network agents). Each type of resource may require one or more agents, which gives the architecture flexibility and scalability. BigData Express schedulers at different sites collaborate to execute data transfer tasks.

The service interface authenticates, authorizes, and audits users and user applications, and allows them to access BigData Express services.

DTN agents collect and report DTN configuration and status. They also assign DTNs to data transfer tasks as requested by the BigData Express scheduler.

Network agents (i.e., AmoebaNet) keep track of the BigData Express LAN topology and traffic status with the aid of SDN controllers. As requested by the BigData Express scheduler, they program the network at run time to provide custom network services.

SDN controllers are open-source network operating systems, such as ONOS. AmoebaNet accesses the SDN controllers through northbound APIs (NB-APIs).

Storage agents keep track of the usage of local storage systems, provide information regarding storage resource availability to the scheduler, and execute storage assignments.

The resource broker implements a distributed rate-based resource brokering mechanism to coordinate resource allocation across autonomous sites.

The BigData Express scheduler implements a time-constraint-based scheduler to schedule resources for data transfer tasks. Each resource is estimated, calculated, and converted into a rate that can be apportioned to data transfer tasks. On an event-driven or periodic basis, the scheduler performs the following tasks:

  • Resource estimation and calculation. Estimating and calculating the local site resources that can be assigned to data transfer tasks.
  • Resource pre-allocation. Applying a time-constraint-based resource allocation mechanism to pre-allocate the local site resources, expressed as rates, to data transfer tasks.
  • Resource brokering. Launching the resource broker to coordinate rate pre-allocation across sites for a particular data transfer task and to determine the coordinated end-to-end data transfer rate for the task.
  • Resource assignment. Assigning the local site resources to data transfer tasks based on their coordinated end-to-end data transfer rates.
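
The four steps above can be illustrated with a minimal Python sketch. It is not the BigData Express implementation: the names (Task, estimate_site_rate, preallocate, broker) are hypothetical, and three simplifying assumptions are made for illustration only. The site rate is taken as the rate of the slowest local resource, an oversubscribed site scales every task's rate down proportionally, and the coordinated end-to-end rate is the smaller of the source and destination pre-allocations.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A data transfer task with a deadline (hypothetical structure)."""
    name: str
    remaining_bytes: float      # data still to be transferred
    seconds_to_deadline: float  # time left before the task's deadline

    def required_rate(self) -> float:
        """Rate (bytes/s) needed to finish exactly at the deadline."""
        return self.remaining_bytes / self.seconds_to_deadline


def estimate_site_rate(dtn_rate: float, storage_rate: float,
                       network_rate: float) -> float:
    """Resource estimation: assume the slowest resource bounds the site."""
    return min(dtn_rate, storage_rate, network_rate)


def preallocate(tasks: list[Task], site_rate: float) -> dict[str, float]:
    """Resource pre-allocation: give each task its required rate, or scale
    all rates down proportionally when the site is oversubscribed."""
    demand = {t.name: t.required_rate() for t in tasks}
    total = sum(demand.values())
    if total <= site_rate:
        return demand
    scale = site_rate / total
    return {name: rate * scale for name, rate in demand.items()}


def broker(source_rate: float, destination_rate: float) -> float:
    """Resource brokering: assume the end-to-end rate is limited by the
    smaller of the two sites' pre-allocated rates."""
    return min(source_rate, destination_rate)


# Example: two tasks compete for a source site limited by its storage system.
tasks = [Task("A", 8e12, 4000.0), Task("B", 4e12, 4000.0)]
site_rate = estimate_site_rate(dtn_rate=10e9, storage_rate=2.4e9,
                               network_rate=5e9)
source_alloc = preallocate(tasks, site_rate)
# Suppose the destination site pre-allocated 2e9 bytes/s to task A.
end_to_end_a = broker(source_alloc["A"], 2e9)
print(f"site rate: {site_rate:.3g} B/s, task A end-to-end: {end_to_end_a:.3g} B/s")
```

In this sketch, resource assignment would then reserve end_to_end_a on the local DTNs, storage, and LAN for task A. A real scheduler would also have to redistribute any rate a task cannot use because the remote site is the bottleneck.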

BigData Express requires an SDN-based on-demand site-to-site path service to provide paths between data source and destination sites, with guaranteed bandwidth and in designated time slots. ESnet provides the underlying on-demand path services required by BigData Express.
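
To make "guaranteed bandwidth in designated time slots" concrete, the following Python sketch shows one way an admission check for such a path request could work. It does not use or describe the ESnet service API: the Reservation type, the function names, and the single-capacity path model are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    """A bandwidth reservation on a path (hypothetical structure)."""
    start: int   # slot start in seconds (inclusive)
    end: int     # slot end in seconds (exclusive)
    gbps: float  # guaranteed bandwidth during the slot


def peak_reserved(reservations: list[Reservation], start: int, end: int) -> float:
    """Return the peak total reserved bandwidth within [start, end)."""
    events = []
    for r in reservations:
        lo, hi = max(r.start, start), min(r.end, end)
        if lo < hi:  # the reservation overlaps the window
            events.append((lo, r.gbps))
            events.append((hi, -r.gbps))
    # At equal times, releases (negative deltas) sort before new grabs,
    # so back-to-back reservations are not counted as overlapping.
    events.sort()
    peak = current = 0.0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def can_admit(reservations: list[Reservation], request: Reservation,
              capacity_gbps: float) -> bool:
    """Admit the request only if the path capacity is never exceeded
    at any point during the requested time slot."""
    used = peak_reserved(reservations, request.start, request.end)
    return used + request.gbps <= capacity_gbps


# Example: a 100 Gbps path with two existing reservations.
existing = [Reservation(0, 100, 40.0), Reservation(50, 150, 50.0)]
print(can_admit(existing, Reservation(60, 90, 20.0), 100.0))
print(can_admit(existing, Reservation(100, 150, 50.0), 100.0))
```

The first request overlaps both existing reservations and would push the path past its capacity, so it is rejected. The second falls in a window where only one reservation is active and fits exactly.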

  • Last modified: 10/09/2018