Difference between revisions of "Enterprise Hyperscale Lab"

From CDOT Wiki
(Pidora Koji System)
(Pidora Koji System)
(26 intermediate revisions by 2 users not shown)
Line 1: Line 1:
[[Category:Enterprise Hyperscale Lab]]
+
[[Category:Enterprise Hyperscale Lab]][[Image:EHL.jpg|thumb|200px|right|The EHL in January 2015.]]
 
The Enterprise Hyperscale Lab is operated by the [[OSTEP]] team to perform research on open source technologies and emerging hyperscale SOC-based systems.
 
  
 
== Equipment Overview ==
 
  
The EHL consists of a dual-thermal-zone (cold/hot) rackmount cabinet with power conditioning and backup, power distribution, thermal monitoring, and 1- and 10-gigabit network services. This cabinet supports a number of hyperscale and SOC-based computers.
+
The EHL consists of two dual-thermal-zone (cold/hot) rackmount cabinets with power conditioning and backup, power distribution, thermal monitoring, and 1- and 10-gigabit network services. These cabinets support a large number of hyperscale and SOC-based ARM computers for various applied research projects.
  
For details on the equipment in the EHL, see the [[:Category:Enterprise Hyperscale Lab|Enterprise Hyperscale Lab category page]].
+
The first EHL cabinet was installed in Summer 2014, and the second cabinet was installed in Summer 2015.
 +
 +
== Equipment Detail ==
 +
 
 +
=== Cabinet ===
 +
 
 +
Each of the two EHL cabinets is a [http://www.silentium.com/?page_id=33 Silentium Acoustirack Active], a full-height, acoustically insulated rackmount cabinet with two fan units. The lower fan unit takes air from outside the cabinet and blows it up the front of the rackmount equipment area (cold zone). Air passes through the individual devices and is vented out the back of each unit into the hot zone. A second fan unit exhausts air from the hot zone out the top of the cabinet.
 +
 
 +
Each of the fan units includes active noise cancellation so that the loaded rack can be operated in a software development lab context.
 +
 
 +
After the fan units are installed, there are [http://en.wikipedia.org/wiki/Rack_unit 33u] available for other devices.
 +
 
 +
=== Power ===
 +
 
 +
Power to each cabinet is provided by two 115 volt, 15 amp, 60 Hz circuits that power two independent APC 1.5 kW rackmount uninterruptible power supplies (UPSes). Each UPS has a network interface for remote monitoring and control.
 +
 
 +
The UPSes feed two Raritan Dominion power distribution units (PDUs), mounted vertically up the sides of the cabinet at the back, plus a third 1u horizontal unit. The PDUs act like long power bars, but have control and monitoring systems so that the per-outlet current consumption can be measured and remotely monitored over the network, using the SNMP or HTTP protocols. Each outlet can also be switched on or off under network control.
 +
 
 +
Where possible, devices have been configured with dual power supplies. This means that many of the devices in the EHL are plugged into two PDUs and can continue to operate when power is cut to one of the two supplies. This permits equipment to be rewired without any downtime; it also guards against downtime due to PDU, UPS, or PSU failure.
 +
 
 +
Devices which are not configured with dual PSUs are connected to just one of the PDUs.
 +
 
 +
The total draw of the EHL equipment installed in the first cabinet as of January 2015 was approximately 1.8 kW under load. The current power system can support a little over 3 kW; the cabinet supports thermal exchange of about 8 kW.
 +
 
 +
=== Environmental Monitoring ===
 +
 
 +
One of the PDUs in each cabinet is equipped with a string of three environmental sensors. These are laid out diagonally across the cabinet, so that the air intake temperature, mid-cabinet hot zone temperature, and air exit temperature are monitored, as well as the humidity.
 +
 
 +
In January 2015, typical intake temperatures on the EHL were 25-27 °C and exhaust temperatures were 36-37 °C.
 +
 
 +
=== Networking ===
 +
 
 +
At least one Cisco 24-port gigabit switch and one Netgear 24-port 10-gigabit switch are installed in the back of each EHL cabinet. The 10-gigabit switches provide both 10GBASE-T and SFP+ connections. SFP+/DA (Direct Attach) copper cables and LC fibre optic cables are used where possible because they offer much lower latency than 10GBASE-T connections (on the order of 0.3 µs per link versus 2 µs or more for 10GBASE-T); other connections are made with 10GBASE-T copper connections as required. Devices which do not support 10-gigabit connections are connected with 1 gigabit or 100 Mbit ethernet.
 +
 
 +
The connection between the EHL cabinets is made by a fibre optic 10 gigabit connection.
 +
 
 +
Connections between the EHL LAN and the outside world are provided by an ARM64 [http://www.diablotin.com/librairie/networking/firewall/ch04_02.htm dual-homed host] that provides firewall, NAT, forwarding, DNS, and VPN endpoint services.
 +
 
 +
=== Storage ===
 +
 
 +
Storage is provided by a Synology Rackstation in cabinet 1, which provides both storage area network (SAN, raw block devices over protocols such as iSCSI) and network-attached storage (NAS, filesystem-level shared storage over protocols such as NFS and SMB). It is populated with twelve 1 TB SSDs and equipped with dual power supplies and dual 10-gigabit ethernet.
 +
 
 +
=== Terminal Server ===
 +
 
 +
Many of the computers installed in the EHL do not have video output (because they are not intended for desktop applications). Most of these have a serial port; in many cases, this is a virtual serial port which is accessed using the IPMI SOL (Serial-over-LAN) or similar protocol on the network. Since this is not a TCP/IP protocol, the client must run somewhere on the LAN.
 +
 
 +
For systems that do not have a working IPMI engine, each EHL cabinet is equipped with a [[Cyclades Terminal Server]] which provides remote access to 32 serial ports. A remote user can connect to a selected port to monitor and control the connected system.
 +
 
 +
=== Display ===
 +
 
 +
Cabinet 1 is equipped with a 15" 4:3 LCD monitor, bolted to a 4u blanking panel. This display is driven by a Raspberry Pi, and can be used to show educational information about the rack, current system status information, or diagnostic data.
 +
 
 +
=== Calxeda/Boston Viridis ARM System ===
 +
 
 +
32-bit ARM compute is provided by a Calxeda Energy Core ECX-1000 system from Boston Limited. There are three installed "Energy Cards", each with four ECX-1000 nodes; each node has a quad-core ARM Cortex-A9 processor and a small (Cortex-M) ARM management processor. This system runs the [[Pidora]] build system (except for the Koji hub and web nodes).
 +
 
 +
=== 64-Bit ARM Compute ===
 +
 
 +
The EHL has over 100 cores of ARM64 compute, provided by a number of computers from multiple vendors. These computers provide a build system and testing platforms for [http://leapproject.ca the LEAP project], software optimization, and applied research on ARM64 systems.
  
 
== Funding ==
 
  
 
The EHL is generously funded by an NSERC Applied Research Tools and Instruments (ARTI) grant under the CCI program.
 
 +
 +
Additional systems installed within the EHL have been provided by OSTEP applied research partners.
  
 
== Location ==
 
Line 22: Line 82:
 
=== Pidora Koji System ===
 
  
The Koji build system used for the [[Pidora]] project, [http://koji.pidora.ca koji.pidora.ca], uses Calxeda nodes in the EHL as build servers, and will in the future use a system within the EHL as a hub. This system is publicly accessible.
+
The Koji build system used for the [[Pidora]] project, [http://koji.pidora.ca koji.pidora.ca], uses Calxeda nodes in the EHL as build servers, and may in the future use a system within the EHL as a hub. This system is publicly accessible.
  
 
=== Student Access ===
 
  
Students in [[SPO600]] and [[SBR600]] utilize EHL systems for specific projects and labs.
+
Students in [[SPO600]] and [[SBR600]] are given remote access to some EHL ARM computers for specific projects and labs, when they are not being used for OSTEP applied research projects.

Revision as of 11:47, 13 September 2015

The EHL in January 2015.

The Enterprise Hyperscale Lab is operated by the OSTEP team to perform research on open source technologies and emerging hyperscale SOC-based systems.

Equipment Overview

The EHL consists of two dual-thermal-zone (cold/hot) rackmount cabinets with power conditioning and backup, power distribution, thermal monitoring, and 1- and 10-gigabit network services. These cabinets support a large number of hyperscale and SOC-based ARM computers for various applied research projects.

The first EHL cabinet was installed in Summer 2014, and the second cabinet was installed in Summer 2015.

Equipment Detail

Cabinet

Each of the two EHL cabinets is a Silentium Acoustirack Active, a full-height, acoustically insulated rackmount cabinet with two fan units. The lower fan unit takes air from outside the cabinet and blows it up the front of the rackmount equipment area (cold zone). Air passes through the individual devices and is vented out the back of each unit into the hot zone. A second fan unit exhausts air from the hot zone out the top of the cabinet.

Each of the fan units includes active noise cancellation so that the loaded rack can be operated in a software development lab context.

After the fan units are installed, there are 33u available for other devices.

Power

Power to each cabinet is provided by two 115 volt, 15 amp, 60 Hz circuits that power two independent APC 1.5 kW rackmount uninterruptible power supplies (UPSes). Each UPS has a network interface for remote monitoring and control.

The UPSes feed two Raritan Dominion power distribution units (PDUs), mounted vertically up the sides of the cabinet at the back, plus a third 1u horizontal unit. The PDUs act like long power bars, but have control and monitoring systems so that the per-outlet current consumption can be measured and remotely monitored over the network, using the SNMP or HTTP protocols. Each outlet can also be switched on or off under network control.

Where possible, devices have been configured with dual power supplies. This means that many of the devices in the EHL are plugged into two PDUs and can continue to operate when power is cut to one of the two supplies. This permits equipment to be rewired without any downtime; it also guards against downtime due to PDU, UPS, or PSU failure.

Devices which are not configured with dual PSUs are connected to just one of the PDUs.
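
The effect of single and dual feeds can be sketched in a few lines of Python. The device names and PDU labels below are hypothetical placeholders, not the actual EHL inventory; the sketch only shows which devices would lose power if one PDU were switched off.

```python
# Hypothetical inventory: device name -> set of PDUs it is plugged into.
# These names are illustrative assumptions, not the real EHL layout.
FEEDS = {
    "storage": {"pdu-a", "pdu-b"},      # dual power supplies
    "switch-10g": {"pdu-a", "pdu-b"},   # dual power supplies
    "dev-board": {"pdu-a"},             # single power supply
}


def devices_lost(feeds, failed_pdu):
    """Return the sorted names of devices left with no powered feed."""
    return sorted(
        name for name, pdus in feeds.items()
        if not (pdus - {failed_pdu})    # no remaining PDU after the failure
    )


if __name__ == "__main__":
    for pdu in ("pdu-a", "pdu-b"):
        print(pdu, "off ->", devices_lost(FEEDS, pdu))
```

A check like this could be run before rewiring to confirm that only single-PSU devices depend on the PDU being taken offline.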

The total draw of the EHL equipment installed in the first cabinet as of January 2015 was approximately 1.8 kW under load. The current power system can support a little over 3 kW; the cabinet supports thermal exchange of about 8 kW.
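
As a rough cross-check of these figures, the following Python sketch works through the power budget arithmetic. It treats the quoted values as power in kilowatts and assumes a power factor of 1 for the supply circuits; both are simplifying assumptions rather than measured facts, and the function names are only illustrative.

```python
# Figures quoted in the text, treated here as power in kilowatts.
CIRCUIT_VOLTS = 115
CIRCUIT_AMPS = 15
CIRCUIT_COUNT = 2
UPS_KW = 1.5
UPS_COUNT = 2
MEASURED_LOAD_KW = 1.8  # first cabinet, January 2015


def circuit_capacity_kw(volts, amps, count):
    """Nominal capacity of the supply circuits, assuming power factor 1."""
    return volts * amps * count / 1000.0


def utilisation(load_kw, capacity_kw):
    """Fraction of the available capacity consumed by the load."""
    return load_kw / capacity_kw


if __name__ == "__main__":
    circuits_kw = circuit_capacity_kw(CIRCUIT_VOLTS, CIRCUIT_AMPS, CIRCUIT_COUNT)
    ups_kw = UPS_KW * UPS_COUNT
    print(f"circuits: {circuits_kw:.2f} kW, UPSes: {ups_kw:.1f} kW")
    print(f"load uses {utilisation(MEASURED_LOAD_KW, ups_kw):.0%} of UPS capacity")
```

Under these assumptions the two circuits nominally supply 3.45 kW and the two UPSes 3.0 kW, which is consistent with a power system that supports a little over 3 kW.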

Environmental Monitoring

One of the PDUs in each cabinet is equipped with a string of three environmental sensors. These are laid out diagonally across the cabinet, so that the air intake temperature, mid-cabinet hot zone temperature, and air exit temperature are monitored, as well as the humidity.

In January 2015, typical intake temperatures on the EHL were 25-27 °C and exhaust temperatures were 36-37 °C.

Networking

At least one Cisco 24-port gigabit switch and one Netgear 24-port 10-gigabit switch are installed in the back of each EHL cabinet. The 10-gigabit switches provide both 10GBASE-T and SFP+ connections. SFP+/DA (Direct Attach) copper cables and LC fibre optic cables are used where possible because they offer much lower latency than 10GBASE-T connections (on the order of 0.3 µs per link versus 2 µs or more for 10GBASE-T); other connections are made with 10GBASE-T copper connections as required. Devices which do not support 10-gigabit connections are connected with 1 gigabit or 100 Mbit ethernet.

The connection between the EHL cabinets is made by a fibre optic 10 gigabit connection.

Connections between the EHL LAN and the outside world are provided by an ARM64 dual-homed host that provides firewall, NAT, forwarding, DNS, and VPN endpoint services.

Storage

Storage is provided by a Synology Rackstation in cabinet 1, which provides both storage area network (SAN, raw block devices over protocols such as iSCSI) and network-attached storage (NAS, filesystem-level shared storage over protocols such as NFS and SMB). It is populated with twelve 1 TB SSDs and equipped with dual power supplies and dual 10-gigabit ethernet.

Terminal Server

Many of the computers installed in the EHL do not have video output (because they are not intended for desktop applications). Most of these have a serial port; in many cases, this is a virtual serial port which is accessed using the IPMI SOL (Serial-over-LAN) or similar protocol on the network. Since this is not a TCP/IP protocol, the client must run somewhere on the LAN.

For systems that do not have a working IPMI engine, each EHL cabinet is equipped with a Cyclades Terminal Server which provides remote access to 32 serial ports. A remote user can connect to a selected port to monitor and control the connected system.
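
For systems whose IPMI engine does work, a Serial-over-LAN console is commonly opened with the ipmitool utility. The Python sketch below builds and runs such a command from a host on the EHL LAN; it assumes ipmitool is installed there, and the address and user name are hypothetical placeholders rather than real EHL values.

```python
import os
import subprocess


def sol_command(host, user):
    """Build the ipmitool argument list for a Serial-over-LAN session.

    -I lanplus selects the IPMI v2.0 LAN interface, which SOL requires;
    -E makes ipmitool read the password from the IPMI_PASSWORD variable.
    """
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user,
            "-E", "sol", "activate"]


def open_console(host, user, password):
    """Run an interactive SOL session; needs ipmitool installed locally."""
    env = dict(os.environ, IPMI_PASSWORD=password)
    return subprocess.call(sol_command(host, user), env=env)


if __name__ == "__main__":
    # Hypothetical BMC address and account, for illustration only.
    print(" ".join(sol_command("192.0.2.10", "admin")))
```

Passing the password through the environment keeps it out of the process argument list, which other users on the same host could otherwise read.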

Display

Cabinet 1 is equipped with a 15" 4:3 LCD monitor, bolted to a 4u blanking panel. This display is driven by a Raspberry Pi, and can be used to show educational information about the rack, current system status information, or diagnostic data.

Calxeda/Boston Viridis ARM System

32-bit ARM compute is provided by a Calxeda Energy Core ECX-1000 system from Boston Limited. There are three installed "Energy Cards", each with four ECX-1000 nodes; each node has a quad-core ARM Cortex-A9 processor and a small (Cortex-M) ARM management processor. This system runs the Pidora build system (except for the Koji hub and web nodes).
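
The node and core totals implied by this description can be worked out with a short Python sketch. It counts only the Cortex-A9 application cores and leaves out the management processors; the constant and function names are illustrative.

```python
ENERGY_CARDS = 3     # installed cards
NODES_PER_CARD = 4   # ECX-1000 nodes on each card
CORES_PER_NODE = 4   # quad-core Cortex-A9 per node


def total_nodes(cards, nodes_per_card):
    """Number of independent ECX-1000 nodes in the chassis."""
    return cards * nodes_per_card


def total_cores(cards, nodes_per_card, cores_per_node):
    """Number of Cortex-A9 application cores across all nodes."""
    return total_nodes(cards, nodes_per_card) * cores_per_node


if __name__ == "__main__":
    print(total_nodes(ENERGY_CARDS, NODES_PER_CARD), "nodes")
    print(total_cores(ENERGY_CARDS, NODES_PER_CARD, CORES_PER_NODE), "cores")
```

With the figures above this gives 12 nodes and 48 application cores available to the build system.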

64-Bit ARM Compute

The EHL has over 100 cores of ARM64 compute, provided by a number of computers from multiple vendors. These computers provide a build system and testing platforms for the LEAP project, software optimization, and applied research on ARM64 systems.

Funding

The EHL is generously funded by an NSERC Applied Research Tools and Instruments (ARTI) grant under the CCI program.

Additional systems installed within the EHL have been provided by OSTEP applied research partners.

Location

EHL is located in CDOT.

Student and Open Source Community Access to EHL Systems

Select systems within EHL are accessible to both open source community members and to students when they are not in use for other OSTEP research.

Pidora Koji System

The Koji build system used for the Pidora project, koji.pidora.ca, uses Calxeda nodes in the EHL as build servers, and may in the future use a system within the EHL as a hub. This system is publicly accessible.

Student Access

Students in SPO600 and SBR600 are given remote access to some EHL ARM computers for specific projects and labs, when they are not being used for OSTEP applied research projects.