The recent discovery that hackers have inserted software into the US electrical grid, which would allow the grid to be disrupted at a later date from a remote location, clearly demonstrates that the utility infrastructure is quite vulnerable and that its overall mission of serving the population could be severely compromised by unexpected man-made or natural disasters. As other industry sectors already have experience in equipping automation systems with modern information technology (IT), the power grid is also facing the trend of integrating the electrical infrastructure with the information infrastructure, and it is undergoing a profound change towards the smart grid. This change not only moves power automation systems from outdated, proprietary technology to common technologies such as personal computers, Microsoft Windows, and TCP/IP/Ethernet, but also brings the isolated and closed networks of power control systems into the public network. The integration of different systems that were originally designed without consideration of security protection and mechanisms brings tremendous cost and performance benefits to the power industry, as well as arduous challenges in protecting the automation systems from security threats, especially when the public network is used.
It is misleading to suggest that IT people should take full responsibility for power grid network security, including automation and control networks. Compared with regular IT systems, power automation systems have different goals, objectives, and assumptions about what needs to be protected. It is important to understand what ‘real-time performance’ and ‘continuous operation’ of a power automation system really mean and to recognise that power automation systems and applications were not originally designed for the general IT environment. Therefore, it is necessary to embrace and use existing IT security solutions where they fit, such as communication within a control center, and to develop new solutions to fill the gaps where IT solutions do not work or apply. This article discusses a conceptual layered framework for protecting power grid automation systems against cyberattacks.
Scope and Functions of the Power Grid
From the power flow viewpoint, the input to the power grid is high voltage (100 kV or above) power, stepped up by the power plant transformer from the low voltage power produced by the generators. The output of the power grid is electricity at medium or low voltage (less than 100 kV), stepped down by transformers in substations, and delivered to commercial, industrial, and residential consumers.
The major functions of the power grid are performed at three different levels: corporate, control center, and substation. At the corporate level, the following major business management and operation management functions are performed:
- Planning—plan of equipment and line upgrades based on forecast of load and generation sources, market conditions, and system utilisation;
- Accounting—management of contracts and bids with other market participants;
- Engineering—system design and engineering for transmission and distribution lines and automation systems;
- Asset management—monitoring, replacement, and maintenance plan of equipment and lines;
- Historical information system—an online historical database is commonly used to retain all telemetry data, operator actions, alarm summaries, etc., for a period of time that typically ranges from 3 to 24 months.
At the control center level, the following major real-time and non-real-time functions are performed:
- Forecast—short-term forecasting of load and power generation sources;
- Monitoring—monitoring of system state, activity, load, equipment conditions;
- Operation—switching operations, changing setups, starting emergency procedures, performing system restorations, etc.;
- System analysis—model update, state estimation, contingency, and stability analysis, power flow analysis;
- Recommendation—recommendation of preventive, corrective, and optimised operations;
- Fault or alarm processing—locating fault and intelligent processing of alarms;
- Training—operator training;
- Logging—archiving logs and reports;
- Data exchange—exchanging data with ISOs (independent system operators) or RTOs (regional transmission organisations), power plants, consumers, and peer transmission and distribution system operators.
At the substation level, the following major real-time and non-real-time functions are performed:
- Normal operation—collecting data and alarms and sending them to the control center, executing commands issued by the control center;
- Data exchange—exchanging protection data between the RTU and IEDs within the substation; relay devices perform protection, control, and indication gathering functions;
- Emergency operation—power system protection, load shedding, recovery from load shedding, shunt control, compensation control, etc.;
- Engineering—protection engineering, automation engineering, line engineering;
- Logging—archiving logs;
- Maintenance—equipment and line maintenance.
Power Grid Automation Systems
A typical grid automation system, as shown in Figure 1, is an integration of one or more control centers, with each center supervising multiple substations. The power grid automation system has a layered structure and performs data collection and control of electricity delivery. At the corporate level, some functions are related to the automation system; for example, planning determines the amount of electricity that must be generated for the next day. A control center typically includes devices such as an energy management system (EMS), human-machine interfaces (HMIs), and a front-end processor (FEP), which translates between different communication protocols; for instance, some legacy remote terminal units (RTUs) in substations use DNP3, while newer ones use IEC 61850. Substations contain RTUs, programmable logic controllers (PLCs), global positioning system (GPS) sync timers, HMIs, communication devices (switches, hubs, and routers), log servers, data concentrators, and a protocol gateway. Other intelligent electronic devices (IEDs) include field devices, such as instrument transducers, meters, tap changers, circuit reclosers, phasor measurement units, and protection relays.
Power Grid Communications Systems
In order to deliver electrical power from power producers to consumers economically, power grid system operators have to exchange data with power producers, ISOs, RTOs, consumers, and peer system operators. As shown in Figure 1, at the control or operation center level, dedicated lines are widely used and the Inter-Control Center Communications Protocol (ICCP) is deployed as the communication protocol. Local area network (LAN) and Internet Protocol (IP)-based protocols are usually used for the communication link between the corporate level and the control center; IP-based protocols, such as DNP3 over TCP, are widely used for the communication link between the control center and substations. Wireless technology is also deployed for this communication link. Current supervisory control and data acquisition (SCADA) systems provide HTTP(S) and Secure Shell (SSH) ports for remote connections.
Current Security Solution
A gateway security solution, as shown in Figure 2, is currently widely used to protect power automation systems from external cyberattacks. All incoming data packets to a substation are inspected by the security gateway.
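The gateway approach can be pictured as a single checkpoint that compares every incoming packet against a rule list. The following is a minimal sketch only: the `Packet` type, the `ALLOWED` rules, and the addresses are hypothetical and do not come from any particular gateway product. The ports are the commonly registered ones for DNP3 over TCP (20000) and SSH (22).

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Packet:
    src: str       # source IP address of the incoming packet
    dst_port: int  # destination TCP port inside the substation


# Hypothetical allowlist: (permitted source network, permitted destination port).
ALLOWED = [
    (ip_network("10.1.0.0/24"), 20000),  # control center to DNP3 over TCP
    (ip_network("10.1.0.0/24"), 22),     # control center to SSH maintenance
]


def gateway_allows(packet: Packet) -> bool:
    """Return True if the packet matches at least one allowlist rule."""
    src = ip_address(packet.src)
    return any(src in net and packet.dst_port == port for net, port in ALLOWED)
```

Because every decision is taken at this one point, a packet that passes the gateway meets no further checks inside the substation in this simplified picture.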
Major Challenges and Strategies of Security Solutions for the Smart Grid
To address the security issues of the smart grid, it is necessary to identify its unique challenges. There are four major challenges in developing new network security solutions for power grid automation systems.
1) Many automation components (such as RTUs) use proprietary operating systems, which are designed for control functionality and performance, but not security.
2) Automation systems use heterogeneous network technologies, such as PROFIBUS, Modbus, Modbus Plus, ICCP, and DNP3. Most of these technologies and protocols were designed for connectivity, without consideration of cybersecurity.
3) Most automation systems are combinations of new and legacy components with many systems expected to run up to 20 or 30 years, perhaps even longer.
4) Since the power grid is undergoing a profound change and moving to the smart grid, there are new applications (such as phasor measurement units and smart meters) and corresponding new requirements for data communication in terms of bandwidth, delay, and new communication protocols.
The strategies for designing a security solution for the smart grid are as follows:
- Scalability—which refers to the system’s ability to increase or decrease its capacity gracefully in order to protect larger or smaller power grid automation systems.
- Extensibility—which refers to a system designed to include hooks and mechanisms for expanding and enhancing the system without having to make major changes to the system infrastructure.
- Interoperability—a property referring to the ability of diverse systems to work together. Since power grid automation systems use various technologies with respect to hardware, operating systems, and communications protocols, the security framework and components must be able to work together regardless of the technology on which they are executed or developed.
- Non-intrusiveness—which refers to the system’s ability to be subjected to security activities without compromising its control functionalities and performance.
- Flexibility—which is the ability to adapt to various needs in the development and at runtime.
The Integrated Security System
To meet the challenges discussed in the previous section, we propose an integrated security framework with three layers (power, automation and control, and security), also called the common security platform, as shown in Figure 3. The automation and control system layer monitors and controls power grid processes, while the security layer provides security features. Since the security layer provides a clear demarcation of responsibilities, control functionalities and security functionalities can be decoupled during the design stage. Data related to security management flows on this layer. Another important idea is that the proposed solution replaces the gateway security solution with a security service proxy solution, which is shown in Figure 4. There are three key security subsystems: the Security Agent, the Security Switch, and the Security Manager. Security Agents and Security Switches, which are security enforcement devices, run as security service proxies, and the Security Manager runs as a security management device in either the control center or a substation. The proposed integrated security framework operates on three hierarchical levels, as shown in Figure 5. Each of these levels is protected by a component of our security system, as listed below:
- Device level, in which electronic devices, such as RTUs and IEDs, are protected by the Security Agent;
- Network level, in which communication bandwidth is protected and delay is guaranteed by the managed security switch;
- Operation level, in which security policies are orchestrated and managed by the security manager.
Security Agent
The security agents bring security to the edges of the system by providing protection at the device level, for both wired and wireless devices. These agents are firmware or software agents, depending on the layer of the control hierarchy. At the field device layer (e.g., IEDs), the agents will be less intelligent, containing simple rules and decision-making capabilities, and their primary responsibilities will consist of event logging and reporting. At the substation level (e.g., RTUs), the software agents will be more intelligent, with more complex rules for identifying and detecting intrusive events and activities within the controllers. In particular, a security agent will be commissioned to accomplish the following functions:
- To translate between different protocols;
- To acquire and apply the latest vulnerability patches from its security manager;
- To collect data traffic patterns and system log data and report them to the security manager;
- To analyse traffic and access patterns with varying complexity depending on the hierarchical layer;
- To run host-based intrusion detection;
- To detect intrusive events and send alarm messages to the security manager and designated devices, such as HMIs;
- To acquire access control policies from the security manager and enforce them;
- To encrypt and decrypt exchanged data (end-to-end security).
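Three of the functions above (enforcing an access control policy received from the security manager, logging events, and raising alarms) can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the framework's implementation: the class name, the role-to-operation policy format, and the alarm text are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityAgent:
    device_id: str
    # Access control policy received from the security manager
    # (assumed format: user role -> set of permitted operations).
    policy: dict = field(default_factory=dict)
    event_log: list = field(default_factory=list)  # reported to the manager
    alarms: list = field(default_factory=list)     # sent to manager and HMI

    def load_policy(self, policy: dict) -> None:
        """Replace the current policy with the one sent by the manager."""
        self.policy = {role: set(ops) for role, ops in policy.items()}

    def request(self, role: str, operation: str) -> bool:
        """Enforce the policy, log the attempt, and raise an alarm on denial."""
        allowed = operation in self.policy.get(role, set())
        self.event_log.append((role, operation, allowed))
        if not allowed:
            self.alarms.append(f"{self.device_id}: denied {role} -> {operation}")
        return allowed


agent = SecurityAgent("rtu-1")
agent.load_policy({"operator": ["read", "switch"], "viewer": ["read"]})
operator_ok = agent.request("operator", "switch")  # permitted by the policy
viewer_ok = agent.request("viewer", "switch")      # denied, raises an alarm
```

A field-level agent would probably stop at logging and reporting, while a substation-level agent could add richer detection rules on top of the same structure.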
Managed Security Switch
Managed switches are used across the automation network to protect bandwidth and prioritise data packets. These switches, working as network devices, will connect controllers, RTUs, HMIs, and servers in the substation and control center. Managed security switches possess the following functionalities:
- To separate external and internal networks, trusted and non-trusted domains;
- To run as a dynamic host configuration protocol (DHCP) server;
- To run network address translation and network port address translation (NAT/NPAT) and to hide the internal networks;
- To acquire bandwidth allocation patterns and data prioritisation patterns from the security manager;
- To separate data according to prioritisation patterns, including operation data, log data, trace data, and engineering data;
- To ensure quality of service (QoS) for important data flows, such as operation data, guaranteeing their bandwidth, delay, etc.;
- To manage multiple virtual local area networks (VLANs);
- To run simple network-based intrusion detection.
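The separation of data by priority can be illustrated with a strict-priority queue over the four data classes named above. This is a simplified sketch: only the idea that operation data comes first is taken from the text, while the relative order of the other classes and the `PriorityScheduler` name are assumptions. A real switch would also need bandwidth reservation, which strict priority alone does not provide.

```python
import heapq
from itertools import count

# Lower number = served first. Only "operation first" comes from the text;
# the order of the remaining classes is an assumption for illustration.
PRIORITY = {"operation": 0, "engineering": 1, "log": 2, "trace": 3}


class PriorityScheduler:
    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that keeps FIFO order within a class

    def enqueue(self, traffic_class: str, payload: str) -> None:
        entry = (PRIORITY[traffic_class], next(self._seq), payload)
        heapq.heappush(self._heap, entry)

    def dequeue(self) -> str:
        """Return the payload of the highest-priority, oldest queued packet."""
        return heapq.heappop(self._heap)[2]


scheduler = PriorityScheduler()
for traffic_class, payload in [("log", "l1"), ("operation", "o1"),
                               ("trace", "t1"), ("operation", "o2")]:
    scheduler.enqueue(traffic_class, payload)
served = [scheduler.dequeue() for _ in range(4)]
```

Here the two operation packets leave the queue before the log and trace packets, even though the log packet arrived first.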
Security Manager
Security managers, with a graphical user interface (GUI), reside in the automation network and directly or indirectly connect to the managed switches across the automation networks. They can be protected by existing IT security solutions and will be able to connect to a vendor’s server and managed switches via VPN. The security manager will possess the following functionalities:
- To collect security agent information;
- To acquire vulnerability patches from a vendor’s server and download them to the corresponding agents;
- To manage keys for VPN;
- To work as an authentication, authorisation, and accounting (AAA) server, validating user identifications and passwords, authorising user access rights (monitor, modify data), and recording user changes to controllers;
- To collect data traffic patterns and performance metrics from agents;
- To collect alarms and events;
- To generate access control policies based on collected data (using data mining techniques) and download them to agents;
- To run complex intrusion detection algorithms at control network levels;
- To generate bandwidth allocation patterns and data prioritisation patterns (possibly through data mining techniques) and download them to managed switches.
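The AAA role listed above combines three steps: validate the password, check the access right, and record the change. A minimal sketch of those steps follows, assuming an in-memory user store and salted PBKDF2 password hashes. The `AAAServer` class, its method names, and the sample users are hypothetical; only the rights "monitor" and "modify" come from the text.

```python
import hashlib
import hmac
import os


class AAAServer:
    def __init__(self):
        self._users = {}     # user -> (salt, password hash, set of rights)
        self.audit_log = []  # accounting: (user, controller, change)

    @staticmethod
    def _hash(password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    def add_user(self, user: str, password: str, rights) -> None:
        salt = os.urandom(16)
        self._users[user] = (salt, self._hash(password, salt), set(rights))

    def authenticate(self, user: str, password: str) -> bool:
        """Authentication: validate the user name and password."""
        if user not in self._users:
            return False
        salt, stored, _ = self._users[user]
        return hmac.compare_digest(stored, self._hash(password, salt))

    def authorise(self, user: str, right: str) -> bool:
        """Authorisation: check whether the user holds the given right."""
        return user in self._users and right in self._users[user][2]

    def record_change(self, user, password, controller, change) -> bool:
        """Accounting: log a controller change if the user may make it."""
        ok = self.authenticate(user, password) and self.authorise(user, "modify")
        if ok:
            self.audit_log.append((user, controller, change))
        return ok
```

For example, a user holding only the "monitor" right would authenticate successfully, but `record_change` would return `False` and leave `audit_log` unchanged.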
Security Engineering System
The security engineering system is used to create, configure, manage, monitor, and troubleshoot the integrated security system project. The project navigator is the common view for all tools of the engineering system. It offers a common list of all controls and data and generates the runtime configuration data. The engineering system acts as a centralised administration point for data and programs. In this project, the security engineering system was not developed; some of its functions will be implemented in the security manager.
Traditionally, automation systems are subjected to more constrained behaviour as compared to enterprise networks. Automation networks possess:
- relatively static topology;
- regular communication patterns;
- limited number of protocols;
- simple communications protocols.
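One way to read these properties is that normal traffic in an automation network can be described by a short list of expected flows, so anything outside that list stands out. The sketch below illustrates that reading and is not a method stated in the article: the `FlowBaseline` class, the device names, and the protocol labels are hypothetical.

```python
class FlowBaseline:
    """Allowlist of flows observed during a trusted learning period."""

    def __init__(self):
        self._known = set()

    def learn(self, flows) -> None:
        """Record (source, destination, protocol) tuples as expected traffic."""
        self._known.update(flows)

    def unexpected(self, flows) -> list:
        """Return the flows never seen during learning, in their given order."""
        return [flow for flow in flows if flow not in self._known]


baseline = FlowBaseline()
baseline.learn([
    ("control-center", "rtu-1", "dnp3"),
    ("rtu-1", "relay-7", "iec61850"),
])
suspicious = baseline.unexpected([
    ("control-center", "rtu-1", "dnp3"),  # matches the baseline
    ("hmi-2", "relay-7", "ssh"),          # new talker and new protocol
])
```

In an enterprise network with changing hosts and many protocols, such a baseline would go stale quickly, which is why the constrained behaviour of automation networks matters here.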
Implementations of Security Agents
Protecting each electronic device with its own standalone security agent would be very costly. To make the solution more cost-effective, security agents are implemented in four different ways, as shown in Figure 6:
- An individual two-port module—the internal port is connected to the protected device (such as an RTU or HMI), and the external port is connected to a network device (such as a switch or router);
- An individual two-port module—the internal port is connected to a group of electronic devices (such as meters or protection relays), and the external port is connected to a network device (such as a switch or router);
- An agent residing in a managed security switch—a virtual security agent runs on an internal port connected to the protected device;
- An agent residing in a newly developed device (such as a log server, PLC, or RTU)—it runs independently of the control firmware.
This article first introduces the functions of the power grid, its automation and control systems, and its communications. Potential cyberattacks and their adverse impacts on power grid operation are discussed, and a general SCADA cyberattack process is presented. Further, the article discusses the major challenges and strategies to protect the smart grid against cyberattacks.