The conundrum of need for WiFi 6E

Apple recently released its flagship iPhone 14 series and, to the surprise of many in the WiFi community, these phones do not appear to support 6E. Many people defended Apple, arguing that 6E products do not yet have stable code versions and that Apple does not want to take the blame for poor performance. Others expressed disappointment that an industry leader known for forward-looking products left out a feature that is already on the market in other products (not talking about swipe typing here). My personal opinion is that Apple always does what it “thinks” is best for the customer and not what the customer wants or asks for. That turned out to be the case with 6E as well. Customers had been expecting it since the iPhone 13 release a year ago, but the wait continues for at least another year. The conversations in the WiFi community led to a bigger question: do we need 6E capable devices and infrastructure today or in the near future? There is no right or wrong answer. It depends mainly on the individual case and what one is trying to solve. Getting to a definitive answer means working through several questions.

First, let’s take a look at what 6E offers. Depending on the country you are in, up to 1200 MHz of additional spectrum in the 6 GHz band is available for WiFi use. This in itself is a big boon for environments where 2.4 GHz and 5 GHz are saturated with high utilization WiFi client devices. So the first question to answer is how much of your spectrum in these bands is saturated. Given the spectrum limitations before 6E, the recommendation was to connect many high bandwidth devices via ethernet, especially the ones that are not mobile. If an environment has such high bandwidth applications, especially ones that need mobility, then 6E is a must. It is important to note that these applications will have a better experience on wider 80 MHz channels than on the 20 MHz or 40 MHz channels typically used on the 5 GHz band. Mobile phones are probably near the end of the list (excluding IOT devices) of devices that need higher throughput. Even if they support 6E, is allowing them on 6 GHz, especially in BYOD cases, a wiser option than reserving this new spectrum for mission critical, bandwidth intensive applications? The answer is that we are probably better off not allowing them on 6E. There might eventually be a scenario where 2.4 GHz is used for IOT, 5 GHz for BYOD, guest and non critical applications, and 6 GHz for corporate high bandwidth and mission critical devices. Only time will tell, but now is the right time to envision the right fit.
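To get a feel for what 1200 MHz means in channel terms, here is a rough sketch. It simply divides the spectrum by the channel width, so it is an upper bound: it ignores guard bands and country specific channelization, and the real channel counts are somewhat lower. The function and constant names are my own.

```python
def max_channels(spectrum_mhz: int, channel_width_mhz: int) -> int:
    """Upper bound on non-overlapping channels that fit in the spectrum."""
    return spectrum_mhz // channel_width_mhz


# Full 6 GHz allocation mentioned above; the actual amount varies by country.
SPECTRUM_6GHZ_MHZ = 1200

for width in (20, 40, 80, 160):
    count = max_channels(SPECTRUM_6GHZ_MHZ, width)
    print(f"{width} MHz channels: at most {count}")
```

Even as an upper bound, this shows why 80 MHz channels become practical on 6 GHz in a way they rarely are on 5 GHz.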

The next major consideration with today’s products is firmware stability and efficiency. A 1200 MHz spectrum brings challenges in discovering, connecting to and roaming on the network. The question to answer here is whether you have the time and resources to exhaustively test the devices you are enabling the network for. 802.11 client device testing has not been popular to this day, mainly because it needs multiple sources of test data (logs, pcaps, traffic generation etc) and time to analyze all these pieces of the puzzle. A lot of engineers certainly perform basic testing that involves validating device support for different 802.11 protocols, ensuring devices connect to networks, checking data loss during roaming etc, but not necessarily a deep dive. Cost benefit analysis has always indicated that a deep dive is not worthwhile for most use cases. That could change with WiFi 6E, which introduces a lot of new concepts in addition to the secret sauce every vendor tries to add. It is important to study and understand what is connecting to the network and what the expected or ideal behavior is before enabling these devices to connect on different vendor infrastructures. To do that, it helps to have devices to test with, rather than companies releasing products that do not support it (Hello Apple again !!).

The next question that often comes up is “we don’t have a use case for 6E yet, but we have an upcoming infrastructure refresh. Should we procure 6E access points?” There are three ways to approach this. The first is to procure WiFi 6 access points, which have been on the market for a couple of years now. The second is to procure WiFi 6E access points, and the third is to keep what is currently in production and wait for WiFi 7 access points. There is a lot of uncertainty around WiFi 7 timelines, and hence the third option is my least preferred. Between the first two options, the main considerations are budget and product lead time. WiFi 6E access points are tri-band and tend to be more expensive than other models, but this option does provide a longer lifecycle and could save money in the long run. WiFi 6E access points use a different chipset than previous models and, depending on the vendor you are working with, might also have better lead times. A wireless infrastructure refresh may require a switching upgrade, mainly for the 802.3bt power requirement and possibly for multigig switch ports. Although wireless vendors offer different features to make access points operational in limited capacity on 802.3at power, one must review these in detail to ensure they meet the requirements and to determine the need for switching upgrades.

There may or may not be a use case for everyone to leverage 6E today. But it is important to think about the strategy and roadmap to use this technology in improving the mobility experience. Answering some of these questions could be a great start in this process.

WPA3 – SAE in Action

In my previous blog post, I discussed some of the concepts of Diffie Hellman (DH) key exchange and elliptic curve cryptography. In this post, I discuss how these work together to enable secure connectivity with WPA3-SAE. To understand this better, I configured an SSID on a Juniper Mist access point with the authentication protocol setting toggled between WPA3-PSK + WPA2 and WPA3-PSK modes for different packet captures. WPA3-PSK + WPA2 is the transition mode, in which the SSID supports both legacy WPA2 (PSK) only clients and WPA3 SAE capable client devices, whereas WPA3-PSK mode supports SAE only. The Juniper Mist AP41 access point is on version 0.9.22801 and the client device, an iPhone XR which supports WPA3, is on iOS version 14.7.1.

First, let’s examine the RSN information element (IE) in the beacons in transition mode.

RSN IE SSID in Transition Mode

Notice that the authentication key management (AKM) suite list has two entries: 00-0F-AC-02 for PSK and 00-0F-AC-08 for Simultaneous Authentication of Equals (SAE). Management frame protection is enabled (capable) but not required in transition mode. This provides backward compatibility with WPA2 PSK devices that don’t support management frame protection. The group management cipher suite has suite type 00-0F-AC-06, indicating that BIP-CMAC-128 is used to protect the integrity of group addressed management frames.

Now let’s look at the RSN information element when the SSID is configured for WPA3-PSK mode.

RSN IE SSID in WPA3-SAE mode

The main differences from transition mode are that the SSID advertises only one authentication key management suite, 00-0F-AC-08 (SAE), and that management frame protection is mandatory. Examining the RSN IE in the beacons is a quick way to identify the authentication settings of an SSID.
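If you want to read these fields programmatically rather than in a packet analyzer, the sketch below shows roughly how the RSN IE body can be walked. It is a minimal illustration, not a complete parser: the function name and the sample bytes are my own (the bytes are hand-built to mimic a transition mode SSID, not copied from my captures), and it only names the two AKM suite types discussed in this post.

```python
import struct

AKM_NAMES = {2: "PSK", 8: "SAE"}  # only the two suites discussed in this post


def parse_rsn(body: bytes) -> dict:
    """Parse the body of an RSN IE (the bytes after element ID and length)."""
    offset = 2   # skip the 2-byte version field
    offset += 4  # skip the group data cipher suite
    (pairwise_count,) = struct.unpack_from("<H", body, offset)
    offset += 2 + 4 * pairwise_count  # skip the pairwise cipher suite list
    (akm_count,) = struct.unpack_from("<H", body, offset)
    offset += 2
    akms = []
    for _ in range(akm_count):
        suite_type = body[offset + 3]  # 3-byte OUI, then 1-byte suite type
        akms.append(AKM_NAMES.get(suite_type, f"type {suite_type}"))
        offset += 4
    (caps,) = struct.unpack_from("<H", body, offset)  # RSN capabilities
    return {
        "akms": akms,
        "mfp_required": bool(caps & 0x0040),  # MFPR bit
        "mfp_capable": bool(caps & 0x0080),   # MFPC bit
    }


OUI = bytes.fromhex("000fac")

# Hypothetical transition mode RSN IE body, hand-built for illustration.
transition = (
    struct.pack("<H", 1)                                     # version
    + OUI + b"\x04"                                          # group cipher: CCMP-128
    + struct.pack("<H", 1) + OUI + b"\x04"                   # one pairwise cipher
    + struct.pack("<H", 2) + OUI + b"\x02" + OUI + b"\x08"   # AKMs: PSK and SAE
    + struct.pack("<H", 0x0080)                              # MFP capable, not required
)

print(parse_rsn(transition))
```

For an SAE only SSID, the same function would be expected to report a single SAE entry with both the capable and required flags set.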

Now let’s look at the SAE exchange between the access point and the client device.

Authentication Frame Exchanges

After the initial probe request and response, four authentication frames are exchanged in SAE, in place of the two frames in WPA2-PSK, which uses open system authentication. This four frame exchange is built on the principles of public key cryptography and Elliptic Curve Diffie Hellman (ECDH) groups. The first two frames are commit messages and the last two are confirm messages. Below is a snippet of the authentication element from a commit message.

Frame 1: Commit message 1

The authentication element of the frame has 6 fields. The first field indicates the authentication algorithm, which in this case is SAE, followed by a sequence number and a status code. The next field, the group ID, plays a critical role in the SAE process. This ID refers to a set of parameters registered with IANA that lets both the client device and the access point determine the same point on an elliptic curve without having to exchange the password or other details over an insecure channel. Group ID 19 is an ECP (elliptic curve over a prime field) group with a 256 bit prime, which defines the math used to map the PSK to a point P on an elliptic curve (EC). ECP groups provide higher security with less compute than the MODP DH groups, which use modular exponentiation to determine P. A Diffie Hellman exchange only works when both parties agree on a common value, in this case a point on an elliptic curve. The last two fields are the scalar and the finite field element. The scalar is derived from random values chosen by the device, and the finite field element (FFE) is the result of a calculation with the point P determined using ECP.

The second commit frame is sent from the access point to the client device and contains the same fields, with a scalar and FFE of its own.

Frame 2: Commit message 2

The third frame in the sequence is the confirm message from client device to the access point.

Frame 3: Confirm message 1

The fourth frame is the confirm message from access point to client device.

Frame 4: Confirm message 2

Each device uses the scalar and FFE received from the other device to calculate the shared secret, which is the seeding material for the PMK calculation, and then sends its confirm message. It is important to note that the confirm field values in frames 3 and 4 are different because the client and the AP hash the values in a different order. However, each device can calculate the hash expected from the other device to confirm that both are using the same key. The entire SAE exchange that calculates and confirms the shared secret is also called the Dragonfly key exchange. If you are interested in the specific math details of the exchange, they can be found here. After the SAE exchange, the devices proceed with the association process followed by the 4 way handshake. The following summarizes the SAE frame exchange process.

Client             --Commit-->           Access Point
Client             <--Commit--           Access Point
Client             --Confirm-->          Access Point
Client             <--Confirm--          Access Point
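To make the commit step a little more concrete, here is a toy sketch of how the two sides can arrive at the same secret from the exchanged scalar and element. It is deliberately simplified and makes several assumptions: it uses a tiny finite field group instead of the elliptic curve group 19, the password element PWE is hard coded rather than derived from a password, and it skips the confirm messages and the validity checks a real implementation performs. All names and numbers are mine.

```python
import secrets

# Toy finite field group: P is prime and PWE generates a subgroup of prime order Q.
# Real SAE with group 19 uses points on a 256 bit elliptic curve instead.
P, Q = 23, 11
PWE = 2  # stand-in for the password element both sides derive from the password


def commit():
    """Return (private, scalar, element) for one side of the exchange."""
    private = secrets.randbelow(Q - 1) + 1   # random value in 1 .. Q-1
    mask = secrets.randbelow(Q - 1) + 1      # second random value
    scalar = (private + mask) % Q            # sent in the commit frame
    element = pow(pow(PWE, mask, P), -1, P)  # inverse of PWE^mask, also sent
    return private, scalar, element


def shared_secret(private, peer_scalar, peer_element):
    """Combine our private value with the peer's commit values."""
    # PWE^peer_scalar * peer_element cancels the peer's mask, leaving PWE^peer_private.
    base = (pow(PWE, peer_scalar, P) * peer_element) % P
    return pow(base, private, P)


client_private, client_scalar, client_element = commit()
ap_private, ap_scalar, ap_element = commit()

client_view = shared_secret(client_private, ap_scalar, ap_element)
ap_view = shared_secret(ap_private, client_scalar, client_element)
print("secrets match:", client_view == ap_view)
```

Only the scalar and element cross the air. The private and mask values never leave the device, which is the point of the exchange.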

Comparing this with the DH paint analogy makes it easier to understand.

Public Key Cryptography Demonstration

Alice and Bob agree on a group ID defined by IANA to determine P on a curve, which can be treated as the common paint. They exchange the scalar and FFE over public transport, which can be treated as part of the public key. It is not feasible to determine the private keys/secret colors from these values. They use this information to calculate the common secret and in turn the PMK, which is used to derive session keys in the 4 way handshake. Because the secrets are device and session specific, even if the password is compromised, an attacker cannot use it to decrypt the traffic of other users.

References:

  1. https://mrncciew.com/2019/11/29/wpa3-sae-mode/
  2. Presentation by Hemant Chaskar at WLPC 2019 https://www.youtube.com/watch?v=fCsR8aK4mqE
  3. https://sarwiki.informatik.hu-berlin.de/WPA3_Dragonfly_Handshake
  4. Packet Capture

Wi-Fi Feasibility for your IoT Application

Today, every enterprise has Wi-Fi infrastructure in place to enable connectivity and mobility in the environment. It is safe to say Wi-Fi is ubiquitous and is the primary mode of connectivity for mobile phones, laptops, and a multitude of other things. But is using Wi-Fi to connect all things a good solution? It certainly helps consumers/end users to have a single connectivity solution to use and manage. If you are a product developer or someone involved in choosing a connectivity solution for your Internet of Things (IoT) application, this might be helpful. In this blog post, I dive into some of the things that need to be considered when evaluating the suitability of Wi-Fi for different applications. For the purposes of this post, any object that needs network connectivity is treated as part of IoT. It can be a laptop with great compute resources or a sensor that merely detects temperature and transmits it. The following factors will help in determining how effective it would be to use Wi-Fi for connectivity.

Scale

When it comes to IoT, scalability is a key requirement that drives much of the technology conversation. How many devices does your application require? We can talk a lot about the theoretical number of simultaneous connections an 802.11a/b/g/n/ac access point (AP) can support, but in reality the number of devices that can hold a reliable connection simultaneously depends on the throughput requirements, which are discussed later. A few motion detector sensors, temperature sensors, security cameras and other smart home devices work great on home Wi-Fi, but when you scale the numbers into the hundreds and thousands in a smart enterprise environment, Wi-Fi may not scale well enough to meet the requirements. As a rule of thumb, you can expect a typical 802.11a/b/g/n/ac access point to serve tens of client devices with individual throughput requirements of less than 1 Mbps. But capacity analysis, and determining how many reliable connections an access point can provide, is a much more complicated discussion. Advanced algorithms might have to be implemented so that client devices turn their radios on and off when necessary, keeping the number of simultaneous connections at any time within the Wi-Fi limits. However, things change when you consider the 802.11ah (HaLow) standard. This Wi-Fi technology operates in the sub 1 GHz band and was developed specifically for the internet of things, to support a large number of devices with low throughput requirements. Theoretically each HaLow access point can support 8,191 client devices, but I haven’t seen products on the market that support more than 250 connections. While this technology has promising features, there are not a lot of products on the market today.

Support for IPv4/IPv6

Devices need to support IPv4/IPv6 to communicate over Wi-Fi. Supporting these protocols might require client devices to have more compute than they would need for their intended tasks. For small payloads, the IP headers can be larger than the actual data, making the system inefficient. Support for IP on devices like mobile phones and laptops is a necessity, without which they can’t transfer the large amounts of data they are designed for. But adding IP support to a motion sensor can result in more overhead than data. The overhead only increases with scale and has to be traded off against having a separate connectivity solution to manage. One other thing to consider is that IPv4 may not be sufficient at scale and the application could require IPv6. IPv4 is supported by all enterprises but IPv6 is still in the adoption phase. So support might have to be considered on the client device as well as on the consumer networks.
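A quick back of the envelope calculation shows how lopsided the overhead can get for small sensor readings. This sketch assumes the reading is sent over UDP and counts only the IP and UDP headers (the 802.11 MAC header and any security overhead come on top of this). The 2 byte payload is a made up example.

```python
IPV4_HEADER = 20  # bytes, minimum IPv4 header without options
IPV6_HEADER = 40  # bytes, fixed IPv6 header
UDP_HEADER = 8    # bytes


def payload_efficiency(payload_bytes: int, ip_header: int) -> float:
    """Fraction of the IP packet that is actual application data."""
    total = payload_bytes + ip_header + UDP_HEADER
    return payload_bytes / total


reading = 2  # hypothetical 2 byte temperature reading

print(f"IPv4/UDP: {payload_efficiency(reading, IPV4_HEADER):.1%} of the packet is data")
print(f"IPv6/UDP: {payload_efficiency(reading, IPV6_HEADER):.1%} of the packet is data")
```

For a laptop sending full size packets the same calculation comes out well above 90%, which is why the overhead only becomes a concern for small, chatty devices.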

Power

802.11 capable devices tend to have shorter battery life than devices built on low power protocols such as ZigBee (which runs on 802.15.4) or LoRa. If client devices can be recharged frequently and can continue to operate while charging, using them on Wi-Fi can be great, but a lot of connected objects like temperature sensors operate on coin cell batteries. Putting low powered devices that are expected to last months or years on Wi-Fi may not be an ideal solution. These devices operate more efficiently on 802.15.4 based protocols.
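A simple duty cycle estimate helps in comparing radio options for a battery powered device. The sketch below is only a first order model: every number in the example is a placeholder I made up, and real figures have to come from the datasheets of the radio and the battery.

```python
def battery_life_days(capacity_mah: float, sleep_ma: float, active_ma: float,
                      active_seconds_per_hour: float) -> float:
    """Estimate battery life from the average current of a duty cycled device."""
    duty = active_seconds_per_hour / 3600
    average_ma = active_ma * duty + sleep_ma * (1 - duty)
    return capacity_mah / average_ma / 24


# Placeholder values for illustration only; take real ones from datasheets.
CAPACITY_MAH = 220   # assumed coin cell capacity
SLEEP_MA = 0.005     # assumed sleep current
AWAKE_SECONDS = 2    # assumed radio-on time per hour

for label, active_ma in (("higher power radio", 100), ("lower power radio", 10)):
    days = battery_life_days(CAPACITY_MAH, SLEEP_MA, active_ma, AWAKE_SECONDS)
    print(f"{label}: roughly {days:.0f} days")
```

The model ignores battery self discharge and peak current limits, both of which matter for coin cells, so treat the result as a comparison aid rather than a prediction.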

Throughput

One of the best attributes of Wi-Fi is the throughput it can offer. This is especially true when compared with other wireless protocols like Bluetooth, ZigBee and LoRa. Wi-Fi offers better throughput at the individual client device level as well as at the aggregate level, although both values fall as the number of connections per AP increases. Throughput requirements must be evaluated along with the scale factor because the two are interdependent. Thousands of RFID tags might require a few Kbps per tag and an aggregate of a few Mbps, but a single inventory management robot that requires higher throughput to continuously scan and transmit data is a better client device to connect over Wi-Fi.
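The interplay between scale and throughput can be sketched with two small checks: one on aggregate demand and one on client count. The per AP limits used here are assumptions for illustration, not measured values, and the function names are mine.

```python
def aggregate_mbps(device_count: int, per_device_kbps: float) -> float:
    """Total offered load in Mbps for a group of identical devices."""
    return device_count * per_device_kbps / 1000


def fits_on_ap(device_count: int, per_device_kbps: float,
               usable_ap_mbps: float, max_clients_per_ap: int) -> bool:
    """True only if both the client count and the aggregate load fit on one AP."""
    within_clients = device_count <= max_clients_per_ap
    within_load = aggregate_mbps(device_count, per_device_kbps) <= usable_ap_mbps
    return within_clients and within_load


# Assumed limits for a single AP; adjust to your own capacity analysis.
USABLE_AP_MBPS = 100
MAX_CLIENTS_PER_AP = 50

print("2000 tags at 2 Kbps:", aggregate_mbps(2000, 2), "Mbps, fits:",
      fits_on_ap(2000, 2, USABLE_AP_MBPS, MAX_CLIENTS_PER_AP))
print("1 robot at 20 Mbps:", aggregate_mbps(1, 20000), "Mbps, fits:",
      fits_on_ap(1, 20000, USABLE_AP_MBPS, MAX_CLIENTS_PER_AP))
```

The tags need very little aggregate throughput but fail on client count, while the robot needs far more throughput and fits easily. That is the sense in which the two factors cannot be evaluated separately.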

To determine whether Wi-Fi is the right connectivity solution for an IoT application, these factors also need to be considered in combination. More throughput means more resource utilization, which translates to more power consumption, which makes it more critical that the client device can be recharged frequently. On the other hand, if the throughput requirements are low, adding IP support can add enough overhead at scale to make the solution impractical. 802.15.4 based connectivity solutions might make more sense in those use cases. There could be a hundred other reasons to choose or not choose Wi-Fi for an IoT application, but these four factors are foundational to determining the suitability of the solution.

A good Wi-Fi design is about getting four things right..!!

Becoming a good Wi-Fi design professional requires extensive knowledge of different aspects of networking and also some areas of project management. The 400+ pages of the CWDP book from CWNP teach you exactly that. From requirement gathering & analysis to post implementation validation, the CWDP curriculum is designed to make you a well rounded design professional. But are there a few nuggets that lead to a good Wi-Fi design? In this blog post, I explain why nailing four fundamentals is the key to achieving this.

Choice of Access Points

Selecting the right access point drives the entire design process. This is easier said than done. So how can we get the first step of the process right? Access points are typically of two types: those with internal omnidirectional antennae and those with connectors for external antennae. Choosing between the two requires a thorough requirement analysis. Is the goal to provide coverage or capacity? The main intent of a coverage design is to provide good signal without taking into consideration how many clients connect. Coverage design works well for guest only networks or other networks where WiFi performance is not critical. Access points with internal omnidirectional antennae are a great fit for this purpose. But WiFi performance is more critical in today’s world than ever. That is where capacity design comes into play. Capacity planning requires an understanding of the number of clients expected on the network, the applications that will be used and the throughput SLA requirements per user or device based on these applications. Ideally it should also leave room for growth in the number of devices. Capacity planner from Andrew is a great resource for determining how many access points are needed to meet the capacity requirements. The type of access point for a capacity design really depends on the number of required access points and the layout. In high density spaces like auditoriums, large conference rooms, lecture halls etc, where the number of client devices per square foot is high, using access points with directional external antennae will be highly beneficial. Office spaces with a well spaced desk layout can work with APs with omnidirectional antennae. But as the scale increases (devices, additional floors), directional antennae may be required to create smaller coverage cells. I have a blog post that explains the need for these antennae in modern enterprises. This should help in choosing the correct type of access point for your deployment.

Location of Access Points

Once the required types of access points are determined for the deployment, the next step is to determine the ideal placement of these access points on the floor plan. This can be done in a couple of ways. One design survey method is AP on a stick. This process involves placing an AP in the actual environment, taking readings with site survey software and determining the correct placement for optimal signal. The Clear to Send blog has very good content on how to perform this type of survey. It is important to note that this is not a scalable way of determining AP locations. The second method, predictive design, is a scalable solution. It uses site survey software like Ekahau Pro or iBWave Wi-Fi to identify the ideal AP placements. Most predictive site survey software comes with default attenuation values for walls and other obstacles in the environment. It is recommended to combine both survey methodologies to make the design more accurate. AP on a stick can be performed in parts of the environment to measure the attenuation of different obstacles, and these values can be entered into the site survey software to improve the accuracy of the predictive models. As a rule of thumb, never place APs in hallways and always try to leverage the walls to reduce cell size, especially when omnidirectional antennae are in use. In capacity designs, AP placement should depend not only on coverage but also on client density. High density spaces will require more access points in closer proximity than other areas. Designs that place APs every ‘x’ feet might miss obstacles that prevent signal penetration. Shotgun implementations, like adding APs wherever people ask for them, can result in over engineering. These are only a few of the many reasons why determining ideal AP placement is the second fundamental one needs to get right to achieve the required performance.

Channel Planning

With continuous improvements to proprietary Radio Resource Management (RRM) protocols, many vendors today recommend using auto channel settings in any environment. This may not necessarily result in optimal performance. Coming up with a good channel plan that reduces adjacent and co-channel interference is an important step in achieving better results. 20 MHz channels are widely recommended in enterprise environments, but each case is unique and needs to be evaluated accordingly. Perhaps there is an area where clients perform file transfers frequently and can benefit from 40 MHz channels. The environment might be close to an airport, which results in frequent channel switching when using default RRM settings. Such environments could benefit from disabling some of the DFS channels. 2.4 GHz signals travel considerably farther than 5 GHz signals, so some 2.4 GHz radios might have to be turned off to reduce interference. Even within 5 GHz, channel 36 will have better range than channel 165, although the difference is small. Some clients may not support all channels. All these factors need to be taken into account in the design phase to deliver more predictable performance. Static channel assignment yields the best results, but performance with RRM and device profiles with appropriate settings does not fall far behind. More than anything, RRM versus static channel assignment is a question of scalability. In any case, coming up with a channel plan manually or using the auto channel assignment options in predictive survey tools will give better insight into what the actual coverage and channel overlap will be post deployment.
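As a small illustration of how the DFS and client support considerations narrow down the channel list, here is a sketch that builds an allowed 5 GHz channel list. It assumes 20 MHz channels in the FCC (US) regulatory domain; other domains have different channel sets, and the function name is my own.

```python
# 5 GHz 20 MHz channels, assuming the FCC (US) regulatory domain.
UNII_1 = [36, 40, 44, 48]
UNII_2A = [52, 56, 60, 64]            # DFS
UNII_2C = list(range(100, 145, 4))    # DFS: 100, 104, ... 144
UNII_3 = [149, 153, 157, 161, 165]

DFS_CHANNELS = set(UNII_2A + UNII_2C)


def channel_plan(exclude_dfs: bool = False, client_unsupported=()) -> list:
    """Return the channels left after removing DFS and unsupported channels."""
    all_channels = UNII_1 + UNII_2A + UNII_2C + UNII_3
    blocked = set(client_unsupported)
    if exclude_dfs:
        blocked |= DFS_CHANNELS
    return [ch for ch in all_channels if ch not in blocked]


print("All channels:", channel_plan())
print("Near an airport, DFS disabled:", channel_plan(exclude_dfs=True))
print("Clients lacking 144 and 165:", channel_plan(client_unsupported=[144, 165]))
```

Disabling every DFS channel leaves very few channels to reuse, which is why the text above suggests disabling only some of them where radar events are actually observed.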

Transmit Power Setting

One of the most frequently overlooked settings is the transmit power on the access point radios. Using default RRM settings can be quite catastrophic in some cases. Especially when access points transmit at high power, the network can face multiple issues in the form of interference, asymmetric uplink/downlink connections, hidden node issues etc. Customizing the Tx power range in the RRM settings can yield the best results without the need to set static power levels on all access points. The ideal maximum Tx power for the APs is equal to the transmit power of the least capable most important device in the environment, and the minimum Tx power is the lowest power at which all APs can still provide the required minimum coverage. Predictive site survey tools give you the ability to simulate coverage at different power levels, which helps in determining the values that need to be configured in RRM to make the best use of it.
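The rule for picking the RRM power range can be written down in a few lines. In this sketch the client transmit powers and the coverage results are invented inputs: in practice the first comes from device datasheets or testing and the second from simulating each power level in a predictive survey tool.

```python
def rrm_tx_power_range(client_tx_dbm: dict, critical_clients: list,
                       coverage_ok_at_dbm: dict) -> tuple:
    """Return (min_tx, max_tx) in dBm for the RRM power range.

    max_tx: transmit power of the least capable critical client.
    min_tx: lowest simulated AP power that still meets the coverage requirement.
    """
    max_tx = min(client_tx_dbm[name] for name in critical_clients)
    min_tx = min(power for power, ok in coverage_ok_at_dbm.items() if ok)
    return min_tx, max_tx


# Hypothetical inputs for illustration.
clients = {"handheld scanner": 12, "voip phone": 14, "laptop": 17}
critical = ["handheld scanner", "voip phone"]
coverage = {8: False, 10: True, 12: True, 14: True}  # AP power -> coverage met?

low, high = rrm_tx_power_range(clients, critical, coverage)
print(f"Configure RRM Tx power between {low} and {high} dBm")
```

If the minimum comes out higher than the maximum, the design likely needs more access points or a different placement rather than more power.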

There are a ton of other requirements for successful planning, implementation and validation of a good WiFi network, but the design is always at the core of it. It is the foundation on which the entire process is built, and getting these four fundamentals right is the key to an optimal design.

A Checklist of Expenses for your WiFi project

Looking to install new WiFi infrastructure or upgrade your current system? Wondering what costs are involved in your project? Here is something that might help. Having worked on multiple WiFi projects ranging from tens of access points (APs) to thousands, I thought it might be a good idea to have a checklist of the costs involved in these projects. To keep things simple, costs can be put into one of two categories: 1. Materials and 2. Time.

Let’s take a look at the materials cost first. This comprises hardware, software and other miscellaneous expenses. At a basic level, it includes the cost of access points and the corresponding licenses. Depending on the choice of vendor solution, a controller (physical or virtual machine) or a subscription (for cloud solutions) will have to be purchased for network management. In general, licenses are sold for 1, 3 and 5 year terms. The latest WiFi products are not expected to be End of Life for 5 years from their release date, but I have seen companies prefer a 3 year refresh cycle to take advantage of the latest protocols. Licenses can be purchased according to the appetite for future upgrades. For some vendor solutions, a separate support contract might have to be purchased for troubleshooting help and RMA purposes. These contracts are available with different SLAs and can be chosen appropriately. The next material expense is cabling. If you already have wireless infrastructure in place, additional cabling might be required for APs that have to be added, or existing cabling might need an upgrade to Cat 6 cables. Another expense is switching infrastructure. If you already have POE+ capable switches with enough available ports and power budget on each one, this may not be required. Additional racks might be required to accommodate new switches. Most access points today require POE+ but there are also some that can fully operate with POE. If buying new switches with these capabilities is not an option, an alternative is to use POE/POE+ injectors. Assessing the existing environment is critical in determining the cabling and switching costs. If the environment primarily consists of a typical grid style drop ceiling, in most cases the mounts included in the access point package should work. Otherwise, additional mounting hardware might have to be purchased. If the environment has areas with a high density of users, the wireless engineer could recommend using access points with external (directional) antennae. It is worth keeping the mounting hardware for these antennae in the checklist as well. Additionally, conduits and electrical boxes might be required for mounting access points on certain ceiling types. NEMA enclosures might be required to protect the access points in outdoor installations. If there are no in-house engineering/cabling/project management resources, consultants might have to be hired. So it is important to keep in mind the travel costs, which may include flights, rental cars, hotel and food expenses for these consultants. When hiring a consulting company, a maintenance contract might also have to be purchased for ongoing support post implementation. To summarize, here is a checklist of material expenses involved in a WiFi project:

  1. Access Points
  2. Controllers
  3. Licenses
  4. Vendor Support contracts
  5. Cables
  6. Switches
  7. Racking for switches
  8. POE/POE+ injectors
  9. Antennae
  10. Mounting Equipment for Access Points & Antennae
  11. Conduits
  12. Electrical boxes
  13. NEMA enclosures
  14. Consultant travel related expenses
  15. Consultancy maintenance agreement

Moving on to the time costs. This category primarily includes expenses on engineering & project management along with cable technicians. Provided the project involves more than a couple of access points, it will need at a minimum a wireless engineer and a cable technician. If you do not have an IT team with resources capable of performing wireless design, implementation and validation, it is recommended to hire consultants for these tasks. Each of these steps is critical to providing better performance. A network engineer might be required to configure the switching and routing aspects of the network but, in a lot of cases, a wireless engineer will have the skills to do these tasks. Cable technicians need to be hired for cabling and installers for access point installation. The resources who do the cabling can usually also install the access points. If there is a business requirement to provide outdoor coverage, a certified electrician might have to be hired to drill through the external walls. A lot of small scale projects (< 50 APs) wouldn’t need a dedicated project manager, but the larger the project gets, the higher the benefit of having project management resources. A systems engineer may also be required for installing/configuring servers for services such as RADIUS, Active Directory, LDAP etc if they don’t already exist in the environment. To summarize the time costs, here is a checklist:

  1. Wireless Engineer
  2. Network Engineer
  3. Cabling Technician
  4. Access Point Installer
  5. Project Manager
  6. Systems Engineer

The estimates for the costs vary depending on a lot of factors including but not limited to choice of vendor, scale of the project, reseller discounts etc. The goal of this blog post was to provide a checklist of expenses rather than an estimate of expenses and I hope it can be of good help for your project.

Keys to Understanding WPA3 – SAE : Diffie-Hellman Key Exchange, Elliptic Curve Cryptography and Dragonfly Key Exchange

WPA3 certification was introduced by the Wi-Fi Alliance in 2018 as a successor to WPA2. It aims to address the vulnerabilities in WPA2 and provide more secure wireless networks. It introduces new concepts like Simultaneous Authentication of Equals (SAE), Dragonfly key exchange, NIST elliptic curve cryptography etc. To make it easier to understand WPA3 as a whole, I will discuss each component individually in detail. WPA3 replaces Pre-Shared Key authentication with Simultaneous Authentication of Equals (SAE) to derive the Pairwise Master Key (PMK), which keeps communication secure even when the password is compromised. To understand how this is achieved, we need to understand how Diffie-Hellman key exchange and elliptic curve cryptography work in conjunction with the Dragonfly key exchange.

Diffie-Hellman key exchange establishes a session key between two entities without ever sending the key itself over a public insecure channel. Let’s use the familiar security names Alice and Bob for the two entities. Alice and Bob agree on two numbers g and p, where p is a prime number. Alice chooses her private key a and Bob chooses b.

Alice calculates g^a mod p and sends it to Bob. Bob calculates g^b mod p and sends it to Alice. This exchange happens over an insecure channel. Alice and Bob then each raise the value they received to the power of their own private key, modulo p.

Alice             <--agree on g and p-->           Bob
g^a mod p            <----Exchange---->           g^b mod p
(g^b mod p)^a mod p    --Derive key--    (g^a mod p)^b mod p

For example, consider a=4, b=3, p=23 and g=5.

Alice                   <--agree on g=5 and p=23-->                   Bob
g^a mod p = 4               <----Exchange---->              g^b mod p = 10
(g^b mod p)^a mod p = 18     --Derive key--     (g^a mod p)^b mod p = 18
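The worked example above can be checked with a few lines of Python. This is only a sketch of textbook Diffie-Hellman with the toy numbers from this post; the variable names are my own, and real deployments use far larger parameters.

```python
# Public parameters that Alice and Bob agree on in the open.
p = 23  # prime modulus
g = 5   # base

# Private keys, never transmitted.
a = 4   # Alice
b = 3   # Bob

# Values exchanged over the insecure channel.
alice_public = pow(g, a, p)  # g^a mod p
bob_public = pow(g, b, p)    # g^b mod p

# Each side raises the value it received to its own private key.
alice_key = pow(bob_public, a, p)  # (g^b mod p)^a mod p
bob_key = pow(alice_public, b, p)  # (g^a mod p)^b mod p

print(alice_public, bob_public, alice_key, bob_key)  # 4 10 18 18
```

The three-argument form of `pow` performs modular exponentiation directly, which stays fast even when a, b and p are hundreds of digits long.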

The strength of the algorithm lies in the fact that (g^b mod p)^a mod p is the same as g^(ba) mod p, and with large values of a, b and p it is computationally infeasible to obtain g^(ba) mod p from the exchanged values without knowing one of the private keys. This is an example of a one-way function, often loosely called a trapdoor function: for a given x it is easy to calculate y = f(x) but very difficult to find x = f^-1(y). The basic concept of the DH exchange is best explained with the well-known paint analogy.

In this analogy g and p are the common paint, a and b are the secret colors and g^(ab) mod p is the common secret derived. This was one of the earliest implementations of the Diffie-Hellman algorithm. The CWSP-206 study guide explains the same concept with a different one-way function.

Here George and Billy agree on using 3 and 5 as their commonly agreed numbers, and the operation they use is exponentiation (raising to a power).

George (3^5 = 243)         ------------         Billy (3^5 = 243)
secret 4, 243^4           <------------>        secret 7, 243^7
(243^7)^4                  ------------         (243^4)^7

Now that we have a good idea of what DH key exchange means, let’s take a look at Elliptic Curve Cryptography (ECC).

Elliptic curves like the one shown in the picture are sets of points satisfying the equation y^2 = x^3 + ax + b. Different curves use variations of this equation. To derive the PMK, WPA2 applies a well-known hash function to the password, whereas in WPA3 the password is mapped onto a point on the curve, which is then used as the generator to derive the PMK. Hashing a password directly can be susceptible to a dictionary attack. That becomes very difficult with generator points on an elliptic curve, because a change in a single character of the password can lead to a different generator point, which in turn results in a totally different PMK.

WPA3 also makes it infeasible to derive the PMK of individual sessions even when the password is compromised. Knowing the password can help the hacker identify the generator point on the elliptic curve, but the integration of Diffie-Hellman with ECC in the Dragonfly key exchange still prevents the hacker from deriving the PMK of an individual session. The trapdoor function in this case is scalar multiplication. According to the elliptic curve discrete logarithm problem, for two points Q and P on the elliptic curve where Q = n·P (n times P), it is computationally infeasible to determine n based on only Q and P.
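To make scalar multiplication concrete, here is a minimal sketch on a tiny textbook curve, y^2 = x^3 + 2x + 2 over the integers modulo 17, with generator (5, 1). The curve, the generator, the secrets and all function names are assumptions chosen for illustration; WPA3 uses NIST curves with far larger parameters and derives the generator from the password. The sketch shows that computing Q = n·P is easy and that two parties multiplying by their own secrets arrive at the same point.

```python
# Toy curve y^2 = x^3 + 2x + 2 over integers modulo 17 (illustrative, NOT secure).
P_MOD = 17
A_COEF = 2
G = (5, 1)  # stand-in for the password-derived generator point


def point_add(p1, p2):
    """Add two curve points; None represents the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # a point plus its mirror image gives infinity
    if p1 == p2:
        # Tangent slope for doubling (pow with -1 needs Python 3.8+).
        slope = (3 * x1 * x1 + A_COEF) * pow(2 * y1, -1, P_MOD)
    else:
        # Chord slope for two distinct points.
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(n, point):
    """Compute n * point with the double-and-add method."""
    result = None
    addend = point
    while n > 0:
        if n & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        n >>= 1
    return result


secret_a, secret_b = 7, 11            # hypothetical private scalars
public_a = scalar_mult(secret_a, G)   # sent in the clear
public_b = scalar_mult(secret_b, G)   # sent in the clear
shared_a = scalar_mult(secret_a, public_b)
shared_b = scalar_mult(secret_b, public_a)
print(shared_a == shared_b)  # True
```

On a curve this small an attacker can simply try every n. The security comes entirely from using curves where the number of points is astronomically large.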

Let’s take a deeper dive into Dragonfly Key Exchange

The client device and access point in this diagram are both configured with a password for authentication. The client device chooses a secret A and the access point chooses a secret B. At this point let's assume the password is already compromised and the hacker knows the generator point for the PMK. The client hashes secret A with the generator point and transmits DH Hash A. The access point follows a similar process with secret B to create DH Hash B and transmits it to the client. Having received DH Hash B, the client hashes it with secret A to derive the PMK, and the access point hashes DH Hash A with its secret B to derive the same PMK, following the DH exchange described earlier. Without knowing secret A or secret B, the hacker will not be able to derive the PMK just from the password.

I hope this helped in understanding the WPA3 – SAE fundamentals. If you are interested in learning more, I recommend the video playlist from Mojo Networks on YouTube, which provides a simplified yet informative explanation of WPA3 concepts. I will be writing another blog post on frame exchanges during WPA3 – SAE authentication in the future.

References:

  1. CWSP-206 Study and Reference Guide from Certitrek
  2. Wikipedia
  3. YouTube playlist on WPA3 Enhancements by Mojo Networks

Time Analysis of 802.1X EAP-TLS and 802.11r !!

Ever wondered how much time an entire EAP-TLS protocol exchange takes? How efficient is 802.11r in minimizing packet loss during the roaming process? You might already know that 802.11r FT over-the-air takes only four frame exchanges between the client and the AP to complete the roaming process. But how long does this process take? This post will answer these questions. 802.1X and 802.11r are complex enough to deserve deep dive blog posts of their own, so I will discuss only some basics to give context to this time analysis.

EAP-TLS is considered one of the most secure frameworks for authentication. The high security comes from the requirement of using client-side certificates and maintaining the Public Key Infrastructure (PKI) that issues and manages those certificates. An overview of the frame exchanges is shown in the picture below.

Once the client device (supplicant) goes through the open system authentication and association process, it sends an EAPOL-Start message. The use of the EAPOL-Start message is optional. The access point (authenticator) sends an EAP Request message asking for the identity of the supplicant. The supplicant can respond with its real identity or a dummy identity depending on the configuration. The authenticator then initiates an Access Request to the authentication server with the identity provided by the supplicant. The authentication server presents the server certificate, which the supplicant validates before presenting the client certificate to the server. The supplicant may or may not choose to validate the server certificate, but validation provides mutual authentication and therefore better security. After the supplicant provides its certificate, the server validates it and sends an access accept or reject message depending on the authenticity of the client certificate. Note that this is only an overview of the process; in reality there are numerous other handshake messages between the supplicant and the authentication server before the final access accept/reject message. The end result of a successful EAP-TLS exchange is a Master Session Key (MSK), which is used to generate the Pairwise Master Key (PMK), which is in turn used to generate session keys through the 4-way handshake for encrypting packets between the client and the access point.

Fast BSS Transition (FT) uses the concept of a key hierarchy to generate multiple keys that help in efficient roaming. It uses a three-level key hierarchy. The MSK from the 802.1X EAP process is used to generate the first-level PMK, which is called PMK-R0. PMK-R0 is used to generate the second-level keys, PMK-R1, each of which generates a Pairwise Transient Key (PTK) used for encryption between the client and an access point. Depending on the WLAN architecture, these keys are stored by different devices. I used Mist Systems infrastructure in this case. The Mist architecture does not contain a centralized controller, so the PMK-R0 derived from the MSK is stored in the access point the device initially connects to. PMK-R1 keys are generated for each access point in the network and transmitted over a secure channel. The following picture shows a summary of where keys are stored.

In summary, for the first connection the client device needs to go through open system authentication, association, the 802.1X EAP process and the 4-way handshake before it can successfully send its first data packet. The process is shown below.

The device is authenticated during the first connection, so when roaming it should not have to go through the entire 802.1X process again to prove its identity. However, it would still have to go through the open system authentication (2 frames), reassociation (2 frames) and 4-way handshake (4 frames) procedures to be able to communicate on the new access point. That would be eight frames, not including ACKs, between the new AP and the client device. FT defines two methods to enable enhanced roaming: Over-the-air Fast BSS Transition and Over-the-DS Fast BSS Transition.

Mist infrastructure employs the over-the-air mechanism by default. With this method, FT effectively combines the 4-way handshake functionality with the open system authentication and reassociation frames, thereby reducing the number of frames by half, from eight to four. The roaming process is shown below:

For this study, I chose an Android mobile device that authenticates and authorizes using the EAP-TLS framework. The authentication server is an ISE instance with an average of 25 milliseconds latency to the AP. To collect data for analysis, I performed seven iterations of roaming tests. Each iteration had one initial connection instance, when the device goes through authentication, association, 802.1X EAP and the 4-way handshake, and at least one roaming instance, when the device goes through authentication and reassociation. The graph presented below shows the overall time from the beginning of authentication to the end of the 4-way handshake. Most iterations took less than one second to complete the connection process. However, there is an outlier which took 1.714 seconds to complete. Attributing the outlier to WAN fluctuations and excluding it from the calculations gives an average time of 905.8405 milliseconds to complete the initial connection process.

Time for Auth + Assoc + 802.1X + 4-way Handshake

I was also interested in looking at how long the 802.1X process alone took. The following shows the time taken from the first EAP-Request frame to the ACK frame of EAP-Success. Excluding the outlier, the average time for the 802.1X EAP process is 878.84313 milliseconds.

Time for 802.1X Process

I was able to produce 14 roaming instances in the seven iterations. I considered the time from the first authentication frame to the ACK of the reassociation response as the time required for handoff between the APs. The average of all instances is 190.512 milliseconds, which is about 21% of the total time taken for the first connection.

Time for Auth+Re-Association
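The percentages behind these averages are simple ratios, sketched below using only the three averages reported in this post. The variable names are mine, and the subtraction assumes the overall and 802.1X averages come from the same set of connection instances.

```python
# Averages reported in this post, in milliseconds (outlier excluded).
initial_connection_ms = 905.8405  # auth + assoc + 802.1X + 4-way handshake
dot1x_ms = 878.84313              # 802.1X EAP process only
roam_ms = 190.512                 # FT roam: auth + reassociation

# Share of the initial connection time spent in each phase.
roam_share = roam_ms / initial_connection_ms
dot1x_share = dot1x_ms / initial_connection_ms

# Time left for everything outside the 802.1X exchange.
non_dot1x_ms = initial_connection_ms - dot1x_ms

print(f"Roam vs initial connection: {roam_share:.0%}")
print(f"802.1X share of initial connection: {dot1x_share:.0%}")
print(f"Auth + assoc + 4-way handshake: {non_dot1x_ms:.1f} ms")
```

Seen this way, almost all of the initial connection time is spent in the 802.1X exchange, which is exactly the part that FT lets a roaming client skip.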

A lot of factors, including the delay between the access point and the authentication server, the number of clients connected to the access points, retransmissions, coverage overlap etc., will impact these numbers. Every application is designed differently to handle traffic. Some might have higher timeout values while others could be very time sensitive. The goal of this study was to get an idea of the time taken for initial connection and roaming handoffs, and to use this data to help determine application resiliency to these events. I hope this helps other engineers in the community as well!

Reference:

  • CWSP Study Guide  by David A Coleman, David A. Westcott and Bryan Harkins

A Framework to Test WiFi Performance of Handheld Devices in Retail Environment

Technology has almost always had a positive impact on everyone's life. Retail stores have been embracing technology not just to improve the customer experience but also to optimize operational efficiency and improve the employee work experience. Handheld devices have played a major role in this regard. Be it for inventory management, to help customers check out items without having to wait in the queue at the register, or to function as digital badges for employees, handheld devices play a prominent role in the retail sector. This blog post provides a framework to set up and perform device testing that can help in evaluating performance on WiFi. However, this piece does not venture into the details of any specific client issues. The post is divided into three sections.

Setting Up the Test Environment

Ideally, any client device testing needs to be performed in the production environment where the device is intended to be used. If the handhelds are meant to be used inside a store, the testing also has to be performed in an actual store. Some handhelds are designed for outdoor applications, and testing for such devices needs to be done outdoors accordingly. This is my personal choice for device performance testing. But if that is not feasible, setting up a lab environment that closely replicates the store environment is critical. Most retail companies have mockup stores to demo emerging technologies, and such a store setup is ideal for a lab. If there are no budget constraints, the entire mockup store can be used as a lab environment. But that is not always the case. Plan to set up the lab environment with at least four access points. If there are multiple vendor solutions across your retail chain, run multiple cables to each location and install multiple APs next to each other (yes, this is bad unless there is only one active network for each iteration of testing). For example, if you have Cisco 2800s in some stores and Mist AP41s in others, you can run two cables to each location and install the APs next to each other. If you would like to test in the Cisco environment, you can shut down the Mist APs and vice versa. The AP placement needs to be identical to that of a store environment. If a mockup store is not available, then setting up APs in the office and replicating the placement in stores would be the last option. In any case, we will require at least four APs.

I will be using the following setup for this blog post purposes.

This is a mockup floor plan with AP placement resembling that of a store in production. Notice the channel planning scheme contains one channel from each of the U-NII-1, U-NII-2, U-NII-2e and U-NII-3 bands in 5 GHz. The benefit of using four access points with this channel scheme is that it provides insight into the client device's operational capabilities in each of these bands. Adjust the power levels so that roaming instances can be produced during the process.

Tools and Applications Required

Now that we have a test environment set up, the next step is to identify what features we want to test in the process. One way to determine the device capabilities is to go through the association request, but a more efficient way is to use the profiler application on the WLAN Pi. Instructions on how to use the application can be found here. This application provides client reports in the following format and lists all the capabilities of the device.

------------------------------------------------------------
Client capabilities report – Client MAC: #MACHere
(OUI manufacturer lookup: #Namehere)
------------------------------------------------------------
802.11k              Supported
802.11r              Supported
802.11v              Supported
802.11w              Not reported
802.11n              Supported (2ss)
802.11ac             Supported (2ss), SU BF supported, MU BF not supported
802.11ax_draft       Not Supported
Max_Power            22 dBm
Min_Power            8 dBm
Supported_Channels   36, 40, 44, 48, 52, 56, 60, 64, 100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140, 144, 149, 153, 157, 161, 165

A packet capture tool like Omnipeek or Commview, capable of capturing packets on multiple adapters, is required. The number of adapters depends on the WLAN. If it is a 5 GHz only network, a set of four adapters can be sufficient. If the WLAN is dual-band, a minimum of seven adapters is required: three adapters capturing on channels 1, 6 and 11, and the remaining four adapters on the channels configured on the APs (36, 64, 100 and 149 in this case). I use Omnipeek with Netgear A6210 adapters.

Multi-ping tools like PingPlotter can be very handy to set up continuous pings to the client devices and save the data for offline analysis. PingPlotter does a great job of representing ping responses on a time axis along with traceroute results.

A mobile cart or shopping cart can help in moving the setup from one location to another.

The last of the tools depends on the type of client device being tested. A lot of handheld devices used in retail run Android, which supports a multitude of WiFi diagnostic apps and logging features. Any diagnostics app that continuously shows RSSI, like the one below, can help in understanding the device performance.

Testing Procedure

After determining the device capabilities from the WLAN Pi profiler, we need to identify the WLAN features currently in use. For example, although the client device supports all of the 802.11k/v/r amendments, not all of them may be enabled on the network. So it is important to identify the list that we can test and strategize the testing process accordingly. Any procedure to evaluate device performance on WLANs needs to be able to determine the effectiveness of the following events at a minimum:

  • Probing
  • Authentication
  • Association
  • 802.1X EAP Exchange (if the WLAN is configured for it)
  • 4-Way Handshake
  • Power save (depending on the support)
  • Roaming – 802.11k/v/r (depending on the support)

While the events from probing to the 4-way handshake are common to any client device, handheld devices need special attention when it comes to power save and roaming behavior. Every manufacturer has different drivers, and these mechanisms determine the device performance on WiFi to a large extent. The process defined below helps in gathering the data required to evaluate this.

  • Start at TEST_AP01. Set up Omnipeek with adapters capturing on all 5 GHz channels assigned to the APs, along with additional adapters on channels 1, 6 and 11 in the case of dual-band WLANs. Do not configure any MAC address based filters for capturing the packets. I recommend using a mobile cart or shopping cart for the laptop setup.
  • Connect the client device to the WLAN and note the IP address. Initiate a continuous ping to the client device from PingPlotter.
  • Turn off WiFi on client device and initiate packet captures on AP ethernet ports.
  • Make sure the adapters are about four to five feet away from the client device to avoid any sideband effects on the capture. Start capturing packets in Omnipeek and turn on the WiFi adapter on the client device.
  • Confirm the device received the same IP address as before and that the continuous ping in PingPlotter is working.
  • Follow a path from one AP to the other similar to the one shown below. Open the diagnostics app if available and keep notes on RSSI values especially when a roam is observed.
  • Once you are back at the TEST_AP01 location, stop the continuous ping in PingPlotter and all the captures in Omnipeek and on the AP ethernet interfaces. Export the files with any naming convention of your choice.
  • Repeat the same process without a continuous ping for iteration 2. This iteration is to collect data when the device does not have active traffic.

At the end of both iterations, we have gathered most of the data required for analysis, which can help in determining the client device behavior in multiple aspects like:

  • Probing behavior of client device (sequential, preference of non DFS vs DFS, selective when using 802.11k)
  • Association, authentication, reassociation and 4-way handshake
  • 802.1X process from captures on Omnipeek along with AP ethernet interfaces
  • Band preference (2.4 GHz vs 5 GHz)
  • Roaming performance between channels belonging to U-NII-1, U-NII-2 , U-NII-2e and U-NII-3 bands
  • Device power save behavior (with traffic in iteration 1 vs without traffic in iteration 2)
  • RSSI thresholds at which roaming occurred from the diagnostics tool. In the absence of such tools, RSSI values on Omnipeek can help in estimating the threshold to initiate probing to roam.
  • Ping losses and timestamps from PingPlotter to correlate with roaming instances or to detect power save issues.
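As a rough illustration of that last correlation step, the sketch below matches ping-loss timestamps against roam timestamps. All of the data is hypothetical, and the function name, the one-second window and the input format are my own assumptions. In practice the timestamps would come from the PingPlotter export and from reassociation frames in the packet captures.

```python
from bisect import bisect_left, bisect_right


def correlate_losses(loss_times, roam_times, window_s=1.0):
    """Group ping losses that fall within window_s seconds of a roam.

    Returns (near_roams, unexplained): a dict mapping each roam time to
    the losses close to it, and a list of losses not close to any roam.
    """
    losses = sorted(loss_times)
    near_roams = {}
    matched = set()
    for roam in sorted(roam_times):
        lo = bisect_left(losses, roam - window_s)
        hi = bisect_right(losses, roam + window_s)
        near_roams[roam] = losses[lo:hi]
        matched.update(losses[lo:hi])
    unexplained = [t for t in losses if t not in matched]
    return near_roams, unexplained


# Hypothetical timestamps in seconds since the start of the walk.
ping_losses = [12.4, 12.6, 47.9, 80.2]
roams = [12.5, 48.0]

near, unexplained = correlate_losses(ping_losses, roams)
print(near)         # {12.5: [12.4, 12.6], 48.0: [47.9]}
print(unexplained)  # [80.2]
```

Losses that do not line up with any roam, like the one at 80.2 seconds here, are the ones worth checking against power save behavior in the captures.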

The tools and guidelines provided here are intended to help evaluate device performance in the best way possible with a minimum of equipment. Having more access points and more adapters for packet captures will certainly help in gathering more data, which will be very useful for analysis.

The Case for Directional Antenna in Modern Enterprises

Every enterprise going through a real estate or technology refresh is thinking about ways to enhance the work experience and stimulate the creativity and collaboration of its employees by modernizing its office spaces. One of the key components of a modern enterprise space is ubiquitous wireless connectivity that is reliable in every corner of the office. All-wireless office spaces, in other words spaces with zero visible ethernet cables, have gained momentum in recent years, and some of the implementations I worked on taught me the advantages of deploying directional antennae in place of APs with traditional omnidirectional antennae. In this blog, I will cover a few benefits that could justify the additional costs involved in deploying APs with external directional antennae.

Most modern enterprises have an open office design, which is bad for WiFi in terms of co-channel interference. Without walls, it is not possible to contain the signal from omnidirectional antennae, and the client device tends to stick to the associated AP longer than is ideal. The following is the coverage from a Mist AP41 with integrated omnidirectional antennae with EIRP at 14 dBm.

The following is the coverage from a Mist AP41E with AccelTex antennae (ATS-OHDP-245-46-4) having the same EIRP.

The use of directional antennae restricts the signal from propagating farther and creates smaller coverage cells, which are critical for high performance in an open office environment. Smaller cells help reduce co-channel interference, especially in large open high-density offices where the number of APs exceeds 25 (assuming you are using all 25 channels in 5 GHz). If voice applications are used on the wireless network, we need to avoid using some of the DFS channels, which further decreases the flexibility in channel assignment.

Optimal Roaming: One of the top items on the wish list of every Wi-Fi engineer is control over client device roaming. Although directional antennae do not provide full control over roaming, the smaller cells encourage client devices to stay connected to the closest AP more often than omnidirectional APs do.

Flexibility with channel width assignment: Some teams require frequent file transfers, and the throughput achieved on 20 MHz channels is in most cases less than desirable. In such areas, 40 MHz channels can be configured without having to worry as much about channel reuse patterns.

Aesthetics: Every building architect wants a WiFi network with the highest performance from invisible APs (installed above the ceiling, of course). I have always found concealed antennae from vendors like AccelTex and Ventev to be a great solution: place the APs above the ceiling and install the aesthetically pleasing antennae below the ceiling.

Most modern enterprises need to be treated as high density environments. Granted, they are not as dense as an arena or stadium, but taking into consideration the business criticality of the applications, the reliability that needs to be delivered and the increased density of user devices, deploying directional antennae can yield the best results. Depending on the number of users and the per-user throughput SLA requirement, directional antennae with appropriate beam widths can be chosen to come up with an optimal coverage and capacity design.

Hands-On Deep Dive into Opportunistic Key Caching

Opportunistic key caching (OKC) is a fast secure roaming technique that relies on sharing the Pairwise Master Key (PMK) across access points that are under the same administrative control. After a client authenticates to an access point and derives a PMK, the access point shares this PMK along with a PMKID with other access points. The protocols used to share this information between access points are often proprietary. The PMKID is the result of a hash function applied to the PMK, the client MAC address and the authenticator address. The PMKID allows the creation of unique security associations between the devices.

PMKID
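As a sketch of that hash, the snippet below computes a PMKID the way IEEE 802.11 defines it for SHA-1 based key management: the first 128 bits of HMAC-SHA1 keyed with the PMK over the string "PMK Name", the authenticator address and the supplicant address. The PMK value is a made-up placeholder and the helper names are mine; the MAC addresses are the ones from the demonstration in this post.

```python
import hashlib
import hmac


def mac_to_bytes(mac):
    """Convert a MAC string such as '0028:f8ab:cb51' to 6 raw bytes."""
    cleaned = mac.replace(":", "").replace("-", "").replace(".", "")
    return bytes.fromhex(cleaned)


def pmkid(pmk, authenticator_mac, supplicant_mac):
    """PMKID = first 16 bytes of HMAC-SHA1(PMK, 'PMK Name' || AA || SPA)."""
    data = b"PMK Name" + mac_to_bytes(authenticator_mac) + mac_to_bytes(supplicant_mac)
    return hmac.new(pmk, data, hashlib.sha1).digest()[:16]


pmk = bytes(range(32))     # placeholder 256-bit PMK, NOT a real key
client = "0028:f8ab:cb51"  # client MAC from the demonstration
ap1 = "c413:e23d:40e5"     # BSSID of AP1
ap2 = "c413:e23d:8965"     # BSSID of AP2

pmkid_1 = pmkid(pmk, ap1, client)
pmkid_2 = pmkid(pmk, ap2, client)
print(pmkid_1.hex())
print(pmkid_2.hex())
```

Because the authenticator address is part of the input, the same PMK yields a different PMKID for each AP. This is why a roaming client can compute the PMKID for a new AP on its own, as long as that AP holds the same PMK.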

In this demonstration, the client device (a Windows 10 machine) roams from AP1 to AP2. Both access points are from Aerohive and are placed to encourage client roaming. The MAC address of the client device is 0028:f8ab:cb51, the authenticator address (BSSID) of AP1 is c413:e23d:40e5 and that of AP2 is c413:e23d:8965. The following is a step-by-step procedure demonstrating the process of roaming using OKC.

Step 1. The client connects to AP1 and uses the full 802.1X/EAP process to derive a PMK and PMKID #1.

Init AP1

Step 2: AP1 communicates this information to AP2 over LAN using proprietary protocols.

Init AP2

Notice the hop count is 0 on AP1 and 1 on AP2 because the device is initially connected to AP1.

Step 3: When the client device moves away from AP1 and closer to AP2, the client device calculates a new PMKID #2 using the PMK along with AP2's address and the client MAC address. This information is sent in the reassociation request packet.

Roaming process

The PMKID #2 can be found under the RSN information tag of the reassociation request packet.

RSN tag

Step 4: AP2 calculates PMKID #2 on its own from the cached PMK, its own address and the client MAC address, and compares it with the PMKID #2 received in the reassociation request. If the two match, reauthentication is not required and AP2 sends a success code in the reassociation response. At this stage, AP2 has the new PMKID #2 and the PMK, which allows for a unique security association.

Post AP12

Step 5: The encryption keys are generated through the 4-way handshake after the reassociation process, and the client device sends a disassociation frame to AP1.

This procedure is summarized in the following picture.

OKC Roaming

OKC eliminates the need for the 802.1X/EAP process when roaming, resulting in a faster handoff between the access points. Time analysis of this demonstration indicated that it took only 2.96 milliseconds after the reassociation response to generate the keys, while the initial authentication to AP1 using the entire 802.1X/EAP process took about 93.87 milliseconds to generate keys after the association phase.

Resources:

  1. CWSP Study Guide  by David A Coleman, David A. Westcott and Bryan Harkins
  2. Packet captures available for download here.