nxAccess CME Pattern Matcher Latency Report

This report details the latency of the Enyx nxAccess CME pattern matcher functionality.
To demonstrate the capabilities of the nxAccess solution, it compares the latency profile of the solution when triggering orders from the normalized feed handler output with its latency profile when triggering orders from the pattern matcher, under identical conditions and with the same tick-to-cancel algorithm.
These latency numbers were obtained using our standard nxAccess hardware reference design and software demonstration application. We invite our customers to replicate these tests in their labs to confirm our results.

Setup Configuration

Schematics

_images/nxaccess_diagram.png

This testing configuration uses two different servers:

  • A host server:
    • Hosting an Enyx nxAccess FPB2 board running a tick-to-cancel reference algorithm.
    • Running a software test application responsible for configuring and monitoring the FPGA during the test.
  • A replay and capture server:
    • Replaying market data captures at defined speeds.
    • Running a TCP server which the nxAccess execution engine will connect to.
    • Capturing the timestamped raw market data and TCP execution traffic forwarded by the 7130 timestamping device.

The layer 1 switch used in the testing setup forwards both the raw market data feed and the TCP segments to a timestamping device for precise timestamping, and also to the capture server for storage and post-analysis.
The timestamped data is used for the latency measurements and also to compute the effective rate at which the replay server was able to replay the raw market data into the system under test.
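The effective replay rate mentioned above can be derived from packet timestamps alone. The following is a minimal sketch, assuming fixed (non-sliding) windows aligned on the first packet; the report does not state how its windows are aligned, and the function name `peak_packet_rate` is hypothetical.

```python
from collections import Counter


def peak_packet_rate(timestamps_ns, window_ns):
    """Return the peak packet rate in pkt/s over fixed windows of window_ns."""
    if not timestamps_ns:
        return 0.0
    start = min(timestamps_ns)
    # Count the packets that fall into each fixed-size window.
    # Assumption: windows are aligned on the first packet of the capture.
    counts = Counter((t - start) // window_ns for t in timestamps_ns)
    # Scale the busiest window's packet count to a per-second rate.
    return max(counts.values()) * 1e9 / window_ns


# Three packets within the first millisecond, one packet 5 ms later.
example_ns = [0, 100_000, 200_000, 5_000_000]
print(peak_packet_rate(example_ns, 1_000_000))
```

Running the same function with 1 s, 100 ms, 10 ms and 1 ms windows on the original and on the replayed capture gives the pairs of rates from which a replay ratio can be computed.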

Note

All the latency numbers presented in this performance report are Start-Of-Packet (SOP) to Start-Of-Packet (SOP) measurements performed on the switch without any adjustment to account for the content of the data or the length of fibers and devices on the datapath.

Host Server Characteristics

CPU 1 model & version: Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz
CPU 1 PCIe devices: N/A
CPU 2 model & version: Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz
CPU 2 PCIe devices: Enyx FPB2

Replay & Capture Server Characteristics

CPU 1 model & version: Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz
CPU 1 PCIe devices: Solarflare SFC9220 10G Adapter, Solarflare SFC9220 10G Adapter
CPU 2 model & version: Intel(R) Xeon(R) Gold 6128 CPU @ 3.40GHz
CPU 2 PCIe devices: N/A

Layer 1 Device Model & Version

Model: Arista 7130-48KB
Version: MOS 0.17.4
Port Speed: 10Gb
Port to port advertised latency: 5 ns

Layer 2 Device Model & Version

Model: Arista 7150s
Version: EOS-4.14.6M
Port Speed: 1Gb/10Gb
Timestamping resolution: 2.857 nanoseconds
Timestamping trigger: First Byte of the FCS

Enyx Solution Characteristics

FPGA card version: Enyx FPB2
Firmware version: nxAccess CME Rev 39140
Software version: libenyxmd 6.2.0, libenyxoe 5.3.0
Driver version: hfp 2.9.5
Thread Binding: CPU ID 6-11 (NUMA 1)
Card NUMA node: 1

Capture Characteristics

Capture: CME-Globex-MDP3_2022-01-24_5-minutes
Packet Count: 2 357 415
Channel Count: 3
Capture Duration: 0:04:39
Beginning Date: Mon, 24 Jan 2022 09:01:06
End Date: Mon, 24 Jan 2022 09:05:45
Timestamping resolution: Microseconds
Packet Rate Peak (1 s): 19 514 pkt/s
Packet Rate Peak (100 ms): 47 370 pkt/s
Packet Rate Peak (10 ms): 106 800 pkt/s
Packet Rate Peak (1 ms): 235 000 pkt/s
Bit Rate Peak (1 s): 46.16 Mbps
Bit Rate Peak (100 ms): 161.77 Mbps
Bit Rate Peak (10 ms): 398.01 Mbps
Bit Rate Peak (1 ms): 616.27 Mbps

_images/CME-Globex-MDP3_2022-01-24_5-minutes_global_Packet_Rate.png
_images/CME-Globex-MDP3_2022-01-24_5-minutes_global_Bit_Rate.png

Test Overview

This performance report outlines two otherwise identical test scenarios, with and without the new nxAccess pattern matcher functionality enabled.
The algorithm running on the nxAccess FPB2 board is Enyx’s nxAccess HDL reference design, which compares the price of a trade summary message with the last top-of-book update for a given instrument. If the difference is higher than a set threshold, the hardware algorithm triggers a cancel order previously loaded by the software application.
To highlight the performance edge that the new pattern matcher functionality can provide, we ran two identical tests: one where the hardware algorithm uses the feed handler to listen to the trade summary updates, and another where it uses the pattern matcher to listen to the trade summary updates. In both cases, the hardware algorithm received top-of-book updates from the feed handler.
To run this test, we used the default software demonstration application, enyx-nxaccess-hld-end-to-end-demo, with the configuration files provided in the Configuration files section below, following the detailed tutorial available in the nxAccess User Manual.
  • Scenarios:
    • Trade summary updates from the feed handler
    • Trade summary updates from the pattern matcher
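
The tick-to-cancel decision described above can be illustrated in software. This is only a sketch under stated assumptions, not the HDL reference design: the class and method names are hypothetical, prices are plain integers, a single top-of-book price is tracked per instrument (the report does not say which book side is used), and the difference is taken as an absolute value.

```python
from dataclasses import dataclass, field


@dataclass
class TickToCancel:
    price_threshold: int  # assumed to be in the same units as feed prices
    last_top_of_book: dict = field(default_factory=dict)  # instrument -> price

    def on_top_of_book(self, instrument: str, price: int) -> None:
        # Remember the latest top-of-book price for this instrument.
        self.last_top_of_book[instrument] = price

    def on_trade_summary(self, instrument: str, trade_price: int) -> bool:
        """Return True when the preloaded cancel order should be sent."""
        reference = self.last_top_of_book.get(instrument)
        if reference is None:
            return False  # no top-of-book update seen yet
        # Assumption: absolute difference, strictly higher than the threshold.
        return abs(trade_price - reference) > self.price_threshold


algo = TickToCancel(price_threshold=0)
algo.on_top_of_book("MESH2", 100)
print(algo.on_trade_summary("MESH2", 101))
```

In this sketch, a threshold of 0 makes any trade at a price different from the last top-of-book price trigger a cancel.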

Results preview

While this latency report includes a lot of details about the performance of the nxAccess solution under test, here is a preview of the results:

Scenario              Min     25%     50%     Mean    90%     99%       Max
Feed Handler          567 ns  574 ns  578 ns  624 ns  670 ns  1 739 ns  1 871 ns
Pattern Matcher       344 ns  352 ns  355 ns  389 ns  383 ns  1 191 ns  1 314 ns
Latency Improvements  223 ns  222 ns  223 ns  235 ns  287 ns  548 ns    557 ns

As highlighted by the table above, by simply using the nxAccess pattern matcher to process the CME trade summary messages, the tick-to-cancel algorithm reduced latency by 223 nanoseconds at the minimum and by 557 nanoseconds at the maximum.
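
The improvements in the preview table are simple per-column differences between the two scenarios. The short sketch below recomputes them from the published figures and also prints the relative reduction; variable names are illustrative only.

```python
labels = ["Min", "25%", "50%", "Mean", "90%", "99%", "Max"]
feed_handler_ns = [567, 574, 578, 624, 670, 1739, 1871]
pattern_matcher_ns = [344, 352, 355, 389, 383, 1191, 1314]

# Improvement = feed handler latency minus pattern matcher latency.
improvement_ns = [fh - pm for fh, pm in zip(feed_handler_ns, pattern_matcher_ns)]

for label, fh, gain in zip(labels, feed_handler_ns, improvement_ns):
    # Relative reduction with respect to the feed handler scenario.
    print(f"{label:>4}: {gain} ns ({100 * gain / fh:.1f} % lower)")
```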

Note

As explained in the Setup Configuration section above, all the latency numbers presented in this report are SOP-to-SOP measurements, without any adjustment to account for the offset of the trading signal in the input market data packet. As a result, the measured latency of the nxAccess tick-to-cancel algorithm is directly correlated to the offset of the CME trade summary message in the raw market data packet. The reference market data capture contains MTU-sized packets in which trade summary messages are received after 1400 bytes, which take 1.12 µs to serialize over a 10 Gb/s link.
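
The serialization figure quoted in this note follows directly from the link speed. A minimal worked example, with a hypothetical helper name:

```python
def serialization_delay_ns(byte_count: int, link_bits_per_second: float) -> float:
    """Time needed to put byte_count bytes on the wire, in nanoseconds."""
    return byte_count * 8 * 1e9 / link_bits_per_second


# 1400 bytes ahead of the trade summary message on a 10 Gb/s link.
delay_ns = serialization_delay_ns(1400, 10e9)
print(f"{delay_ns:.0f} ns")  # 1120 ns, i.e. 1.12 µs
```

Because the measurements are SOP-to-SOP, this offset is included in the reported latency whenever the triggering message sits that deep in the packet.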

Configuration files

The following configuration files were used for both test scenarios:

enyx-nxaccess-hld-end-to-end-demo configuration

[order_entry]

[order_entry.interface]
ip = "192.168.65.3"
netmask = "255.255.255.0"
gateway = "0.0.0.0"
vlan = 2

[order_entry.remote]
endpoint = "192.168.65.2:16000"

[market_data]
# streams channels and endpoints description file. Use Enyx market default if empty
supportDescriptionPath = "/usr/share/libenyxmd/exchanges/CME-GLOBEX-MDP3/SupportDescription.xml"

# XML market instrument list to subscribe.
subscriptionPath = "/HDL_demo/Subscribe.xml"

# market data formats files. Use Enyx market default if empty
listingFiles = ['/media/captures/Captures/CME_MDP3/20220124-090002/secdef.dat']

# Lane selection ("A", "B", "AB" or "BA")
lanes = "A"

# Lane type (PRODUCTION/CERTIFICATION/NEW_RELEASE_CERTIFICATION)
laneType = "PRODUCTION"

# Feed input data source type: Pcap / Socket / Hardware
inputType = "Hardware"

[market_data.hardware]
# Suffix appended to nxnet0-SFP1 / SFP2. Used for vlans.
firstInputInterfaceSuffix = ""
secondInputInterfaceSuffix = ""

[market_data.publish]
oders = false
books = true
others = true
uds = false
depth = 1

[strategy]
verbose = true
trigger_once = false
use_trade_summary = true
trigger_type = "Hardware"

[strategy.default]
instrument_id = 0
collections = [1]

[strategy.0]
price_threshold = 0
enabled = true

Subscription file

<?xml version="1.0" encoding="utf-8"?>
<Subscription xmlns="http://enyx.fr/md/subscription" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
     <Exchange name="CME">
        <Instrument name="MESH2"/>
        <Instrument name="ESH2"/>
        <Instrument name="MNQH2"/>
        <Instrument name="NQH2"/>
        <Instrument name="RTYH2"/>
        <Instrument name="M2KH2"/>
        <Instrument name="GEZ4"/>
        <Instrument name="GEH4"/>
        <Instrument name="GEU4"/>
        <Instrument name="GEM4"/>
        <Instrument name="GEH5"/>
        <Instrument name="GEM5"/>
        <Instrument name="GEU5"/>
        <Instrument name="GEZ5"/>
        <Instrument name="GEZ3"/>
        <Instrument name="GEU3"/>
        <Instrument name="GEH6"/>
        <Instrument name="BTCF2"/>
        <Instrument name="IBVG2"/>
        <Instrument name="GEH3"/>
        <Instrument name="GEM6"/>
        <Instrument name="GEM3"/>
        <Instrument name="XAYH2"/>
        <Instrument name="MBTF2"/>
        <Instrument name="XAKH2"/>
        <Instrument name="NIYH2"/>
        <Instrument name="GEZ2"/>
        <Instrument name="EMDH2"/>
        <Instrument name="NKDH2"/>
        <Instrument name="GEU6"/>
        <Instrument name="ETHF2"/>
        <Instrument name="ESM2"/>
        <Instrument name="METF2"/>
        <Instrument name="NKDH2-NIYH2"/>
        <Instrument name="GEU2"/>
        <Instrument name="SR3Z3"/>
        <Instrument name="XAIH2"/>
    </Exchange>
</Subscription>
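
A subscription file of this shape can be checked with a few lines of standard-library Python, for example to confirm the instrument count before a run. This is a sketch for illustration: the helper name is hypothetical, and the embedded sample is shortened to two instruments.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the <Subscription> element.
NS = {"md": "http://enyx.fr/md/subscription"}


def subscribed_instruments(xml_text):
    """Return the instrument names listed in a subscription document."""
    # Parse from bytes so the XML encoding declaration is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [inst.get("name") for inst in root.findall("md:Exchange/md:Instrument", NS)]


sample = """<?xml version="1.0" encoding="utf-8"?>
<Subscription xmlns="http://enyx.fr/md/subscription">
    <Exchange name="CME">
        <Instrument name="MESH2"/>
        <Instrument name="ESH2"/>
    </Exchange>
</Subscription>"""

names = subscribed_instruments(sample)
print(len(names), names)
```

Applied to the full subscription file above, the same function should return 37 names, matching the Instrument Count given in the scenario conditions.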

Scenario #1 : Feed handler updates - 1X Market Rate

Conditions

Instrument Count: 37
Subscribed Channels: 310, 312 and 318 (3 channels)
Trading Instrument Count: 1 (CME/MESH2)
Lane Arbitration: A only
Book Builder Configuration: Delta Updates
Output Book Depth: 5
Requested Replay Rate: 1

Observed Replay Rates

_images/Latency_nxAccess-CME-to-OUCH_Replay-1x_Feed-Handler_Packet_Rate.png
Type                                    Rate           Replay Ratio
Average Packet Rate (1s resolution)     8 246 pkt/s    0.98
Average Packet Rate (100ms resolution)  9 230 pkt/s    0.97
Average Packet Rate (10ms resolution)   13 281 pkt/s   0.94
Average Packet Rate (1ms resolution)    52 989 pkt/s   2.00
Peak Packet Rate (1s resolution)        19 233 pkt/s   0.99
Peak Packet Rate (100ms resolution)     41 600 pkt/s   0.88
Peak Packet Rate (10ms resolution)      146 200 pkt/s  1.37
Peak Packet Rate (1ms resolution)       618 000 pkt/s  2.63
_images/Latency_nxAccess-CME-to-OUCH_Replay-1x_Feed-Handler_Bit_Rate.png
Type                                 Rate           Replay Ratio
Average Bit Rate (1s resolution)     16.32 Mbps     0.99
Average Bit Rate (100ms resolution)  18.76 Mbps     0.98
Average Bit Rate (10ms resolution)   28.29 Mbps     0.94
Average Bit Rate (1ms resolution)    112.39 Mbps    2.01
Peak Bit Rate (1s resolution)        44.67 Mbps     0.97
Peak Bit Rate (100ms resolution)     142.87 Mbps    0.88
Peak Bit Rate (10ms resolution)      487.1 Mbps     1.22
Peak Bit Rate (1ms resolution)       2 641.14 Mbps  4.29

Results

As shown in the replay characteristics above, the replay rate was close to the original market rate overall, with burst rates 2 to 4 times higher on the 1 ms windows (replay ratios of 2.00 to 4.29).
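
For the peak rows, the replay ratios are consistent with dividing the observed rate by the corresponding peak from the Capture Characteristics section. The sketch below checks this for the 1 ms peaks of this scenario; the dictionary keys are illustrative names, not tool output.

```python
# 1 ms peaks of the original capture (Capture Characteristics section).
capture_peak_1ms = {"packet_rate_pps": 235_000, "bit_rate_mbps": 616.27}
# 1 ms peaks observed during the replay (tables above).
observed_peak_1ms = {"packet_rate_pps": 618_000, "bit_rate_mbps": 2_641.14}

# Replay ratio = observed rate / original capture rate.
ratios = {
    key: round(observed_peak_1ms[key] / capture_peak_1ms[key], 2)
    for key in capture_peak_1ms
}
print(ratios)  # {'packet_rate_pps': 2.63, 'bit_rate_mbps': 4.29}
```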

Latency Profile

Min     25%     50%     Mean    90%     99%       99.9%     99.99%    99.999%   Max       Sample Count
567 ns  574 ns  578 ns  624 ns  670 ns  1 739 ns  1 871 ns  1 871 ns  1 871 ns  1 871 ns  399
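
With only 399 samples, the 99.9th percentile and above coincide with the maximum in this table. One way to see why, assuming a nearest-rank percentile definition (the report does not state which method was used), is to compute which sample each percentile selects:

```python
import math


def nearest_rank(percentile: float, sample_count: int) -> int:
    """1-based rank of the sorted sample selected by the nearest-rank method."""
    return max(1, math.ceil(percentile / 100 * sample_count))


for p in (50, 99, 99.9, 99.99, 99.999):
    # A rank equal to the sample count means the percentile is the maximum.
    print(f"{p}% -> sample {nearest_rank(p, 399)} of 399")
```

Under this assumption, every percentile from 99.9% upward selects sample 399 of 399, which is the maximum, while the 99% value is the 396th sample.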

Sample Distribution

Note

Hovering over the dynamic graph above provides more details about the number of occurrences for each latency value, as well as the minimum, average and maximum input market data packet size observed for each sample.

Latency Distribution Over Time

Note

Hovering over the dynamic graph above provides more details about the market data sequence numbers of the maximum and minimum latencies observed, as well as the number of samples observed at this latency for a given time slice.

General Statistics

Input Raw Packet Processed Count: 2 357 298
Input FIFO maximal usage: 2.12 %
Input Raw Market Data Packet Drop: 0
Output Total Order Count: 399
Output Order Count on TCP Stack 1: 399
Output Order Count on TCP Stack 2: 0

Scenario #2 : Pattern matcher updates - 1X Market Rate

Conditions

Instrument Count: 37
Subscribed Channels: 310, 312 and 318 (3 channels)
Trading Instrument Count: 1 (CME/MESH2)
Lane Arbitration: A only
Book Builder Configuration: Delta Updates
Output Book Depth: 5
Requested Replay Rate: 1

Observed Replay Rates

_images/Latency_nxAccess-CME-to-OUCH_Replay-1x_Pattern-Matcher_Packet_Rate.png
Type                                    Rate           Replay Ratio
Average Packet Rate (1s resolution)     8 234 pkt/s    0.97
Average Packet Rate (100ms resolution)  9 204 pkt/s    0.97
Average Packet Rate (10ms resolution)   13 251 pkt/s   0.94
Average Packet Rate (1ms resolution)    53 508 pkt/s   2.02
Peak Packet Rate (1s resolution)        19 116 pkt/s   0.98
Peak Packet Rate (100ms resolution)     43 210 pkt/s   0.91
Peak Packet Rate (10ms resolution)      141 200 pkt/s  1.32
Peak Packet Rate (1ms resolution)       701 000 pkt/s  2.98
_images/Latency_nxAccess-CME-to-OUCH_Replay-1x_Pattern-Matcher_Bit_Rate.png
Type                                 Rate           Replay Ratio
Average Bit Rate (1s resolution)     16.28 Mbps     0.99
Average Bit Rate (100ms resolution)  18.7 Mbps      0.98
Average Bit Rate (10ms resolution)   28.31 Mbps     0.94
Average Bit Rate (1ms resolution)    113.99 Mbps    2.04
Peak Bit Rate (1s resolution)        44.12 Mbps     0.96
Peak Bit Rate (100ms resolution)     144.92 Mbps    0.90
Peak Bit Rate (10ms resolution)      507.39 Mbps    1.27
Peak Bit Rate (1ms resolution)       2 735.26 Mbps  4.44

Results

As shown in the replay characteristics above, the replay rate was close to the original market rate overall, with burst rates 2 to 4 times higher on the 1 ms windows (replay ratios of 2.02 to 4.44).

Latency Profile

Min     25%     50%     Mean    90%     99%       99.9%     99.99%    99.999%   Max       Sample Count
344 ns  352 ns  355 ns  389 ns  383 ns  1 191 ns  1 314 ns  1 314 ns  1 314 ns  1 314 ns  399

Sample Distribution

Note

Hovering over the dynamic graph above provides more details about the number of occurrences for each latency value, as well as the minimum, average and maximum input market data packet size observed for each sample.

Latency Distribution Over Time (50 percentile)

Note

Hovering over the dynamic graph above provides more details about the market data sequence numbers of the maximum and minimum latencies observed, as well as the number of samples observed at this latency for a given time slice.

General Statistics

Input Raw Packet Processed Count: 2 357 298
Input FIFO maximal usage: 2.37 %
Input Raw Market Data Packet Drop: 0
Output Total Order Count: 399
Output Order Count on TCP Stack 1: 399
Output Order Count on TCP Stack 2: 0