CME Real-Time Market Data Feed
Get quick insights with Google Cloud, powered by CME Group real-time data
Risk management requires insight
Market uncertainty poses risk to any trading organization's assets and operations. Organizations that analyze, quantify and mitigate their risks find themselves better prepared for the future. Real-time market data feeds are a key component of risk management strategies, but collecting price data for an array of different instruments can be complex and costly to operate.
The CME Smart Stream real-time market data feed, delivered through Google Cloud, enables organizations to consume this data efficiently and economically. In this demo, we show how it can help risk managers, data scientists, and infrastructure engineers operate more effectively.
[Interactive widget: live futures quote cards (CONTRACT, BID, ASK, UPDATED), shown as "Market closed" outside trading hours.]
The market data series on the Google Cloud blog goes in-depth on real-time market data visualization and creating a serverless market data pipeline.
About the data
In contrast to traditional market data distribution architectures, CME Smart Stream allows you to be much more selective about the subset of market information your application processes. Applications often discard large numbers of messages just to extract the prices of the instruments they do follow. CME Smart Stream, however, broadcasts each change over Google Cloud's Pub/Sub messaging service and assigns each product, such as corn or silver futures, to its own topic. Consumers can subscribe only to the symbols of interest, driving down transport, ingest, processing, compute, and storage costs.
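To illustrate the topic-per-product model, here is a minimal Python sketch of selecting only the feeds you follow. The `smartstream-<product>` naming scheme and the product codes are placeholders for illustration, not the actual CME topic catalog.

```python
def topic_path(project: str, product_code: str) -> str:
    """Fully qualified Pub/Sub topic path for one product's price feed.

    The "smartstream-<product>" naming scheme is hypothetical.
    """
    return f"projects/{project}/topics/smartstream-{product_code.lower()}"


def topics_of_interest(project: str, product_codes) -> list:
    """Build subscriptions only for the products you follow,
    e.g. corn ('ZC') and silver ('SI') futures."""
    return [topic_path(project, code) for code in sorted(product_codes)]
```

A consumer would then attach a Pub/Sub subscription to each of these topics rather than filtering a firehose of every product's messages.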
Architecture
The real-time price chart's architecture is simple: a Google Kubernetes Engine (GKE) deployment consumes Pub/Sub messages and rebroadcasts them over a WebSocket stream that feeds a Google Charts-based UI. The design minimizes time-to-eyeball by visualizing streamed data rather than a persisted copy. GKE provides self-healing and automatic scaling features that reduce operational toil.
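The bridge pattern described above can be sketched in a few lines of Python. This in-memory `asyncio` fan-out stands in for the real Pub/Sub subscriber and WebSocket server; the message fields are illustrative.

```python
import asyncio
import json


class Broadcaster:
    """Fan one stream of price messages out to many websocket-like listeners."""

    def __init__(self):
        self._listeners = set()

    def register(self) -> asyncio.Queue:
        """Each UI connection gets its own queue of serialized messages."""
        q = asyncio.Queue()
        self._listeners.add(q)
        return q

    async def publish(self, message: dict) -> None:
        """Serialize once, deliver to every registered listener."""
        payload = json.dumps(message)
        for q in self._listeners:
            await q.put(payload)


async def demo():
    bus = Broadcaster()
    a, b = bus.register(), bus.register()
    await bus.publish({"contract": "ZC", "bid": 450.25, "ask": 450.5})
    return await a.get(), await b.get()
```

In the real deployment, `publish` would be driven by a Pub/Sub subscriber callback and each queue would drain into a WebSocket connection.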
Hope is not a strategy
It can be challenging to assess a firm's overall risk exposure in real-time. Pricing data delivered at low-frequency, or limited access to historical training datasets, can paint a picture of risk that is inaccurate, hampering decision-making downstream. Historically, access to real-time risk exposure has required a data operation too costly or complex for many market participants.
CME Group's real-time market data streamed on Google Cloud can help change that. Consuming CME Smart Stream quotes via Pub/Sub and joining this data with diverse data sets is both time- and cost-efficient. Positions can be marked to market on each tick from the exchange, simplifying the visualization of exposures in real time.
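Per-tick mark-to-market reduces to a small pure function. This sketch uses an illustrative contract multiplier and field names, not CME contract specifications.

```python
from dataclasses import dataclass


@dataclass
class Position:
    contract: str
    qty: int          # positive = long, negative = short
    entry_price: float


def mark_to_market(pos: Position, tick_mid: float, multiplier: float = 50.0) -> float:
    """Unrealized P&L at the latest tick.

    The multiplier is contract-specific; 50.0 here is illustrative only.
    """
    return (tick_mid - pos.entry_price) * pos.qty * multiplier
```

Running this against every inbound tick keeps the exposure view current between snapshots.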
[Interactive widget: live futures quote cards (CONTRACT, BID, ASK, UPDATED), shown as "Market closed" outside trading hours.]
[Interactive widget: open-trade blotter (TRADER, TRADE, PRICE, MARKET, P&L), shown as "Market closed" outside trading hours.]
[Interactive widget: settled-trade blotter (TRADER, P&L, TRADE, SETTLED), shown as "Market closed" outside trading hours.]
Charts
Having real-time price data enables a snapshot of current exposures. This assists in the understanding of market behavior, and provides insights into individual trader performance.
Several technical components are involved in producing these charts: Pub/Sub for topic-based messaging, GKE for hosting the WebSocket streams that power the visualizations, Dataflow for data transformation, and BigQuery for data warehousing, calculation, and performance metrics.
Architecture
Because this demo has multiple data use cases, it relies on multiple data solutions. Bigtable, a high-throughput, low-latency NoSQL database, is used for rapid processing of the most recently persisted time-series events. BigQuery, a serverless, petabyte-scale data warehouse, enables fast SQL analysis of longer-horizon historical events. The Pub/Sub-to-WebSocket bridge that feeds data to the UI runs a Google Kubernetes Engine cluster per Smart Stream topic, each topic corresponding to a single product, such as wheat futures.
BigQuery currently holds about 100 GB of data: four months of top-of-book tick data for the three instruments. Because CME Smart Stream offers à la carte selection of symbol feeds, developers can subscribe to a single Pub/Sub topic per product and receive pricing data for all delivery months, rather than sifting through data they don't need to reach the symbols they do.
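A query against this tick store might look like the following. The table schema (`symbol`, `event_time`, `bid`, `ask`) is assumed for illustration; using `@`-style query parameters (rather than string interpolation of values) is BigQuery's supported way to keep the SQL injection-safe.

```python
def top_of_book_query(dataset: str, table: str) -> str:
    """Parameterized BigQuery SQL for one symbol's top-of-book ticks.

    Bind @symbol, @start, and @end via BigQuery query parameters at run time.
    Column names are assumptions about the demo's schema.
    """
    return f"""
        SELECT event_time, bid, ask
        FROM `{dataset}.{table}`
        WHERE symbol = @symbol
          AND event_time BETWEEN @start AND @end
        ORDER BY event_time
    """
```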
Dataflow
Dataflow displays a directed acyclic graph (DAG) for the templated pipeline that ingests CME Smart Stream price changes into Bigtable in real time.
Cloud Functions
Cloud Functions provides the RESTful endpoints that perform tasks such as settling trades or querying BigQuery for data to visualize.
Using Cloud Functions allowed the team to structure the application as a series of loosely coupled, isolated functions that are easy to test, debug, and run.
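For example, the core of a trade-settlement endpoint can be a pure function that is unit-tested independently of the Cloud Functions HTTP wrapper. The field names and contract multiplier here are hypothetical, not the demo's actual schema.

```python
def settle_trade(position: dict, settlement_price: float,
                 multiplier: float = 50.0) -> dict:
    """Compute settled P&L for an open position.

    `position` fields (qty, entry_price) and the 50.0 multiplier
    are illustrative assumptions.
    """
    pnl = (settlement_price - position["entry_price"]) * position["qty"] * multiplier
    return {**position, "settled": True, "pnl": round(pnl, 2)}
```

An HTTP-triggered function would simply parse the request body, call `settle_trade`, persist the result, and return it as JSON.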
Firestore was used as the transactional database for stateless, non-historical UI displays such as Open Positions. While Firestore stores only the open positions, BigQuery persists the entire position history for rapid analysis, and exposes query result sets via cloud functions to the web application.
For more information about the tools we used to build the Risk Manager section of this demo, see:
- Pub/Sub to BigQuery using Dataflow
- Using Cloud Functions for microservices
- BigQuery for analytics
- Google Charts
Automation shreds technical risk
To manage complexity sprawl, infrastructure engineers develop repeatable and reliable processes for management routines, but proprietary, legacy or heavily-customized operational components can interfere with this goal.
Google Cloud's approach to infrastructure management embraces the infrastructure-as-code paradigm that provides a solid foundation needed to maximize automation and minimize toil. You can instantiate infrastructure components predictably and declaratively with code that's reviewed, approved and tested before deployment.
Architecture
When all the lights are green, infrastructure engineers risk becoming a team's least appreciated members. Without an unwelcome incident, it's easy to forget the toil involved in maintaining operational availability and performance.
Wherever possible we used serverless components, such as Cloud Functions, BigQuery, and Dataflow, to scale this demo commensurately with end-user demand and market activity. We also used fully managed services, such as Google Kubernetes Engine, to host our WebSocket deployment. To automate production builds and repeatably provision cloud resources, we authored Terraform scripts for use with Cloud Build.
Deploying with these safeguards helps, but systems are rarely immune from developers injecting defects or instituting inefficiencies. The infrastructure engineer's goal is to automate and monitor the platform such that this becomes less likely and, if problems do occur, less severe. Infrastructure-as-code (IaC) component definitions and CI/CD pipelines are practices that facilitate this kind of operational methodology.
From a monitoring perspective, we used Cloud Monitoring to ensure the bots behave as intended. This includes a periodic check of the prices for the Random and Model bots, with a more frequent check of prices for the Momentum bot, as indicated by the graphs below. Additionally, the three graphs with a line value of 1 indicate that the bots have opened a position within the last expected interval; if that were not the case, the line would drop to zero and an alert would be triggered.
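The "position opened in the last interval" signal reduces to a 0/1 metric. This sketch shows the logic behind such a check, not the actual Cloud Monitoring configuration.

```python
def position_opened_metric(open_timestamps, now: float,
                           interval_seconds: float) -> int:
    """Return 1 if any position was opened within the last interval, else 0.

    A value of 0 is the alerting condition: the bot has gone quiet.
    Timestamps are seconds since epoch; names are illustrative.
    """
    return 1 if any(0 <= now - t <= interval_seconds
                    for t in open_timestamps) else 0
```

A custom metric like this, written on a schedule, gives the alerting policy a simple threshold (value < 1) to fire on.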
We also monitor the availability of the endpoints and the Istio ingress gateway by configuring uptime checks against each endpoint; an alert is triggered if any endpoint fails its uptime check.
Real-time predictive analytics is real-time risk management
Data transformation and protracted training durations present a challenge for data scientists. Developing, socializing and refining timely and explainable insights is key to the success of the data science organization.
When real-time data is distributed and consumed in Google Cloud, it seamlessly integrates with all the analytics and modeling tools available to help you train transparent, explainable models - reducing your time-to-insight.
Prediction pipeline
The demo's goal in training predictive models was not to time the market, but to illustrate the ease with which predictive models can be built, deployed, and served on Google Cloud. To create the trading algorithms, the price data we ingested into BigQuery was imported into AutoML Tables, which performed its own feature selection. The trained model was deployed to production and exposed via a RESTful API that returns price predictions for a one-minute time horizon.
A Dataflow job retrieves that real-time data from Bigtable, which offers low-latency retrieval of time-series data. The model predicts what the maximum price of each instrument will be one minute from the API call. The trading bot then determines, based upon the prediction and the market's prevailing price, whether to go long, short or not trade at all.
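The bot's decision rule can be sketched as follows. The edge threshold is illustrative, not a value taken from the demo, and the interpretation of a predicted maximum below the current price as a bearish signal is an assumption.

```python
def decide(predicted_max: float, last_price: float,
           threshold: float = 0.001) -> str:
    """Choose a direction from the model's one-minute max-price prediction.

    If the predicted max exceeds the prevailing price by more than the
    threshold, go long; if it sits below it by more than the threshold
    (prices expected to fall), go short; otherwise stay flat.
    """
    edge = (predicted_max - last_price) / last_price
    if edge > threshold:
        return "LONG"
    if edge < -threshold:
        return "SHORT"
    return "FLAT"
```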
Source: https://showcase.withgoogle.com/marketdata