Compute analytics in financial services have evolved considerably over the past decade. Popular forecasting and risk-assessment tools such as linear and logistic regression have given way to more sophisticated models. Classic decision-tree and regression-tree algorithms have evolved into modern variants such as random forests and gradient-boosted trees. Fraud-detection systems now need to execute a range of forward and backward Monte Carlo simulations. Statistical learning, predictive modeling, and data mining increasingly rely on ML techniques.
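To make the Monte Carlo side of this concrete, here is a minimal sketch of a forward simulation of terminal prices under geometric Brownian motion. The function name and all parameter values (spot, drift, volatility, horizon) are illustrative placeholders, not tied to any particular library or product:

```python
import math
import random

def simulate_terminal_prices(s0, mu, sigma, days, n_paths, seed=42):
    """Forward Monte Carlo: simulate terminal prices under geometric
    Brownian motion, stepping one trading day at a time."""
    rng = random.Random(seed)
    dt = 1.0 / 252  # one trading day as a fraction of a year
    terminals = []
    for _ in range(n_paths):
        price = s0
        for _ in range(days):
            z = rng.gauss(0.0, 1.0)
            price *= math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
        terminals.append(price)
    return terminals

# Placeholder parameters: $100 spot, 5% drift, 20% annual volatility, 10 days.
paths = simulate_terminal_prices(s0=100.0, mu=0.05, sigma=0.2, days=10, n_paths=1000)
```

The inner loop is embarrassingly parallel across paths, which is exactly why this class of workload benefits from the parallelization and GPU acceleration discussed below.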
While every ML library brings its own flavor to the financial world, the non-functional requirements of the compute are constant:
- Low-latency analytics
- The ability to parallelize compute
- GPU acceleration of the library
- The ability to interface with traditional programming languages (Python, Java, C++, etc.)
Now imagine a scenario where you can take your compute, package it as a library, register it with your database as a custom function, and invoke it using your language of choice. These could be ML libraries or any custom module or application written in a programming language of your choice. This in-database machine learning is now possible with Kinetica.
In this short demo, you can see how to calculate the volatility and value at risk (VaR) of incoming stock tick data. These two measures are commonly used to gauge the risk of the investments in your portfolio. The steps are simple:
- Choose a compute algorithm (here, volatility and VaR)
- Register it as a Kinetica UDF (let's call it Risk Management)
- Determine an input feature set (in this case, a portfolio of stocks)
- Execute it using a language of your choice (this example uses Java; you could use Python or C++ instead)
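The two risk measures themselves are straightforward to compute. Here is a minimal Python sketch, where the function names and the sample price series are illustrative only and not the Kinetica UDF API:

```python
import math
import statistics

def log_returns(prices):
    """Turn a price series into log returns."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

def volatility(returns):
    """Sample standard deviation of the returns."""
    return statistics.stdev(returns)

def historical_var(returns, confidence=0.95):
    """Historical VaR: the loss at the (1 - confidence) quantile
    of the observed return distribution."""
    ordered = sorted(returns)
    idx = int((1 - confidence) * len(ordered))
    return -ordered[idx]

# Illustrative tick prices for a single instrument.
prices = [101.2, 100.8, 102.5, 101.9, 103.4, 102.1, 104.0, 103.2, 105.1, 104.6]
rets = log_returns(prices)
vol = volatility(rets)          # dispersion of returns
var95 = historical_var(rets)    # worst expected loss at 95% confidence
```

Packaged as a UDF, logic like this runs next to the data, so the tick stream never has to leave the database for scoring.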
What does this mean for financial services firms?
Milliseconds matter in the financial services industry. Running ML models to detect patterns usually involves multiple methodologies. Explore/exploit is used to extract signals from unstructured data and find potential nuggets of "gold" in them. Train/execute involves training the model by invoking it repeatedly with different datasets and comparing the outputs to determine model efficiency. Allow the ML library to make mistakes; they are part of the process, and the model can then train itself to become more accurate. To achieve this, a rinse-and-repeat methodology is used: running the model multiple times with different feature sets. Accelerated model execution, quicker decision times, and streamlined access to data and invocation help alleviate some of the bottlenecks these workflows face.
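The rinse-and-repeat loop described above can be sketched as follows: fit the same simple model against different candidate feature sets and keep the one with the lowest error. The feature names and the synthetic data are purely illustrative:

```python
def fit_ols(xs, ys):
    """Ordinary least squares for a single feature: y ~ a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

def mse(a, b, xs, ys):
    """Mean squared error of the fitted line on the same data."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# "Rinse and repeat": train with each candidate feature set and keep
# the one that explains the target best. The data here is synthetic.
features = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "spread": [5.0, 3.0, 4.0, 1.0, 2.0],
}
target = [2.1, 3.9, 6.2, 8.0, 9.9]  # roughly 2 * volume

best = min(features, key=lambda name: mse(*fit_ols(features[name], target),
                                          features[name], target))
```

In practice each iteration of this loop is far more expensive than a one-feature regression, which is why moving the repeated execution close to the data pays off.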
For more on Kinetica UDFs, visit: https://www.kinetica.com/product/in-database-analytics/
Or get in touch for a more in-depth demo of how this can work for your organization.