IBM Machine Learning on z/OS

In this short video we’ll compare two options
for deploying a machine learning scoring service: on premises on an IBM Z14 mainframe,
or on a popular cloud service. We’ll examine throughput and response time for our solutions
and how those factors will impact our SLAs. For our use case we chose a banking scenario, since many traditional banks are losing customers to FinTechs,
which are perceived to be more efficient. So for this demonstration we used our core customer data on our IBM Z14
to build and train a customer churn model to predict whether a customer is likely
to take their business to another bank, like a FinTech. We can call the scoring service during
any real-time customer interaction (mobile app, web app, ATM, telephone, or live teller transaction),
or in batch to generate lists or reports. By exploiting the scoring service’s predictions in our applications, we can take proactive measures,
like targeting special offers or providing personalized services, to help retain at-risk customers. A well-utilized scoring service can be called
many thousands of times per minute. It needs to support high throughput and fast response times, so let’s compare the two architectures:
on premises on our IBM Z14 versus on the cloud. Since our corporate and customer data resides on premises,
on our IBM Z14 mainframe, our own intuition and the law of data gravity suggest that
running our machine learning on premises should win this race, but let’s see for ourselves. When we keep our data in situ, we don’t incur the expense
of moving large quantities of data to a data lake and storing it there. We also don’t expose ourselves to the potential security risks
of duplicating the data outside of our most secure IT infrastructure. We are using Apache JMeter to drive our tests,
which are set to run for two minutes. Our JMeter results are being stored in an InfluxDB database
and rendered here on our Grafana dashboard. The green lines represent our On Premises solution,
and the orange lines are On Cloud. On the left, we can see that On Premises we achieve much greater throughput: an average of 75 times more in our tests. On the right, we can see that response times were 85 times faster on average. Perhaps equally compelling is that response times on the cloud are much less consistent. We can see this variability both on the graph and in the analytics below, which show the 95th and 99th response-time percentiles to be significantly higher, giving us much lower confidence in our service level agreements (SLAs). So as we’ve now seen, by co-locating our machine learning scoring service
with our corporate data on our IBM Z14 mainframe, we can achieve much greater throughput, dramatically better response times, and greater confidence in meeting our service level agreements. Thanks for watching!
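The 95th and 99th percentiles mentioned above are tail-latency measures: the response time that 95% (or 99%) of requests stay under. As a minimal sketch of how such figures are derived from raw samples (the latency values below are hypothetical illustrations, not the measured results from this demo), a nearest-rank percentile can be computed like this:

```python
def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample value that is
    greater than or equal to p percent of all samples."""
    ordered = sorted(samples)
    # 1-based rank = ceil(p/100 * n), computed with integer arithmetic,
    # clamped so p=0 still maps to the first sample
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]

# Hypothetical per-request response times in milliseconds
latencies_ms = [12, 14, 15, 15, 16, 18, 21, 25, 40, 95]

p95 = percentile(latencies_ms, 95)  # tail value dominated by the slowest requests
p99 = percentile(latencies_ms, 99)
```

Nearest-rank is the simplest percentile definition; JMeter and Grafana may apply interpolated variants, so their reported values can differ slightly from this sketch.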
