Standards-based, Open-source Middleware for Predictive Analytics Applications
Convert your fitted Scikit-Learn, R or Apache Spark models and pipelines into the standardized Predictive Model Markup Language (PMML) representation, and make quick and dependable predictions in your Java/JVM application.
Standardization enables automation, which in turn enables higher efficiency and higher quality business processes.
Why choose JPMML software?
The gold standard
PMML embodies decades' worth of knowledge and best practices in the Tabular ML field. Do not dismiss it; build on it!
- Joint effort. PMML is developed and maintained by the Data Mining Group (DMG), which is an independent consortium of major statistics and data mining software vendors.
- Backwards- and forwards-compatible. PMML version changes are incremental/evolutionary. Any PMML model that has been generated since the early 2000s can be put into production today.
- No challengers. PMML outsmarts and outdoes alternative Tabular ML standardization efforts, whether commercial or (F)OSS.
The “less code, less coding” approach
PMML is designed around human-oriented data structures (as opposed to computer-executable code). By eliminating the code component, PMML eliminates the need for coding roles and coding activities within the organization. The data science team and the application development team can focus on their core competencies; there is no need for extra hybrid teams (“MLOps” or “DevOps”) or extra inter-team communication.
- Functionally complete. PMML captures all of feature engineering, modelling and decision engineering.
- Reductionist. PMML promotes one way, the best way, of conceptualizing things. For example, all linear models collapse into a single PMML “regression table” data structure, and all decision tree models collapse into a single PMML “tree” data structure. There is less to learn and remember.
- Auditable and verifiable. PMML holds ample metadata for quality assurance and quality control purposes. A PMML model is a fully self-contained and self-informed artifact, which takes active part in ensuring that its predictions are reliable and correct.
- Explainable. PMML operates at a high, real-world level of abstraction. The inner workings of a PMML model are easily approachable even for non-experts.
- Stable and secure. PMML models do not require any maintenance while in storage. PMML models cannot be attacked or breached while in production.
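The “regression table” idea mentioned above can be illustrated with a minimal sketch. It assumes a hand-written fragment shaped like the PMML RegressionTable element, with an intercept attribute and NumericPredictor children carrying name and coefficient attributes. The fragment, the field names x1 and x2, and the helper score_regression_table are hypothetical; the sketch covers plain numeric predictors only and skips everything else a real PMML document carries (namespace, data dictionary, mining schema).

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written fragment; not produced by any JPMML converter.
FRAGMENT = """
<RegressionTable intercept="1.5">
  <NumericPredictor name="x1" coefficient="2.0"/>
  <NumericPredictor name="x2" coefficient="-0.5"/>
</RegressionTable>
"""


def score_regression_table(xml_text, record):
    """Return intercept + sum(coefficient * value), looking up values by field name."""
    table = ET.fromstring(xml_text)
    result = float(table.get("intercept", "0"))
    for predictor in table.findall("NumericPredictor"):
        name = predictor.get("name")
        coefficient = float(predictor.get("coefficient"))
        result += coefficient * record[name]
    return result


prediction = score_regression_table(FRAGMENT, {"x1": 3.0, "x2": 4.0})
print(prediction)
```

Because the model is plain data, the scoring logic lives in one small generic function rather than in model-specific code.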
Full software stack
The Java PMML API software project provides the right Java-based tool for every PMML job. End user-facing components are equipped with command-line and/or REST interfaces, which enable integration with alternative application platforms.
- Layered library architecture. Specialized, high-level APIs are derived from common low-level APIs. Simple and minimalistic dependency graph. Automatic assembly of 5 to 10 MB application-oriented library sets.
- Pure Java. The same library set works on any Java SE 8/11 compatible Java VM.
- Most efficient conversion. JPMML converters support the most common Tabular ML frameworks, model types and transformation types. They perform deep examinations and analyses on the incoming pipeline object in order to come up with the best encoding plan. Sub-optimal encodings (by lesser PMML converters) are difficult to correct after the fact, and will degrade the deployment experience significantly.
- Most efficient evaluation. JPMML evaluators are among the smallest and fastest general-purpose scoring engines for the Java/JVM platform. They challenge the performance of any Tabular ML framework (with or without hardware acceleration) in real-time scoring scenarios.
- End-to-end reproducibility guarantee. JPMML predictions always match the original Tabular ML framework predictions, as proven by extensive integration tests.
Vendor backing and support
In the long run, you get what you pay for. The choice is between upfront, fixed monetary costs vs. after-the-fact, indeterminate business costs.
- Commitment. Openscoring is the only pure-PMML company in existence today.
- Longevity and reliability. Openscoring has over eight years of public track record. Openscoring will be there for the entirety of your software project, your business venture.
- Proof of work. Openscoring conducts original research, and is known for delivering many “Firsts” and “Bests” in the field.
- Direct, personal connection.
JPMML software is grouped into three product verticals:
Model conversion involves translating a fitted model or pipeline object from the original Tabular ML representation into the PMML representation, and saving it as a PMML document.
Install and import the SkLearn2PMML package.
Replace usages of the sklearn.pipeline.Pipeline class with the PMMLPipeline class.
Fit as usual. Export the fitted PMMLPipeline object into a PMML XML file using the sklearn2pmml.sklearn2pmml(obj, path) utility function.
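The Scikit-Learn steps above can be sketched roughly as follows. The decision tree estimator, the step name "classifier", the helper export_fitted_pipeline and its parameters are illustrative assumptions, not part of the original text. The imports are deferred into the function body because they depend on the optional scikit-learn and sklearn2pmml packages, and the export step additionally needs a Java runtime.

```python
def export_fitted_pipeline(X, y, pmml_path):
    """Fit a one-step PMMLPipeline on (X, y) and export it to a PMML XML file."""
    # Deferred imports: optional third-party dependencies.
    from sklearn.tree import DecisionTreeClassifier
    from sklearn2pmml import sklearn2pmml
    from sklearn2pmml.pipeline import PMMLPipeline

    # PMMLPipeline takes the place of sklearn.pipeline.Pipeline.
    pipeline = PMMLPipeline([
        ("classifier", DecisionTreeClassifier()),  # illustrative estimator choice
    ])

    # Fit as usual.
    pipeline.fit(X, y)

    # Export the fitted pipeline object into a PMML XML file.
    sklearn2pmml(pipeline, pmml_path)
    return pipeline
```

A call might look like export_fitted_pipeline(iris_X, iris_y, "DecisionTreeIris.pmml"), where the data variables and the file name are placeholders.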
Install and import the R2PMML package.
Fit as usual. Export the fitted model object into a PMML XML file using the r2pmml::r2pmml(obj, path) utility function.
Install and import the JPMML-SparkML library.
Fit as usual. Construct an org.jpmml.sparkml.PMMLBuilder object based on the DataFrame schema and the fitted PipelineModel object, customize it, and export into a PMML XML file using the PMMLBuilder#buildFile(File) builder method.
Model scoring involves loading a PMML object from a PMML document, and making predictions for new data records. The scoring component resides in the application space, right between the “data source” and “data sink” components, which makes the workflow most suitable for real-time prediction (sub-millisecond turnaround times).
Install and import the JPMML-Evaluator library.
Construct an org.jpmml.evaluator.Evaluator object based on a PMML XML file.
Verify the state of the evaluator using the previously embedded dataset of tricky data records.
Score new data records. The ordering of dictionary entries is not significant, because in PMML fields are identified by name, not by position.
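As a language-neutral illustration of name-based field identification (this is not the JPMML-Evaluator API), the sketch below matches record entries against a hypothetical list of input field names. Two dictionaries with different entry order end up as identical arguments. The field names and the prepare_arguments helper are assumptions made for the example.

```python
# Hypothetical input field names, in the order a model might declare them.
INPUT_FIELDS = ["Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"]


def prepare_arguments(record):
    """Pick values by field name; the entry order of the record is irrelevant."""
    missing = [name for name in INPUT_FIELDS if name not in record]
    if missing:
        raise KeyError("Missing input fields: " + ", ".join(missing))
    return {name: record[name] for name in INPUT_FIELDS}


record_a = {"Sepal.Length": 5.1, "Sepal.Width": 3.5, "Petal.Length": 1.4, "Petal.Width": 0.2}
record_b = {"Petal.Width": 0.2, "Petal.Length": 1.4, "Sepal.Width": 3.5, "Sepal.Length": 5.1}

same_arguments = prepare_arguments(record_a) == prepare_arguments(record_b)
print(same_arguments)
```

A position-based interface would need both records to list values in exactly the declared order; a name-based one only needs the right keys to be present.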
Install and import the JPMML-Evaluator-Python package. Establish Python-to-Java connectivity by launching a JPype, PyJNIus or Py4J backend. Load, verify and score. The Python API is designed after the Java API.
Turn any PMML document into a RESTful web service, and interact with it from any application, anywhere! The workflow is subject to network latency, and is therefore more suitable for less time-sensitive tasks such as web form prediction and batch prediction (tens to hundreds of milliseconds turnaround times).
Download and run the Openscoring server executable JAR file.
Perform a full deploy-score-undeploy workflow using the command-line cURL application.
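As a rough sketch of the deploy-score-undeploy sequence, the snippet below only constructs the three HTTP requests and does not send them. The base URL http://localhost:8080/openscoring, the model/{id} path layout, the text/xml and application/json content types, the request body with id and arguments keys, and the model identifier are all assumptions recalled from the Openscoring documentation; check them against the server version in use.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/openscoring"  # assumed default server address


def build_workflow(model_id, pmml_bytes, arguments):
    """Return (deploy, score, undeploy) requests for an assumed model/{id} endpoint."""
    url = f"{BASE_URL}/model/{model_id}"

    # Deploy: upload the PMML document under the chosen identifier.
    deploy = urllib.request.Request(
        url, data=pmml_bytes, method="PUT",
        headers={"Content-Type": "text/xml"})

    # Score: send one data record as a JSON evaluation request (assumed shape).
    body = json.dumps({"id": "record-001", "arguments": arguments}).encode("utf-8")
    score = urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"})

    # Undeploy: remove the model from the server.
    undeploy = urllib.request.Request(url, method="DELETE")
    return deploy, score, undeploy


deploy, score, undeploy = build_workflow(
    "DecisionTreeIris", b"<PMML/>", {"Sepal.Length": 5.1})
# Against a running server, each request would be sent with urllib.request.urlopen(request).
```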
Do the same using the Openscoring-Python package.
Do the same using the Openscoring-R package.
What are the JPMML software license terms?
The base layer of JPMML software is released under the BSD 3-Clause License.
However, the higher and more sophisticated layers of JPMML software are released under the GNU Affero General Public License, version 3.0 (AGPL-v3).
When in doubt, see the contents of the LICENSE.txt file at the root of each product's GitHub repository.
Your rights and obligations under different licenses are summarized in the adjacent table.
If the terms and conditions of AGPL-v3 are not acceptable, then it is possible to enter into a licensing agreement with Openscoring, which re-licenses the desired parts of the JPMML software from AGPL-v3 to the BSD 3-Clause License. The re-licensing process is simple and straightforward. Please initiate it by submitting a request for the BSD 3-Clause License.
| |AGPL-v3 |BSD 3-Clause License |
|Can use commercially: |Yes |Yes |
|Can modify and distribute: |Yes |Yes |
|Must disclose source form to end users: |Yes |No |
Further assistance and discussion
Openscoring sells software, but provides support services for free.
General questions about (J)PMML?
Please open a new thread in the JPMML Mailing List.
Other exciting opportunities?
Please contact us privately.