TAP for Data Scientists

TAP provides data scientists with extensible tools, scalable algorithms, and powerful engines to train and deploy Big Data-driven analytic models. In particular, it provides the tools and services needed to perform common tasks, with a focus on handling the complexity associated with Big Data and on making advanced Big Data analytics more easily deliverable to application developers for use in their applications.


Using TAP, data scientists can build and train models for Big Data using familiar interfaces, such as IPython notebooks and the Eclipse IDE. TAP also provides libraries of graph, deep learning, and “classical” machine-learning algorithms. All algorithms in TAP are open source and continue to be curated and enriched to support an ever-growing set of new workloads. In addition, almost all algorithms are parallelized and can be executed on a distributed processing system, such as Apache Spark or Apache Hadoop.


Key features of TAP for data scientists:

  • Integrated, self-service environment
  • Tools, engines, frameworks and algorithms for working with Big Data
  • Rich set of predictive APIs
  • Web-based user interface
  • Batch and streaming integration of any data type
  • Analytic model deployment to downstream developers and applications

Key benefits of TAP for data scientists:

  • Ease of use
  • Improved productivity and agility (with silicon optimization)
  • Reuse of resulting models by application developers

More Information

For more technical information, visit https://github.com/tapanalyticstoolkit.