

Following table describes each of the component in detail. The above diagram consists of different components. The following diagram illustrates the architecture of Presto. The architecture of Presto is almost similar to classic MPP (massively parallel processing) DBMS architecture. This cross-platform analytic capability allows Presto users to extract maximum business value from gigabytes to petabytes of data. In addition, Presto can reach out from a Hadoop platform to query Cassandra, relational databases, or other data stores. Presto runs on multiple Hadoop distributions.

It allows to easily plug in file systems. Presto has a connector architecture that is Hadoop friendly.

DBVISUALIZER DESCRIBE COMMAND CODE
Though it is built in Java, it avoids typical issues of Java code related to memory allocation and garbage collection. Presto supports standard ANSI SQL which has made it very easy for data analysts and developers. Well, hundreds of employees are running queries each day with the technology. Teradata contribution to Presto makes it easier for more companies to enable all analytical needs.Īirbnb − Presto is an integral part of the Airbnb data infrastructure. Teradata − Teradata provides end-to-end solutions in Big Data analytics and data warehousing. Presto easily scales large velocity of data. Let’s take a look at some of the notable applications.įacebook − Facebook built Presto for data analytics needs. Presto supports most of today’s best industrial applications. Quickly scales petabytes data with low latency.Here is a list of benefits that Apache Presto offers − User-defined functions - Analysts can create custom user-defined functions to migrate easily.Pipelined executions - Avoids unnecessary I/O latency overhead.Pluggable connectors - Presto supports pluggable connector to provide metadata and data for queries.Presto is powerful, and leading companies like Airbnb, DropBox, Groupon, Netflix are adopting it. Presto is built in Java and easy to integrate with other data infrastructure components. Presto runs queries easily and scales without down time even from gigabytes to petabytes.Ī single Presto query can process data from multiple sources like HDFS, MySQL, Cassandra, Hive and many more data sources. What is Apache Presto?Īpache Presto is a distributed parallel query execution engine, optimized for low latency and interactive query analysis. In the year of 2012, Facebook team members designed “Presto” for interactive query analytics that would operate quickly even with petabytes of data. Later, when warehouse data grew to petabytes, they decided to develop a new system with low latency. Facebook warehouse data is stored in Hadoop for large scale computation. Well, big data analytics involves a large amount of data and this process is quite complex, hence companies use different strategies.įor example, Facebook is one of the leading data driven and largest data warehouse company in the world. It is primarily used in many organizations to make business decisions. Data analytics is the process of analyzing raw data to gather relevant information for better decision making.
