Enterprises are struggling to derive business value from the onslaught of structured, semi-structured, and unstructured data. Legacy data warehousing and analytics platforms don’t deliver the speed and scale necessary in today’s big data world.
The Hadoop Distributed File System (HDFS) is an open-source distributed file system that can serve as an effective store for large amounts of data. Hadoop is well suited for batch processing where immediate interactive analytics are not required. The HP Vertica Analytics Platform consists of a massively parallel database and an extensible analytics framework optimized for real-time analytics on data scaling from gigabytes to petabytes. HP Vertica and Hadoop are complementary analytics platforms purpose-built for big data. Both are modern, scalable, massively parallel processing (MPP) systems built for commodity hardware at considerably lower total cost of ownership.
Here are some real-world examples of how enterprises are using the HP Vertica Analytics Platform to help accelerate their Hadoop environments.
- Processing social video events
A social video company uses Hadoop for batch processing of logs and the HP Vertica Analytics Platform for ETL, ad hoc analytics, and interactive dashboards. In addition, the company uses a key-value (KV) store to serve data that must be available with low latency.
- Accelerating drug discovery
A pharmaceutical company sought to analyze gene variants for improved drug targeting and discovery. It uses Hadoop to find the variants between a sample sequence and a reference genome, and uses the HP Vertica Analytics Platform to run analytics on very large sets of data to determine oncology targets.
- Delivering digital consumer insights
A digital intelligence company uses HDFS to store raw behavioral input data, Hadoop to find conversions by determining what type of user clicked on a particular advertisement, and the HP Vertica Analytics Platform to store and operationalize high-value business data. This helps the company deliver insights faster and more consistently, with less administrative overhead and on lower-cost commodity hardware.
- Enabling privacy assurance
A company focused on web privacy uses HDFS to collect user privacy reporting requests, MapReduce to process and structure the data and load it into the HP Vertica Analytics Platform (ETL), and the platform itself to analyze statistics for every third-party tag on a website in order to measure site performance. Consumers benefit from a free browser plug-in that can tell them who is tracking them. Advertisers, in turn, can provide greater transparency to end users and better understand the impact of third-party tags on website performance.
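
The division of labor in the examples above, batch structuring in MapReduce followed by analysis in the database, can be illustrated with a minimal single-process sketch in plain Python. It is not the pipeline any of these companies runs: the tab-separated report format, the field names (`site`, `tag`, `load_ms`), and every function name are assumptions made up for illustration, and a real job would run on a Hadoop cluster rather than in memory.

```python
from collections import defaultdict

# Hypothetical raw report lines in the form "site<TAB>tag<TAB>load_ms".
# The format and the values are assumptions for illustration only.
RAW_REPORTS = [
    "example.com\ttracker-a\t120",
    "example.com\ttracker-b\t80",
    "example.com\ttracker-a\t100",
    "malformed line",
    "news.example\ttracker-a\t60",
]


def map_report(line):
    """Map step: parse one raw line into ((site, tag), load_ms) pairs."""
    parts = line.split("\t")
    if len(parts) != 3 or not parts[2].isdigit():
        return []  # drop lines that do not match the assumed format
    site, tag, load_ms = parts
    return [((site, tag), int(load_ms))]


def shuffle(pairs):
    """Shuffle step: group mapped values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_tag(key, values):
    """Reduce step: emit one structured row per (site, tag) key."""
    site, tag = key
    return {
        "site": site,
        "tag": tag,
        "reports": len(values),
        "total_load_ms": sum(values),
    }


def run_job(lines):
    """Run map, shuffle, and reduce in memory and return rows sorted by key."""
    pairs = [pair for line in lines for pair in map_report(line)]
    grouped = shuffle(pairs)
    return [reduce_tag(key, values) for key, values in sorted(grouped.items())]


if __name__ == "__main__":
    for row in run_job(RAW_REPORTS):
        print(row)
```

The rows produced by `run_job` stand in for the structured output that would then be bulk-loaded into the analytics database, where ad hoc queries and dashboards take over from the batch layer.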