SAP Takes Steps to Improve Hadoop Integration
SAP, an industry-leading provider of enterprise application software for businesses of varying sizes and industries, recently used the SAPPHIRE NOW conference in Orlando, Florida, to announce new Hadoop integration capabilities. The company also used the conference to showcase the benefits seen by customers running big data deployments on SAP’s real-time data platform, and introduced a new big data partner council.
The highlight of SAP’s announcements at the SAPPHIRE NOW conference was a set of new and improved big data integration capabilities for Hadoop environments. These capabilities are set to be released with service pack 4 of the SAP HANA platform and will be delivered through the company’s SAP Data Services and SAP Information Steward software offerings.
With the move, SAP believes that SAP Data Services and SAP Information Steward will combine to give both business and IT customers an innovative solution that offers data integration, text data processing, data quality, data profiling, and metadata management in a consolidated package, all of which should help users leverage the power of big data.
The list of improvements set to ship with SAP Data Services and SAP Information Steward is impressive, beginning with Hadoop integration that includes reading and loading capabilities with Hive and the Hadoop Distributed File System for enhanced performance. Text data processing will be optimized through the extension of the data view, allowing users to execute linguistic analysis and more to thoroughly analyze their data sources.
In terms of data quality, SAP Data Services and SAP Information Steward promise users greater insight into the accuracy of their data through data quality scorecards that are integrated into enterprise and business intelligence applications. These scorecards offer a quick assessment of data quality, which in turn allows users to correct issues promptly.
Steve Lucas, SAP’s global executive vice president and general manager of database and technology, commented on the overall vision behind the company’s latest data integration efforts: “Our goal is to help organizations access, build and govern information value chains across all data sources. With our enterprise information management solutions, customers will have the ability to easily understand and access any data source -- be it from an SAP, custom or partner application, enterprise database or new data sources such as Hadoop -- so they can now better manage information throughout the organization.”
Based on SAP’s popular HANA platform, the SAP real-time data platform combines robust data management features from SAP Sybase ASE, SAP Sybase ESP, SAP Enterprise Information Management, and SAP Sybase IQ. Thanks to its real-time capabilities when it comes to absorbing, storing, and processing big data, the real-time data platform gives organizations the power to maximize the value derived from big data in a timely manner.
Lucas described the SAP real-time data platform, saying: “Groundbreaking innovations like SAP HANA help our customers access and deliver information at unprecedented speeds -- up to 100,000 times faster than before -- and empower them with fundamentally new ways to run their businesses and master 'big data.' The SAP real-time data platform delivers an information value chain that uncovers and harnesses the right information at the right moment by moving data among SAP HANA, SAP Sybase IQ and Hadoop file systems.”
To demonstrate what the real-time data platform can do, SAP presented a customer showcase at the SAPPHIRE NOW conference featuring Mitsui Knowledge Industry, a firm that specializes in the analysis of genomes for cancer research and treatment. Mitsui employed SAP HANA with R, an open-source programming language and software environment, as well as Hadoop, as part of its information value chain. The additions allowed Mitsui to cut its genome analysis time from several days to just 20 minutes.
Commenting on the improvements achieved through SAP’s products, Yukihisa Kat, CTO and director for Mitsui, said: “Going from a process measured in days to one measured in minutes is radically transforming our customer relationships. Using the SAP real-time data platform with SAP HANA at its core will be critical to our DNA going forward and to future business growth.”
Regarding its new big data partner council, SAP representatives noted that it will consist of participants from a wide range of companies, including software providers, hardware vendors, technology services providers, and even startups that will partner with SAP on special projects. The ultimate goal of the council is to improve overall integration with the Hadoop ecosystem. Cloudera, an expert in the field of Hadoop that offers data management software, services, and training, will reportedly play a big role in the council.
Mike Olson, co-founder and CEO of Cloudera, offered his thoughts on working with SAP: “We are very excited to work with SAP to provide customers with real-time insights from their Hadoop environments using our complementary solutions. The SAP real-time data platform, combined with the Cloudera Hadoop Distribution, will deliver unmatched capabilities in next-generation 'big data' applications and analytics to the enterprise.”