Hadoop for Database Professionals
At Gluent, we believe a fundamental shift is underway in IT toward open, software-defined, distributed systems like Hadoop. As a result, every RDBMS professional should be striving to learn these new technologies or risk being left behind.
We created this one-day course for IT professionals who have deep knowledge of relational database systems, so they can better understand Hadoop and the benefits it brings to the enterprise. The course goes beyond merely listing the components of the Hadoop ecosystem and how they fit together. Instead, it compares and contrasts Hadoop with the RDBMS world on subjects such as high availability, backup and recovery, data storage, SQL processing, system monitoring, and day-to-day operations. We will close with several case studies, giving attendees a first-hand look at how Hadoop solutions have been built in the real world.
This course requires no previous knowledge of Hadoop or Big Data. It aims to provide a quick, technical dive into the concepts that will help you get started with Hadoop. Detailed Hadoop administration topics, such as installation, configuration, and optimization, are outside the scope of this class.
Introduction and Fundamentals of Hadoop
Get up to speed on the history of Hadoop, the components involved, and where the Hadoop architecture and technology are headed. This introduction covers the following points:
- A very brief history of Hadoop, where it is headed, and why organizations choose Hadoop
- What is so different about Hadoop?
- Discuss hardware and software components of Hadoop
- Provide details about the Hadoop ecosystem of interchangeable components
- Describe typical architectures
- Remove common misconceptions about Hadoop
- Reconcile RDBMS-specific terminology with Hadoop terminology
- Why a regular database will just not cut it for all modern workloads
Hadoop Storage and Data Ingestion
There are several techniques for storing data in Hadoop, including cloud-based options, and even more ways to move data into Hadoop. This module focuses on these topics, comparing the familiar RDBMS approach with the processes and technologies used in Hadoop, including:
- Compare and contrast Hadoop open data storage formats with RDBMS
- How to get data in and out of Hadoop
- Discuss databases in Hadoop, including NoSQL engines, and how they resemble and differ from a standard relational database
- Maintenance and operations, including security administration, backups, and troubleshooting
SQL Processing in Hadoop
This module reviews and discusses the various SQL-on-Hadoop technologies, showing that not all data processing routines must be hand-coded in Java.
- Compare and contrast data processing on RDBMS vs. Hadoop
- Discuss the various native SQL engines, such as Hive, Impala, Presto, and Spark
- Review the future of SQL processing in Hadoop
Hadoop in Action
Learn how Hadoop has been successfully implemented in the real world, for both Big Data and traditional enterprise data. We will show examples from customers using Hadoop across multiple industries, giving you a good idea of where Hadoop has been proven in practice.
- Data lake
- Data hub
- Analytics acceleration platform
- Hadoop-first datasets
No prior experience or knowledge of Hadoop or Big Data is necessary.
This course is lecture-based, so there are no hardware or software requirements.
- Database Administrators
- Database Developers
- Data Architects