People to Watch 2020
As we look forward into 2020, a select group of leaders in Big Data have stepped ahead of the rest. In this fifth annual People to Watch feature, we’re offering our readers a look at some of the best and brightest minds in Big Data whose hard work, dedication, and contributions reach far beyond analytics, and are shaping the direction that technology is taking us as a people. This is the cutting edge, and these are the people who are pushing the envelope.
We present the Datanami People to Watch 2020:
Datanami: Snowflake seemed to have come out of nowhere to grab a big portion of the nascent market for data warehouses in the cloud. As its new CEO, what are you doing to ensure that it continues to grow at that outstanding rate?
First of all, our scope is much broader than data warehousing workloads. We are a cloud data platform that hosts a broad variety of database applications. Our growth is a function of a highly differentiated product, an accelerating shift to the cloud, a very high rate of customer acquisition, and rapidly expanding workloads. We aggressively focus and resource our company as the market opportunity unfolds in front of us.
Datanami: There has been some pushback recently against cloud vendors, driven mostly by cost. Do you think there will be a widespread pullback of analytics in the cloud? Or have we passed a point of no return to on-prem?
Cloud is way cheaper because of the utility model and the elasticity of workload capacity applied. The biggest change that customers have encountered is the shift from optimizing fixed capacity to managing a variable storage and compute environment. There are no upward boundaries on the scale of storage and the amount of compute that we can bring to bear on it, and that is a novel equation. Massive scale coupled with blistering performance. Job for job, customers spend much less in the cloud, but they perform dramatically more work than ever before. Because they now can. The market is growing fast because the use of data has become so strategic to digital enterprises.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
For years, I have campaigned a racing sailboat on the west coast, named the “Invisible Hand.” We have won some iconic yacht races like the 2017 Transpac (from LA to Hawaii) with our mostly Kiwi racing crew.
Frank Slootman currently serves as Chairman and CEO at Snowflake. Frank has over 25 years of experience as an entrepreneur and executive in the enterprise software industry. Mr. Slootman served as CEO and President of ServiceNow from 2011 to 2017, taking the organization from around $100M in revenue, through an IPO, to $1.4B. Prior to that, Frank served as President of the Backup Recovery Systems Division at EMC following an acquisition of Data Domain Corporation/Data Domain, Inc., where he served as the Chief Executive Officer and President, leading the company through an IPO to its acquisition by EMC for $2.4B. Slootman holds undergraduate and graduate degrees in economics from the Netherlands School of Economics, Erasmus University Rotterdam.
Datanami: There was turmoil in the BI market in 2019. Why do you think that will or won’t continue in 2020?
2019 was so tumultuous for the BI market because the industry underwent a massive shift: the era of data visualization as the leading innovation is ending and a new age of analytics is taking its place. Emerging technologies like AI and natural language processing are responsible for this overhaul. Advancements in these areas have enabled real-time, self-service analytics, effectively replacing business intelligence systems as we know them. Industry analysts agree: in its latest Magic Quadrant report, Gartner says that data visualization is now a commodity, not a differentiator. The turmoil will shift in 2020 from vendors to the customers who rely on them to innovate and improve their bottom lines with data-driven decision-making. These companies will need to find new partners if they want to stay ahead of the disruption curve.
Datanami: Industry observers say the BI market is ripe for disruption. What role do you see ThoughtSpot playing there?
Last year’s market consolidation kicked off a wave of disruption in the BI space that will continue into 2020, and ThoughtSpot is at the forefront of this shift. The analytics industry is at an inflection point and must adapt to meet the needs of the modern business person who depends on data to do their job and deliver value to their organization. Gone are the days of static dashboards and visualizations; business users in every industry require the flexibility to access data to make decisions whether they’re in the boardroom, on a showfloor or visiting a customer. Intelligent, real-time reporting that delivers the right insights to the right people at the right time is the future of the industry, so they can add their own expertise and knowledge, and then most importantly, take action. ThoughtSpot provides this data to insight to knowledge to action pipeline, which is so essential to successful digital transformation.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
As someone with a sales background, you might expect that I’m an extrovert. However, I also respect the importance of independence and silent reflection. I believe time alone is just as important as time spent with others if you want to achieve real success.
Sudheesh Nair is CEO of ThoughtSpot, which has built an intuitive Google-like interface for data analytics. He’s committed to building tools that democratize data access and ensure every employee can leverage its benefits. Prior to ThoughtSpot, Sudheesh was president at Nutanix, which had the hottest tech IPO of 2016.
Of all the people to watch in 2020, it can be argued that nobody is more important than the American Consumer, who has trailed his European counterpart in terms of data privacy protection. Individually, each consumer in the United States doesn’t have much power or ability to influence law. But collectively, its 327 million citizens have a tremendous amount of clout determining how data will be collected and used in the future.
We sit at a crossroads when it comes to the use of data today. The advent of big data in the early 2010s heralded a Wild West period where there were few laws restricting use of data. It was a period of “anything goes,” with rampant abuse of people’s privacy on the Web and mobile devices. Companies collected a tremendous amount of data on consumers without their consent, and used that data to sell them goods and services – or sold that data to somebody else who did.
But with the passage of new data regulations like the General Data Protection Regulation (GDPR) in the European Union and the California Consumer Privacy Act (CCPA), the Wild West period of big data has been put on hold, and regular citizens have started to gain new digital rights and control over their data and digital existences.
New data regulations are being pushed in dozens of countries around the world. Here in the US, individual states are crafting their own versions of the CCPA in the hopes of solidifying the digital rights of their citizens, and pushing back against the old abuses that marked the early days of big data. Leaders are also starting to talk about the need for rules requiring the ethical use of AI, but without controls over the underlying data, those discussions are moot.
We are still at the beginning of the data privacy and AI ethics movements, and nobody can predict how the processes will unfold. Will the American Consumer demand that companies stop abusing the digital trails they leave on the Internet, and follow the precepts of CCPA and GDPR? Will ethics be written into the laws governing AI? Or will the early efforts fail, freeing companies to continue to exploit public data and AI breakthroughs for their private benefit?
None of this is known. But one thing is certain: With national elections scheduled later this year and numerous data regulations on the dockets of legislative houses around the country, the time is ripe for action. If the American Consumer decides to stand up, flex its collective muscle, and demand that its digital rights be respected, then the emerging data industrial complex will have no choice but to abide by its wishes. But if it doesn’t, then the status quo will largely stay in place.
That is why the American Consumer was selected as one of our People to Watch for 2020.
Datanami: Explainability is a hot topic in AI. Are we close to solving this challenge without compromising on the quality of predictions that AI makes?
We are used to metrics such as accuracy, ROC, AUC and the like to measure the quality of predictions. But they are only one component of “good prediction”. To trust and accept an algorithmic decision, we need so much more. We need to know that the decision is fair, that it does not have a potential to harm a user or group of users, that it can be reproduced and verified, that the system itself was safe from tampering, and that we can understand how and why the decision was made. Explaining decisions and outcomes is critical to being able to actually use those decisions in practice, or act based on those recommendations. In a way, explainability is a pre-requisite for the broad adoption of AI. Explainability is at the core of how humans process and navigate the world around us, and how we interact with it. We explain things in so many different ways: we look for examples and counterexamples, we summarize things, we cite the most important characteristics, we look for cause and effect, and so on. Any technology that aspires to mimic human intelligence has to possess an equally diverse and expressive toolkit of explanations.
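The point that accuracy and AUC are only one component of a good prediction can be made concrete with a small sketch. The labels, scores, and group tags below are invented for illustration, and the fairness check shown (the gap in positive-prediction rates between groups, often called the demographic parity difference) is just one of many possible measures, not one named in the interview.

```python
# Illustrative only: the labels, scores, and group tags are made up.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def positive_rate_gap(y_pred, groups):
    """Largest gap in positive-prediction rate between groups."""
    rates = {}
    for g in set(groups):
        preds = [p for p, gg in zip(y_pred, groups) if gg == g]
        rates[g] = sum(preds) / len(preds)
    return max(rates.values()) - min(rates.values())

y_true = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.1, 0.45, 0.3, 0.4, 0.2]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # fixed decision threshold

acc = accuracy(y_true, y_pred)
ranking_quality = auc(y_true, scores)
gap = positive_rate_gap(y_pred, groups)
```

In this toy data the scores rank every positive above every negative, yet at the 0.5 threshold nobody in group "b" receives a positive decision. A single headline metric would not reveal that.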
Datanami: The adoption of AI is accelerating. What obligations do technologists have to ensure that it’s being used to improve people’s lives as opposed to just making one group richer?
As technologists, we have a responsibility to champion beneficial uses of AI and direct its development towards solutions that serve all. There are so many pressing social and humanitarian challenges where technology can help move the needle: climate, hunger, lack of access to education and healthcare, emerging diseases, to name a few. These are the problems that should motivate and inspire AI research and development. We also must continue to create technical capabilities, safeguards and best practices that would make AI trustworthy, that would help mitigate biases, improve AI safety, and instrument AI solutions and deployments for transparency, so that we can prevent harmful outcomes and foster responsible applications of this powerful technology.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
In my “other” life, mainly on weekends, I am QueenSashy, photographer, writer, foodie, and a face behind James Beard nominated food blog Three Little Halves. It’s something I do when I need refuge from the fast-evolving world of AI and the ever-accelerating rhythm of our lives. There is something deeply soothing in touching things, like stirring a pot of pasta, pressing the shutter on a camera, or typing on a keyboard. Chopping vegetables, taking photos, scribbling formulas, writing code — for me it’s all a form of meditation.
Aleksandra (Saška) Mojsilović is an IBM Fellow, Head of Foundations of Trustworthy AI at IBM Research, and Co-Director of IBM Science for Social Good. Saška has spent the last two decades pursuing innovative applications of data science and machine learning in real-world challenges, including IT operations, healthcare, multimedia, finance, insurance, HR, economics, AI ethics, and social good. In her current role, she is leading the development of foundational AI technologies for enabling trustworthy and responsible AI at scale. Saška is one of the pioneers of the rising AI for Good movement. She is the author of over 100 publications and holds 20 patents. She is a Fellow of the IEEE and a member of the IBM Academy of Technology. Saška received her B.S. (92), M.S. (94), and Ph.D. (97) degrees in Electrical Engineering from the University of Belgrade, Belgrade, Serbia.
Datanami: The CEOs of Google and IBM recently came out in support of regulating AI. Do you think that AI regulations are inevitable?
Yes – especially as AI at scale becomes more of a reality for how companies and organizations conduct business. In our Deloitte State of AI in the Enterprise study, we saw that early adopters of AI were ramping their investments, launching more initiatives and getting positive returns. As more organizations across different industries place AI at the center of their business operations, there will be more questions raised about how AI is being used, privacy, ethics, bias, and more.
Not surprisingly, there’s been a rise in discussion around how to regulate AI. Companies can no longer ask the world to trust AI-powered products and solutions blindly. Trust around AI requires fairness, reliability, transparency, security and accountability. AI regulation is not going to be very straightforward – it is uncharted territory for an age where we are trying to bring together the best of human capabilities with the best of machine capabilities. Developing a code of ethics, laws, policies, government oversight, corporate transparency and capability of monitoring the AI space is necessary to achieve AI regulation.
With this, more companies are exploring and factoring in ethical considerations and their organizations’ respective values into the development of their AI solutions. I am also starting to see more governance bodies – whether run by individual governments or companies – start to emerge and create guidelines to help ensure ethical use of AI.
Datanami: As emerging technology such as AI continues to become commonplace, women in tech continues to be an area of slow growth. What must be done to address the issue and get more women into the technology workforce?
Over the course of my career, I have been very passionate about getting women into technology – but as AI continues to develop and infiltrate our everyday lives, I think it’s important to realize we need a diverse tech workforce, now more than ever – with differing opinions, experiences, preferences and views to create technologies for a diverse world population. Otherwise, the technologies we create could potentially be inherently biased. I strongly feel that we need to ensure that there is good representation, especially in the field of AI, and that’s been the main motivation behind starting my non-profit, Humans For AI. I am also fortunate to work for a firm like Deloitte where enhancing workforce diversity and fostering inclusive growth is top of mind.
Moving the needle on STEM diversity is hard work. However, there is so much that we can do to make it happen. We are seeing many companies stepping up to make public commitments to increasing diversity – whether through targeted inclusion training, accountability tools, or consciously removing unconscious biases in recruitment and promotion processes. I believe that we as women also have a responsibility as individuals to step up and serve as a mentor, role model, sponsor or advocate.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
I love poetry. In fact, I have been writing poetry since I was in school. William Wordsworth, John Keats and Robert Frost, their poems still inspire me. Sharing a stanza from my all-time favorite poem, “Daffodils,” by William Wordsworth:
“I wandered lonely as a cloud
That floats on high o’er vales and hills,
When all at once I saw a crowd,
A host of golden daffodils;
Beside the lake, beneath the trees,
Fluttering and dancing in the breeze.”
The first computer language I learned was Pascal. I remember that it somehow evoked the sense of writing poetry – both programming and poetry have the kind of economy of expression that arises from clarity of thought.
Beena Ammanath is AI Managing Director at Deloitte Consulting LLP and founder of the non-profit Humans For AI. She is an award-winning senior executive with extensive global experience in AI and digital transformation. Her knowledge spans e-commerce, financial, marketing, telecom, retail, software products, services, and industrial domains. Beena also serves on the industrial advisory board at Cal Poly College of Engineering, and she has been a board member and advisor to several startups including Flerish, Predii, iguazio, CliniVantage, and ProjectileX.
Beena has been honored several times for her contribution to tech and her philanthropic efforts, including: UC Berkeley 2018 Woman of the Year in Business Analytics, San Francisco Business Times’ 2017 Most Influential Women in Bay Area, WITI’s Women in Technology Hall of Fame, National Diversity Council’s Top 50 Multicultural Leaders in Tech, CIO.com and Drexel University’s Analytics 50 innovator, Forbes Top 8 Female Analytics Experts, and Women Super Achiever Award from World Women’s Leadership Congress.
Tina is the Co-Founder of Women in Big Data. Tina’s expertise lies in high tech marketing. She says her life purpose is to help others realize their full potential. Through her love of people, she’s able to help those two things come together. Tina’s specialty is building teams, bringing bold thinking and a can-do attitude to every initiative, and coaching for performance. Tina demonstrates leadership and success throughout her progressive career characterized by innovation, a strong network, and results.
Datanami: You were recently promoted to the CEO job at Anaconda. How does it feel to now be running the data science platform company you co-founded years ago?
I’ve always been focused on making the company and our open-source community successful. Now that I’m stepping into the role of CEO, I am fundamentally excited about what lies ahead because of the great things we get to do for our customers and the community. The health and growth of the Python data science community is a top priority for me, and serving it is key to my vision for the company. The health of this community is also critical to the success of our customers. I’m thrilled to work with an amazing executive team that lives and breathes our core values. We’ve never been more aligned on our vision and the direction we need to take.
Datanami: What big challenges are you tracking in 2020? What’s the one thing the data science community should tackle?
One thing we’re focused on in 2020 as a company is tackling security and governance around open-source data science. Many of our customers either struggle with the time they put into manually managing open-source, or they leave their pipeline vulnerable. We just launched a new product called Anaconda Team Edition that is meant to do just that. For the first time ever, organizations have an easy solution to control what’s in their data science pipeline. As a community, it is essential we promote data science literacy. We need to empower people with data science so that they can ask better questions and make better sense of the world. Everyone needs to be somewhat data science literate, because today everyone leaves a data trail and consumes data. We need to put real data science in the hands of as many people as possible so that they aren’t grossly disadvantaged in comparison with the largest corporations and government organizations. I believe this is a social responsibility everyone in our community should take on. Lastly, to continue to mature as a profession and navigate growing pains, I encourage the entire global community of data science practitioners to form a professional organization – not unlike the IEEE or the American Medical or Bar Associations – that maintains professional standards, establishes practice ethics, and provides guidance to employers and educators. I have started a nascent conversation about this with certain organizations, and there is a high level of interest.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
I’ve spent the last 3 years obsessively studying the intersection of tech dystopia and information and memetic warfare. I have given podcast interviews about these topics, blogged about them on Medium, and have been curating materials and writing the core tenets of a new philosophical framework for “informational humans.” My “day job” working on democratizing access to data science & ML/AI expertise goes hand-in-hand with this interest. I firmly believe that humanity can navigate through the challenges of disinformation, deepfakes, etc. to better and smarter societies, but only if everyone has the knowledge and the tools to participate.
Peter Wang has been developing commercial scientific computing and visualization software for over 15 years. He has extensive experience in software design and development across a broad range of areas, including 3D graphics, geophysics, large data simulation and visualization, financial risk modeling, and medical imaging. Peter’s interests in the fundamentals of vector computing and interactive visualization led him to co-found Anaconda. As a creator of the PyData community and conferences, he devotes time and energy to growing the Python data science community and advocating and teaching Python at conferences around the world. Peter holds a BA in physics from Cornell University.
Originally from India, Thomas Kurian came to the US in 1986 to study at Princeton, where he received his bachelor’s degree. He went on to receive his MBA from Stanford Graduate School of Business. From there Kurian went on to work at McKinsey and Company as a consultant serving clients in the software, telecommunications, and financial services industries. In 1996 Kurian joined Oracle, where he held various product management and development positions. His first executive role was as Vice-President of Oracle’s e-Business division, driving a number of company-wide initiatives focused on transforming Oracle into an e-Business. Next Kurian took responsibility for the Oracle Fusion Middleware product family. Under Mr. Kurian’s leadership, that business became the company’s fastest-growing business and the industry’s leading middleware product suite. On September 28, 2018, he resigned from his position as president of product development at Oracle. On November 16, 2018, Google Cloud announced that Kurian would join as its CEO.
Datanami: We’ve seen a proliferation of databases in the big data era, including many that relax various ACID constraints. Do you think it’s necessary for application developers to accept these types of compromises to achieve scale and throughput?
Until recently, companies that needed a distributed, ACID-compliant database didn’t have many options. Legacy relational databases were designed decades ago, and scaling them can become an operational nightmare. NoSQL solutions emerged to handle massive scale, but they cannot provide ACID transactions and leave data “eventually consistent.” It’s understandable why many database companies chose to relax various ACID constraints in an effort to achieve scale and throughput, but these compromises can create costly, compounding data integrity issues.
While at Google, I watched Spanner develop into the world’s first distributed ACID-compliant database for applications internal to Google. My co-founders and I set out to build a similar system that could be run anywhere – across clouds, on premise, or in a hybrid environment – and that delivers distributed SQL and ACID transactions to all modern cloud applications.
CockroachDB is ACID-compliant and scales globally without the need for a massive architectural overhaul. With a distributed SQL platform, sacrificing data integrity for the sake of scale or throughput is no longer necessary.
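To make the ACID guarantee discussed above concrete, here is a minimal sketch of atomicity using Python's built-in sqlite3 module. It runs on a single in-memory database, so it says nothing about how a distributed system such as CockroachDB or Spanner achieves the same guarantee across machines; the accounts table and the transfer helper are hypothetical names chosen for illustration.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` between accounts atomically: both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            (bal,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if bal < 0:
                # Abort after both updates ran; the rollback undoes them together.
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # overdraws, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer neither account has changed, which is the property that an eventually consistent store without transactions leaves to the application to enforce.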
Datanami: Cockroach Labs changed from an open source to a proprietary license last year. Have we reached the end of the road for open source software?
Big cloud vendors have preyed on open-source R&D by providing OSS software as a service to edge out small competitors. Combine that with the platform benefits of economies of scale and greater opportunities for integration, and the big cloud providers can drown open source startups. That said, companies eclipsing growth-stage, and legacy companies looking to store mission-critical data in the cloud are becoming wary of big vendors not investing in their R&D. That’s why, in 2019, we, and many others, changed our open source licensing to combat this breed of competitor.
Still, I don’t believe we have reached the end of open source software. I believe the OSS companies are at a pivotal moment as we all figure out the right mix of free and paid features to offer without compromising anything.
Open source has been changing the world for more than 25 years, and I expect that to continue for the next several years, especially as more companies look for products that provide ease and flexibility as they move to a multi-cloud approach. After the flurry of mega-acquisitions and IPOs of open source companies last year, we’re also seeing a lot of excitement in the space from investors and incoming entrepreneurs.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
I have an intense love of architecture and interior design and have had the good fortune to work on a few projects over the past 10 years. In another life, I would have a difficult time deciding which of those two passions to focus on.
Spencer Kimball is the co-founder and CEO of Cockroach Labs, where he maintains a delicate balance between a love for programming distributed systems and the excitement of helping the company grow smoothly. While in university, he was one of the original authors of the GIMP. He cut his teeth on databases during the dot com heyday, and had a front row seat at Google for a decade’s worth of their evolution.
Datanami: What’s been the biggest breakthrough in the field of applied statistics over the past 50 years, and do you think we’re on the cusp of another one?
There are breakthroughs for statistics itself and, as the servant science that it is, statistical breakthroughs for other fields. In the first category, I’d put Brad Efron’s bootstrap method of assessing uncertainty, for which he received the 2018 International Prize in Statistics. In the second, I’d put the Cox Model for analyzing survival data, for which Sir David Cox was awarded the same prize in 2016. However, computing has advanced all aspects of applied statistics and advanced analytics in general. On our SAS journey from mainframe-based calculations to distributed parallel computing in the cloud, we have made many breakthroughs in areas deemed too challenging for many years. An idea which revolutionized data science today was integration of data access, data transformation, model building, model deployment, and decisioning on a single platform. SAS recognized the need for real-time analytics early, an example being usage of SAS for detecting fraudulent credit card transactions. Lately, deep learning is a prime example of massive parallel computational innovation. While it has gained popularity in the last few years, SAS introduced the first neural network implementation back in the early 90s. Computer vision and natural language processing are two prime application areas which benefit from deep learning. Other fascinating areas worth mentioning are the fields of reinforcement learning techniques and probabilistic programming. Are we on the cusp of another breakthrough? Undoubtedly, with recent advances in technology and more companies asking for assistance on their digital transformation journey, we will see many more innovations. Automation and the ability to explain advanced analytics to the layman go hand in hand – while we have made significant progress already, I’m convinced that we will see exciting development very soon.
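For readers unfamiliar with the bootstrap mentioned above, the idea is to estimate the uncertainty of a statistic by recomputing it on many resamples drawn with replacement from the observed data. The sketch below is a plain percentile bootstrap written with the Python standard library. The sample values, the function name, and the parameter defaults are assumptions made for illustration, and it is not a SAS procedure.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` over `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    # Recompute the statistic on resamples drawn with replacement.
    estimates = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo = estimates[int((alpha / 2) * n_resamples)]
    hi = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Made-up measurements, for illustration only.
sample = [12.1, 9.8, 11.4, 10.3, 13.7, 8.9, 10.8, 11.9, 9.5, 12.6]
low, high = bootstrap_ci(sample)
```

The interval (low, high) brackets the sample mean without assuming any particular distribution for the data, which is much of the method's appeal.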
Datanami: When you founded SAS back in 1976, did you envision having the type of success that you have had?
When we left the university, we were just hoping to cover our salaries. We had a sense that there was some real potential when 200 people showed up for the first users conference before we even incorporated. I wouldn’t say I imagined the kind of success we’re having today. We just loved developing new software and solving problems. If you’re solving problems for your customers, the success will follow.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
I’m an avid mineral collector. My mineral collection is on display throughout the SAS campus and consists of pieces from all over the world. Some of the pieces are as beautiful as a sculpture and some are just interesting, like the dinosaur egg and the meteorite.
As the CEO of SAS, the world’s leading business analytics software vendor, Jim Goodnight has led the company since its inception in 1976. Under his leadership, SAS has become renowned for its innovation and corporate culture. His commitment to work-life balance has made SAS a fixture on best workplaces rankings worldwide.
SAS® software was originally created by Goodnight and North Carolina State University colleagues to analyze agricultural research data. Four decades later, a solid reputation for innovation has secured SAS among the world’s largest software companies. A champion of education reform, Goodnight spearheaded the creation of a national Business Roundtable report calling on business leaders to support and advocate for efforts to improve early learning and third-grade reading proficiency. In North Carolina, he rallied a group of CEOs to the cause, which contributed to increases in funding that allow more eligible children to enter the state’s high-quality pre-kindergarten program. Under his direction, SAS has focused on helping students, educators and independent learners develop sought-after skills through technology offerings. He earned his bachelor’s degree in applied mathematics and his master’s in statistics from North Carolina State University (NCSU). He also earned his doctorate in statistics at NCSU, where he was a faculty member from 1972 – 1976. Harvard Business School named Goodnight a Great American Business Leader for his role in making SAS a business that changed the way Americans lived, worked and interacted over the last several decades. He was also named one of America’s 25 Most Fascinating Entrepreneurs by Inc. magazine. Goodnight is an active participant in the Business Roundtable and the Business Council, where CEOs address global issues and business concerns.
Python’s popularity has exploded over the past few years. What do you think will happen when Ray provides an easy path to parallel processing for Python?
Python is the first language that many people learn, and it is exploding in popularity in part because it is easier to get started with and to learn the basics than most other programming languages. But once you get beyond the basics, the story is much less complete. When people started training deep neural networks, the tooling in the initial years was very immature. People were manually writing GPU code, implementing their own neural network libraries, and calculating gradients by hand (using the chain rule). Tools like TensorFlow and PyTorch have simplified all of these tasks and have made neural network training and optimization much more accessible. And as a consequence, many more people and businesses have been able to apply machine learning, at least the basics, to solve their problems.
Looking beyond deep learning, the next big milestone for Python is simple, general-purpose distributed computing. Distributed computing is becoming the norm, but actually building and programming distributed applications requires tons of expertise, so today it’s largely done by experts. Companies like Google and Facebook typically have large infrastructure teams that build all of the underlying distributed systems and scaffolding needed to run scalable applications, but few companies have that kind of expertise.
With Ray, we’re working to make it as easy to program a cluster of machines as it is to program a laptop. The point is to allow anyone who knows Python to take advantage of all the cluster resources and all the advances in machine learning and to build the same kinds of distributed applications and services that Google can build, all without having to be an expert in distributed computing. They should be able to build the same kinds of distributed applications on their own that normally require teams of experts.
What we will see is a surge in the number of people who are able to take advantage of distributed computing and machine learning, and as a consequence many more people are going to be applying these tools to solve their problems, not just those in the most advanced technology companies.
Ray came out of the RISELab, the successor to AMPLab at UC Berkeley. What’s it like to follow in the shadows of transformative tech projects like Spark, Mesos, and Alluxio (formerly Tachyon)?
Ray wouldn’t have been possible without the collaborative environment fostered by the RISELab and UC Berkeley. My background, before working on Ray, was in machine learning algorithms and theory, so I was deeply familiar with the pain points surrounding machine learning research and development, and we began working on Ray to address them. However, when it came to actually building distributed systems, I was a beginner. There were many things we didn’t know and needed to learn, and we were able to do so because we were sitting in the RISELab surrounded by the world’s leading experts in distributed systems, security, and cloud computing, including the people who created those projects.
Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
I enjoy most sports and do a lot of running and biking these days. Growing up in California, I spent a lot of time skateboarding.
Robert Nishihara is a co-founder and CEO of Anyscale, which aims to make it as easy to program clusters of machines as it is to program your laptop. He is one of the creators of Ray, a system for distributed Python and scalable machine learning. Robert studied math as an undergraduate at Harvard and did his PhD in machine learning at UC Berkeley advised by Michael I. Jordan.
Datanami: This is your second rodeo with a Hadoop distributor. What’s the biggest difference this time around? Is it harder this time, or easier?
While Cloudera will always be rooted in the open source community, the company has undergone a significant transformation since its creation over a decade ago. Now, Cloudera delivers the industry’s first enterprise data cloud—a platform that is not only 100% open, but is multi-cloud, multi-function, secure and governed.
As for Hadoop, my colleague Arun Murthy has an eloquent way of putting it: “The old way of thinking about Hadoop is dead—done, and dusted. Hadoop as a philosophy to drive an ever-evolving ecosystem of open source technologies and open data standards that empower people to turn data into insights is alive and enduring.” I’m excited by Cloudera’s vision of bringing this philosophy on openness to the emerging enterprise data cloud market.
Datanami: Cloudera has weathered quite a bit over the past 12 months, but has emerged with a compelling vision for an enterprise data cloud. Why will this strategy resonate with the Global 2000 enterprises?
Cloudera works with some of the biggest companies in the world, including the top ten automotive and telco companies, nine out of the top ten pharmaceutical companies, and eight out of the top ten banking and tech companies. Years of conversation with these customers have given our team an inherent understanding of the challenges today’s organizations face when it comes to unlocking the potential of their data.
Companies today have game-changing opportunities to use data to tackle their biggest business challenges, but recent research shows that only 35% of enterprises think their current data strategy is sufficient to do so, and 55% cite a lack of expertise as the main barrier to gaining maximum value from data. The Cloudera Data Platform was created specifically to solve these universal problems: it lets customers manage data on-premises and in hybrid and public clouds with a consistent, shared data experience, governance and security.
As the leading provider of telecommunications services in the Philippines, Globe Telecom uses the Cloudera Data Platform to more effectively personalize the customer experience. CDP helps Globe Telecom manage their data, interpret analytics, and put those insights into action with customer segmentation. The company is now able to better tailor how customers discover, purchase, and use their services.
Datanami: Outside of the professional sphere, what can you share about yourself that your colleagues might be surprised to learn – any unique hobbies or stories?
I have a farm near a small town just outside of Atlanta, and one of my hobbies is quail hunting.
I am an experienced enterprise software executive. I was co-founder and CEO of Hortonworks, a publicly traded open-source company that merged with Cloudera in 2019. I also served as President and COO of SpringSource, a leading provider of open-source developer tools, until its acquisition by VMware in 2009. Before joining SpringSource, I served as Entrepreneur in Residence at Benchmark Capital and also served as COO of JBoss, a leading open-source middleware company, until its acquisition by Red Hat in 2006.