
Tag: Spark
LinkedIn’s Translation Engine Linked to Presto
An SQL translation engine unveiled this week by LinkedIn is integrated with other open-source SQL query engines like Presto in a combination aimed at bulging data lakes. The Microsoft unit’s Coral engine handles ana Read more…
Data Exchange Maker Harbr Closes Series A
Harbr, a London startup that helps organizations like Moody’s Analytics to create their own custom data exchanges, yesterday announced that it has completed a Series A round of financing, netting $38.5 million for the Read more…
The Past and Future of In-Memory Computing
When Nikita Ivanov co-founded GridGain Systems back in 2005, he envisioned in-memory computing going mainstream and becoming a massive category unto itself within a few years. That obviously didn’t pan out, but on the Read more…
Aerospike Gives Legacy Infrastructure a Real-Time Boost
A database connector upgrade released this week by Aerospike Inc. links open source frameworks like Apache Spark data streaming to existing enterprise data infrastructure. Among the goals is providing backward compati Read more…
Microsoft Now Developing Its Own Hadoop
Hadoop might be dead, but that’s not stopping public cloud providers from using it. The latest to make a move is Microsoft Azure, which in July announced that it would begin developing its own distribution under its HD Read more…
To Centralize or Not to Centralize Your Data–That Is the Question
Should you strive to centralize your data, or leave it scattered about? It seems like it should be a simple question, but it’s actually a tough one to answer, particularly because it has so many ramifications for how d Read more…
Intel Updates Optane, Expands NAND SSD Offerings
Intel Corp. remains persistent in upgrading its Optane persistent memory series. The chip maker (NASDAQ: INTC) said this week its second generation Optane series is tuned to the latest version of its Xeon Scalable proce Read more…
Staying On Top of ML Model and Data Drift
A lot of things can go wrong when developing machine learning models. You can use poor quality data, mistake correlation for causation, or overfit your model to the training data, just to name a few. But there are also a Read more…
Will Databricks Build the First Enterprise AI Platform?
Ali Ghodsi might have one of the best jobs in technology right now. As the CEO of Databricks, Ghodsi just completed an oversubscribed $400 million round of funding that gave the company a $6.2 billion valuation. Better s Read more…
Simplifying the Big Data Lake Experiences in the Cloud
The cloud is a hot spot for big data lakes these days, thanks largely to the greater technological simplicity and lower upfront costs of getting started in the public cloud. But as organizations grow their cloud data lak Read more…
Presto Moves Under Linux Umbrella
An SQL query engine developed by Facebook and moved earlier this year to a non-profit development group is now being hosted by the Linux Foundation. The new Presto Foundation is seen as a way to scale the popular dist Read more…
More Cash for DataRobot Along with ML Ops Tool
High-flying enterprise AI specialist DataRobot announced another huge funding round along with a machine learning platform for managing predictive models that combines an internally developed monitoring framework with so-ca Read more…
Baidu In-Memory Databases Add Intel Optane
Chinese search giant Baidu is building a new platform based on Intel Corp.’s Optane DC persistent memory as a means of upgrading search engine results delivered by its in-memory databases used to feed its streaming Read more…
Skills Are Critical in Data Science Job Hunt
Those planning a career in data science have a healthy job outlook, as demand for data scientists continues to grow. While an advanced data science degree can definitely help, it's becoming increasingly apparent that hav Read more…
Data Science Back to School: Accelerate Your Education
Are you looking to get a data science degree and join the workforce as a data scientist? Then you're not alone, as thousands of young people around the world are following that same path with the hope of tapping into the Read more…
Intel Turbocharges Spark Workloads with Optane DCPMM
Intel didn't wow chip lovers earlier this year with the launch of its 2nd Generation Intel Xeon Scalable "Cascade Lake" processors, which are based on the same 14nm process as the first-generation processors. Read more…
Serverless SQL Engine Targets Cloud Analytics
Qubole Inc., the cloud analytics vendor, has added a serverless engine to its platform aimed at simplifying complex tasks like creating data pipelines and server clusters used to scale analytics workloads in the cloud. Read more…
What’s Behind Lyft’s Choices in Big Data Tech
Lyft was a late entrant to the ride-sharing business model, at least compared to its competitor Uber, which pioneered the concept and remains the largest provider. That delay in starting out actually gave Lyft a bit of a Read more…
Microsoft Expands Hadoop on Azure
Microsoft has upgraded its open source analytics services running on Azure with a new version of Hadoop incorporating enhancements of Apache Hive and other open source analytics frameworks. The software giant (NASDAQ: Read more…
How Databricks Keeps Data Quality High with Delta
Data lakes have sprung up everywhere as organizations look for ways to store all their data. But the quality of data in those lakes has posed a major barrier to getting a return on data lake investments. Now Databricks i Read more…